Docker - All data and config have been reset

dgtlmoon / changedetection.io

The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Restock Monitor, change detection. Designed for simplicity - Simply monitor which websites had a text change for free. Free Open source web page change detection, Website defacement monitoring, Price change notification

https://changedetection.io

Apache License 2.0


Docker - All data and config have been reset #286

Closed sqsuica closed 2 years ago

sqsuica commented 2 years ago

Hi,

I installed changedetection.io in a Synology Docker and it worked fine for a while. Then, two days ago, all URLs and config were reset. Has anyone else had the same issue?

dgtlmoon commented 2 years ago

I had the same thing :(

dgtlmoon commented 2 years ago

How many URLs did you have? I had about 350 or so, and I noticed the CPU usage was getting higher.

sqsuica commented 2 years ago

I have only 2.

ConorGrocock commented 2 years ago

I've had the same issue at least twice. Both times I had 4 URLs.

davralin commented 2 years ago

Same here, running with kubernetes - I thought the issue was with my kinda-unstable homelab-cluster, but maybe not.

I can mention that my lab is unstable due to frequent OOMs, so maybe that's an issue?

dgtlmoon commented 2 years ago

Yup, I think OOM is the killer here. That's a big problem with using a custom JSON file writer, so it might be better to move to sqlite or something else. Unsure just yet.

davralin commented 2 years ago

Could it perhaps be separated, at least?

Settings to one file, and cache/diffs to another file? I'm thinking the settings-file wouldn't receive so many writes..

I'd rather get one warning too many about a detected change than get nothing at all because alerting and the configured sites have been wiped.

ntmmfts commented 2 years ago

Hi, I haven't seen this issue at all using my current pip installation, and I also never had it happen in docker with persistent storage configured. This was the run command I was using to mount persistent storage:

docker run -d --restart always -p "0.0.0.0:5000:5000" -e WEBDRIVER_URL="http://10.10.10.175:4444" -v /home/docker/changedetection.io/datastore:/datastore --name changedetection.io ghcr.io/dgtlmoon/changedetection.io

And of course you can eliminate the webdriver env setting if you're not using it.

Hope that helps. ~ntmmfts

mentalstring commented 2 years ago

We were bitten by this today... It might have been an OOM too, but I can't be sure.

More oddly though, I'm having trouble restoring the data from a filesystem backup. We use a docker volume for changedetection-data, and when I restore data from the snapshot into it and start the container, the content of url-watches.json always gets overwritten with the sample one. The docker logs always start with:

changedetection.io    | Creating JSON store at /datastore
changedetection.io    | Saving..
changedetection.io    | Saving..
changedetection.io    | Saving..
changedetection.io    | Saving..
changedetection.io    | Saving..

@dgtlmoon Is there a special procedure to recover from a filesystem snapshot?

mentalstring commented 2 years ago

After looking closely at the url-watches.json I had from a backup, I found that the file was corrupted (incomplete). Restoring one from an older backup made it work again.

If the configuration file can't be loaded, it might be worth logging that (or even aborting) instead of overwriting what's left with the default settings.
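A minimal Python sketch of that "log or abort, don't overwrite" idea follows. The names `load_watches` and `DatastoreCorruptError` are hypothetical and are not taken from the changedetection.io code base; the only assumption is that the datastore is a single JSON file.

```python
import json
import logging
import os


class DatastoreCorruptError(Exception):
    """Hypothetical error: the datastore file exists but cannot be parsed."""


def load_watches(path, default):
    """Load the JSON datastore, or return `default` only on a true first run."""
    if not os.path.exists(path):
        # No file at all: treat this as a fresh install.
        return default
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except json.JSONDecodeError as exc:
        # A truncated or garbled file lands here. Abort loudly and leave
        # the file untouched so it can still be inspected or restored.
        logging.error("Could not parse %s: %s", path, exc)
        raise DatastoreCorruptError(
            f"{path} is not valid JSON; refusing to overwrite it"
        ) from exc
```

The point of the design is that only a missing file yields the defaults; an unreadable file stops startup rather than being replaced by the sample data.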

Perhaps, to make the save step more atomic (or more resilient to OOM, if that's the issue), the data could be saved into a temporary file (e.g. url-watches.json.tmp) and then moved over the real one?
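As a rough illustration of the temp-file-then-move idea, here is a small Python sketch. The function name `atomic_write_json` is hypothetical, and this is not the project's actual writer; it relies on `os.replace`, which swaps the file in a single step when source and destination are on the same filesystem.

```python
import json
import os


def atomic_write_json(path, data):
    """Write `data` as JSON to `path` via a sibling .tmp file."""
    tmp_path = path + ".tmp"  # e.g. url-watches.json.tmp
    try:
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        # Readers see either the old file or the new one, never a partial one.
        os.replace(tmp_path, path)
    except BaseException:
        # The write or rename failed: drop the partial temp file and
        # leave the previous datastore as it was.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process is killed partway through the write, only the .tmp file is incomplete; the previous url-watches.json is still whole.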

dgtlmoon commented 2 years ago

@mentalstring

Perhaps, to make the save step more atomic (or more resilient to OOM, if that's the issue), the data could be saved into a temporary file (e.g. url-watches.json.tmp) and then moved over the real one?

Yeah, this is a great idea. The problem at the moment is that I'm walking the line between writing an ORM/database and a "simple JSON store".

How to test this? Maybe set a memory limit, then build a huge JSON array?

https://stackoverflow.com/questions/16779497/how-to-set-memory-limit-for-thread-or-process-in-python
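One way to try the "set a limit, then build a huge JSON array" idea is sketched below in Python. It assumes a Linux host, where `resource.RLIMIT_AS` is enforced; the helper name `run_under_memory_limit` and the exit code 42 are made up for this example. The limit is applied inside a child process so the test runner itself is not affected.

```python
import subprocess
import sys

# Code run in the child: cap the address space, then try to serialise a
# JSON array whose single element is twice as large as the cap.
CHILD_CODE = """
import json, resource, sys
limit = int(sys.argv[1])
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    json.dumps(["x" * (2 * limit)])
except MemoryError:
    sys.exit(42)  # arbitrary marker: the limit was hit
"""


def run_under_memory_limit(limit_bytes):
    """Run the child with the given address-space cap; return its exit code."""
    result = subprocess.run([sys.executable, "-c", CHILD_CODE, str(limit_bytes)])
    return result.returncode
```

Calling `run_under_memory_limit(1024 ** 3)` should report the marker exit code when the cap is enforced. A real test would replace the oversized string with a save of a large watch list and then check that the file on disk is still valid.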

dgtlmoon commented 2 years ago

I've got a new PR there. Can anyone come up with some devious ways to test it? Can anyone test it? :)

mentalstring commented 2 years ago

I'm not a Python dev, so I can't help with or comment on the code itself, sorry.

For what it's worth, the url-watches.json file I got out of the backup was truncated at about 1/3 of the full content. I can't be sure whether it was an OOM or the backup process running while the JSON was being written to storage. Regardless, a temp file and move should address both cases.

dgtlmoon commented 2 years ago

@mentalstring please test https://github.com/dgtlmoon/changedetection.io/tree/ticket-286-lost-data for me. Please try to put it on your machine and see whether the problem can be reproduced (or whether it introduces some new issue). I cannot be the only person here testing and writing everything; I need some help. I'm writing this software for free in my spare time, when I should be working more to cover my expenses. Or please donate via BTC.

dgtlmoon commented 2 years ago

Looks like we could use a special testing docker-compose.yml with limited memory and disk.

Starting to think this issue is more about disk space than OOM, because writing to disk shouldn't take up more RAM, only disk.

sqsuica commented 2 years ago

I haven't had the issue for the last 10 days. However, I do have some issues with the diff repeatedly showing the same change when it should be showing different changes. I can help test that. But is there a way to reproduce it?

I am not technical but my Synology docker setup doesn't seem to have disk space limitation options that I can see. I can see memory and cpu limits.

dgtlmoon commented 2 years ago

@sqsuica > I do have some issues with diff showing repeatedly the same change

Please open a new issue and attach the .zip backup file.

dgtlmoon commented 2 years ago

I set a disk limit..

volumes:
  changedetection-data-test:
    # For details, see:
    # https://docs.docker.com/engine/reference/commandline/volume_create/#driver-specific-options
    driver: local
    driver_opts:
      o: "size=1m"
      device: tmpfs
      type: tmpfs

On this branch, I filled the disk..

root@changedetection:/datastore# dd if=/dev/random of=/datastore/test.bin bs=100K count=50
dd: error writing '/datastore/test.bin': No space left on device

I manually checked a site, and I can see it correctly catches the exception when trying to save the snapshot of the site:

changedetection.io    | !!!! Exception in update_worker !!!
changedetection.io    |  [Errno 28] No space left on device

I can see..

-rw-r--r-- 1 root root   5667 Dec  2 20:54 url-watches.json
-rw-r--r-- 1 root root      0 Dec  2 20:55 url-watches.json.tmp

And then I can see the actual JSON index writer catching the exception

changedetection.io    | INFO:root:Saving JSON..
changedetection.io    | ERROR:root:Error writing JSON!![Errno 28] No space left on device

I freed up the disk, then I can see the app running fine again
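The same disk-full scenario can be approximated in-process, without a size-limited tmpfs volume. The Python sketch below makes `json.dump` raise "No space left on device" via `unittest.mock` and then checks whether the existing file survived. Both writer functions and the helper `survives_disk_full` are hypothetical stand-ins, not the application's code.

```python
import errno
import json
import os
import tempfile
from unittest import mock


def save_in_place(path, data):
    """Naive writer: opening with "w" truncates the real file immediately."""
    with open(path, "w") as f:
        json.dump(data, f)


def save_via_temp(path, data):
    """Writer that goes through a .tmp file and renames it on success."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(data, f)
    os.replace(tmp_path, path)


def survives_disk_full(save_fn):
    """Return True if a simulated ENOSPC during save keeps the old file intact."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "url-watches.json")
        with open(path, "w") as f:
            json.dump({"watching": ["https://example.com"]}, f)
        with open(path) as f:
            before = f.read()
        disk_full = OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
        # Every json.dump call inside this block fails like a full disk.
        with mock.patch("json.dump", side_effect=disk_full):
            try:
                save_fn(path, {"watching": []})
            except OSError:
                pass
        with open(path) as f:
            return f.read() == before
```

With these definitions, `survives_disk_full(save_via_temp)` holds while `survives_disk_full(save_in_place)` does not, which mirrors the listing above: a zero-byte url-watches.json.tmp next to an untouched url-watches.json.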

Cool. I still don't know how to test for memory limit issues though; however, #292 will definitely help!

dgtlmoon commented 2 years ago

@mentalstring info on restoring a backup (by zip file) is here https://github.com/dgtlmoon/changedetection.io/wiki/Restoring-backup-files

dgtlmoon commented 2 years ago

Leaving this one open - any new reports - be sure to use the new version!

dgtlmoon commented 2 years ago

@davralin

Could it perhaps be separated, at least? Settings to one file, and cache/diffs to another file? I'm thinking the settings-file wouldn't receive so many writes..

Yes, it already does that. So if you've lost your diffs, it means something is really wrong with your setup, as we only write them once.

dgtlmoon commented 2 years ago

Closing this one. The v0.39.4 release should resolve this; if it does not, then it's something else (like you deleted the volume while upgrading, or something).

But in any case, if it happens, attach logs.