jlesage / docker-video-duplicate-finder

Docker container for Video Duplicate Finder
MIT License

Docker container stops mid-scan, logs indicate an out-of-memory error #5

Open i-am-at0m opened 1 year ago

i-am-at0m commented 1 year ago

Running on an unRAID host with 32GB of RAM, looks like it crashed out mid-scan. The directory I was trying to index is about 5TB. Not sure what logs you need.

Relevant unRAID system log:

Feb 27 22:34:17 Tower kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/docker/1b2a6af2c5f3e26e241f2189dc1c6c5a15c2885587d1adea3c476efddee75a53,task=VDF.GUI,pid=7337,uid=99
Feb 27 22:34:17 Tower kernel: Out of memory: Killed process 7337 (VDF.GUI) total-vm:306378784kB, anon-rss:30001700kB, file-rss:0kB, shmem-rss:43044kB, UID:99 pgtables:68576kB oom_score_adj:0
Feb 27 22:34:21 Tower kernel: oom_reaper: reaped process 7337 (VDF.GUI), now anon-rss:0kB, file-rss:0kB, shmem-rss:43048kB
Feb 27 22:34:22 Tower kernel: docker0: port 1(veth6af0f5f) entered disabled state

Can't find the logs from the container, as they scrolled when I restarted it and I don't know how to retrieve them. Working on running it again tonight and will be able to update in the morning if it crashes again (and it probably will)

jlesage commented 1 year ago

The directory I was trying to index is about 5TB

How many files does this represent?

Feb 27 22:34:17 Tower kernel: Out of memory: Killed process 7337 (VDF.GUI) total-vm:306378784kB, anon-rss:30001700kB, file-rss:0kB, shmem-rss:43044kB, UID:99 pgtables:68576kB oom_score_adj:0

This indicates that the host (unRAID) killed VDF.GUI (the Video Duplicate Finder main process) because your system was out of memory.
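To make the numbers in that kernel message easier to read, here is a minimal sketch that pulls the counters out of the "Out of memory" line and converts them to GiB. The function name `parse_oom_kill` is made up for illustration, and the sketch assumes the kernel's `kB` figures are units of 1024 bytes.

```python
import re

# The "Out of memory" line quoted above, copied from the unRAID system log.
OOM_LINE = (
    "Out of memory: Killed process 7337 (VDF.GUI) total-vm:306378784kB, "
    "anon-rss:30001700kB, file-rss:0kB, shmem-rss:43044kB, UID:99 "
    "pgtables:68576kB oom_score_adj:0"
)


def parse_oom_kill(line):
    """Return PID, process name, and memory counters (in GiB) from an OOM-kill line.

    Hypothetical helper; assumes each "kB" value is a count of 1024-byte units.
    """
    proc = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if proc is None:
        raise ValueError("not an OOM-kill line")
    # Every memory counter in this line has the form name:<number>kB.
    counters = {
        name: int(kib) / 1024 ** 2
        for name, kib in re.findall(r"([\w-]+):(\d+)kB", line)
    }
    return {"pid": int(proc.group(1)), "name": proc.group(2), "gib": counters}


if __name__ == "__main__":
    info = parse_oom_kill(OOM_LINE)
    print(f"{info['name']} (pid {info['pid']})")
    for name, gib in info["gib"].items():
        print(f"  {name}: {gib:.2f} GiB")
```

For this log line, `anon-rss` works out to roughly 28.6 GiB of resident memory held by VDF.GUI at the moment it was killed.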

Can't find the logs from the container, as they scrolled when I restarted it and I don't know how to retrieve them.

You can retrieve all logs with docker logs VideoDuplicateFinder.

However, I don't think we will get anything interesting from the logs, since the container has been killed by the host. Did you have a lot of free memory before starting the container?

jlesage commented 1 year ago

You can also check the following issue from the original project, it has interesting information:

https://github.com/0x90d/videoduplicatefinder/issues/273

i-am-at0m commented 1 year ago

The directory I was trying to index is about 5TB

How many files does this represent?

400,898

Feb 27 22:34:17 Tower kernel: Out of memory: Killed process 7337 (VDF.GUI) total-vm:306378784kB, anon-rss:30001700kB, file-rss:0kB, shmem-rss:43044kB, UID:99 pgtables:68576kB oom_score_adj:0

This indicates that the host (unRAID) killed VDF.GUI (the Video Duplicate Finder main process) because your system was out of memory.

Yeah, that's what I figured.

Can't find the logs from the container, as they scrolled when I restarted it and I don't know how to retrieve them.

You can retrieve all logs with docker logs VideoDuplicateFinder.

However, I don't think we will get anything interesting from the logs, since the container has been killed by the host. Did you have a lot of free memory before starting the container?

The host only supports up to 32GB, so that's what I've got installed. unRAID reports about 90% available before starting the scan, so 30GB or so free, which tracks with the system log message: the process was killed once it took up about 30GB.

The docker logs aren't super useful, as you said (snipped this out of the middle of a bunch of VNC-related stuff):

[supervisor ] service 'app' exited (got signal SIGKILL).
[supervisor ] service 'app' exited, shutting down...
[supervisor ] stopping service 'openbox'...
[supervisor ] service 'openbox' exited (with status 0).
[supervisor ] stopping service 'nginx'...
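For anyone wanting to pick the relevant supervisor lines out of the VNC noise, the following is a rough sketch that scans text saved from `docker logs VideoDuplicateFinder` and reports which services exited because of a signal. The helper name `killed_services` and the regular expression are assumptions based only on the lines quoted in this thread, so the pattern may need adjusting for other log formats.

```python
import re

# Pattern inferred from the supervisor lines quoted in this thread (an assumption,
# not a documented log format).
EXIT_RE = re.compile(
    r"\[supervisor\s*\]\s+service '([^']+)' exited \((got signal|with status) (\w+)\)"
)

# Sample text copied from the comment above.
SAMPLE = """\
[supervisor ] service 'app' exited (got signal SIGKILL).
[supervisor ] service 'app' exited, shutting down...
[supervisor ] stopping service 'openbox'...
[supervisor ] service 'openbox' exited (with status 0).
[supervisor ] stopping service 'nginx'...
"""


def killed_services(log_text):
    """Return (service, signal) pairs for services that exited due to a signal."""
    return [
        (match.group(1), match.group(3))
        for match in EXIT_RE.finditer(log_text)
        if match.group(2) == "got signal"
    ]


if __name__ == "__main__":
    for service, signal_name in killed_services(SAMPLE):
        print(f"service {service!r} was terminated by {signal_name}")
```

A SIGKILL on the 'app' service is consistent with the kernel OOM killer ending VDF.GUI, which is why the container log itself carries no further detail.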

You can also check the following issue from the original project, it has interesting information:

0x90d/videoduplicatefinder#273

That's not super useful, but there are some mitigating steps there. I don't understand why that developer won't tweak their software to throw warnings and handle issues like that gracefully; it's not like the software doesn't know how many files are being scanned, it traverses the file tree right at the beginning of the scan.

I noticed a pattern in that project's issues: the developer basically won't address anything that's caused by the libraries they're using, which is frustrating for their users.

That said, unRAID by default doesn't have a swapfile, so I'm trying out a plugin that adds and manages one to see if that helps. I've added another 32GB of swap space, turned off thumbnail creation as suggested in that support thread, and am running another scan. This time it estimates it'll be done in 2 hours as opposed to 8, so we'll see. If this doesn't work, I'll run everything through DupeGuru first and see if that pares down the file count before trying this tool again (it can detect when files are the same content in different encodings/resolutions, which DupeGuru can't do).

i-am-at0m commented 1 year ago

Adding the swap file plugin didn't seem to make a difference. Crashed out after install, rebooted to make sure the plugin was running at startup, and it crashed again. I'll see what I can do with other tools I guess.
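One way to confirm whether the swap file is actually active before a scan is to read `/proc/meminfo` on the host. This is a minimal sketch, assuming a Linux host that exposes the standard `MemAvailable`, `SwapTotal`, and `SwapFree` fields; the helper names `parse_meminfo` and `memory_summary` are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: numeric value} (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(values):
    """Return available memory and swap figures in GiB; missing fields count as 0."""
    keys = ("MemAvailable", "SwapTotal", "SwapFree")
    return {key: values.get(key, 0) / 1024 ** 2 for key in keys}


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            summary = memory_summary(parse_meminfo(handle.read()))
    except OSError:
        print("/proc/meminfo is not readable on this system")
    else:
        for key, gib in summary.items():
            print(f"{key}: {gib:.1f} GiB")
        if summary["SwapTotal"] == 0:
            print("No swap is active.")
```

If `SwapTotal` reads 0 after a reboot, the swap file was not enabled, and the scan is running with physical RAM only.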