sahib / rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem
http://rmlint.rtfd.org
GNU General Public License v3.0

-u does not limit memory consumption #309

Closed Blindfreddy closed 5 years ago

Blindfreddy commented 5 years ago

Firstly, rmlint is AWESOME, thanks, I'd been looking for a great dedup tool like this for years !!!

But, I either don't understand the -u switch or it doesn't work properly. No matter what I specify for -u, all system memory is consumed and the oom_reaper kills rmlint after a while.

Running on an odroid hc1, rmlint spawns several processes (seems one per cpu core) and each of them uses much more memory than the -u limit. Changing the -u value does not seem to have any effect. Memory consumption grows over time.

Here is the htop output to demonstrate

root@odhc1:/etc/monit# htop

....

Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||897M/1.95G] Tasks: 97, 121 thr; 11 running

  PID USER    PRI NI  VIRT   RES   SHR S CPU% MEM%    TIME+ Command
18702 andrev   20  0  595M  437M  4036 S 772. 21.9 54:02.25 rmlint -e -u 128M HomeVideo/ Camcorder/
19870 andrev   20  0  595M  437M  4036 R 98.7 21.9  0:06.68 rmlint -e -u 128M HomeVideo/ Camcorder/
19886 andrev   20  0  595M  437M  4036 R 98.7 21.9  0:04.11 rmlint -e -u 128M HomeVideo/ Camcorder/
19744 andrev   20  0  595M  437M  4036 R 98.1 21.9  0:23.77 rmlint -e -u 128M HomeVideo/ Camcorder/
19825 andrev   20  0  595M  437M  4036 R 92.4 21.9  0:11.42 rmlint -e -u 128M HomeVideo/ Camcorder/
19847 andrev   20  0  595M  437M  4036 R 65.8 21.9  0:05.05 rmlint -e -u 128M HomeVideo/ Camcorder/
19698 andrev   20  0  595M  437M  4036 R 65.2 21.9  0:19.76 rmlint -e -u 128M HomeVideo/ Camcorder/
19859 andrev   20  0  595M  437M  4036 R 62.7 21.9  0:05.96 rmlint -e -u 128M HomeVideo/ Camcorder/
19841 andrev   20  0  595M  437M  4036 S 45.6 21.9  0:07.03 rmlint -e -u 128M HomeVideo/ Camcorder/
19891 andrev   20  0  595M  437M  4036 R 38.6 21.9  0:01.48 rmlint -e -u 128M HomeVideo/ Camcorder/
19754 andrev   20  0  595M  437M  4036 R 24.7 21.9  0:16.99 rmlint -e -u 128M HomeVideo/ Camcorder/
18727 andrev   20  0  595M  437M  4036 S 14.6 21.9  0:40.87 rmlint -e -u 128M HomeVideo/ Camcorder/
18728 andrev   20  0  595M  437M  4036 S  9.5 21.9  0:43.00 rmlint -e -u 128M HomeVideo/ Camcorder/
19849 root     20  0  9856  4872  4040 S  6.3  0.2  0:00.23 /usr/sbin/sshd -D -R
29002 root     20  0  5080  2092  1320 R  1.9  0.1 18:10.34 htop
 2501 plex     20  0  248M  1524     0 S  0.6  0.1  1h57:28 /usr/lib/plexmediaserver/Plex DLNA Server

...

sahib commented 5 years ago

Hello @Blindfreddy,

> Firstly, rmlint is AWESOME, thanks, I'd been looking for a great dedup tool like this for years !!!

Good to hear. But we've also been around for a few years now. :smile:

> Running on an odroid hc1, rmlint spawns several processes (seems one per cpu core) and each of them uses much more memory than the -u limit. Changing the -u value does not seem to have any effect. Memory consumption grows over time.

This is pretty weird. rmlint definitely only works as a single process, so I don't quite understand where the several processes in your htop are coming from. You're sure it's not showing threads (it has different pids, but just to make sure)?
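One way to check whether the htop entries are threads rather than separate processes is to look at the task directory that Linux exposes for each PID. The following is a minimal Python sketch, assuming a Linux system with /proc mounted; the helper name `thread_ids` and the `proc_root` parameter are made up for illustration.

```python
import os


def thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread IDs (TIDs) that belong to one process.

    On Linux every thread of a process shows up as a numeric directory
    under /proc/<pid>/task. Tools that list threads show these IDs in
    the PID column, which can look like separate processes.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())
```

Calling `thread_ids(18702)` while rmlint is running (18702 being the first rmlint entry in the htop output above) would show whether the other IDs from htop, such as 19870 or 19886, are threads of that single process.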

Regarding -u: This option is sadly not an exact science (and never will be). It has the most effect when used together with -p (paranoid mode). There we do not store hashes but actual file contents, which potentially use a lot more memory. For normal operation, this switch does not do very much. The actual cause of your memory issue is the several processes. Any clue why? How do you start rmlint?
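The difference between storing hashes and storing file contents can be illustrated with a generic sketch. This is not rmlint's code. It only assumes that a hashing comparison keeps a fixed-size digest per file, while a byte-by-byte comparison has to hold on to buffered data until it has been compared. The function names, the chunk size, and the choice of SHA-256 are arbitrary illustrative choices.

```python
import hashlib
import io


def digest_stream(stream, chunk_size=1 << 20):
    """Hash a stream chunk by chunk; only one chunk is held at a time."""
    hasher = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
    return hasher.digest()  # 32 bytes, regardless of the input size


def buffered_bytes(stream, chunk_size=1 << 20):
    """Read a stream while keeping every chunk, as a byte-by-byte
    comparison might have to until the chunks have been compared.
    Returns the number of bytes held in memory at the end."""
    kept = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        kept.append(chunk)
    return sum(len(c) for c in kept)
```

With an in-memory stream such as `io.BytesIO(data)`, the first function returns a 32-byte digest however large `data` is, while the second ends up holding as many bytes as the input. A memory limit has far more to bite on in the second case.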

Blindfreddy commented 5 years ago

Yeah, not sure how I missed rmlint thus far, maybe because there is no package for raspbian ?

Anyway, there is only one process - not sure why htop reports several:

ps -ef | grep rmlint
andrev 31023 3751 0 17:33 pts/0 00:00:00 rmlint -u 50M HomeVideo Camcorder

I just start it from the command line, nothing special.

Could it have to do with the large files? There are about 40 of them, see below for a snippet. It seems to work fine if I start rmlint from within subdirectories which do not have these large files.

WARNING: Added big file /srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape12_30V-11VI2007_complete_withTitles.mpeg
WARNING: Added big file /srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape19_29VI-26X2008/HV_AAvR_tape19_29VI-26X2008_complete_withTitles.mpeg
WARNING: Added big file /srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape21_21III-18IV2009/HV_AAvR_tape21_21III-18IV2009_completeWithTitles.mpeg
WARNING: Added big file /srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape9__12VIII-05XII2006_completeWithTitles.mpeg

Blindfreddy commented 5 years ago

I've done some more triaging. At first I couldn't reproduce the error at all; rmlint ran steadily for a couple of hours. At that time plexmediaserver was causing significant load by rescanning the photos library. When that job finished and the system was otherwise idle, rmlint jumped to 1.75GB of memory and oom_reaper killed it.

The large memory consumption also seems to have nothing to do with large files: I can reproduce the same behaviour in a subdirectory without large files. To reproduce, I run rmlint and point it at two directories with largely identical content. Within 3 minutes memory consumption grows to >1.5GB and I abort or oom_reaper kills it:

rmlint -u 64M HomeVideo/HV_AAvR_tape1_16IX-25IX2005/ Camcorder/HV_AAvR_tape1_16IX-25IX2005/

Duplicate(s):

ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/004.kinofx.dv'
rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/004.kinofx.dv'
ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/003.kinofx.dv'
rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/003.kinofx.dv'
ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/006.kinofx.dv'
rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/006.kinofx.dv'
ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/005.kinofx.dv'
rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/005.kinofx.dv'
ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/001.kinofx.dv'
rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/001.kinofx.dv'
ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_14-07-14.mp4'
rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_14-07-14.mp4'

WARNING: Received Interrupt, stopping...

==> Early shutdown, probably not all lint was found.
==> Note: Please use the saved script below for removal, not the above output.
==> In total 188 files, whereof 6 are duplicates in 6 groups.
==> This equals 194.82 MB of duplicates which could be removed.
==> Scanning took in total 3m 52.317s. Is that good enough?

Wrote a json file to: /srv/dev-disk-by-label-Seagate_4TB/av/rmlint.json
Wrote a sh file to: /srv/dev-disk-by-label-Seagate_4TB/av/rmlint.sh

Here are the directory listings:

andrev@odhc1:/srv/dev-disk-by-label-Seagate_4TB/av$ ll HomeVideo/HV_AAvR_tape1_16IX-25IX2005/
total 263596
-rw------- 1 andrev homevideo 39480000 Aug 30  2009 001.kinofx.dv
-rw------- 1 andrev homevideo 28800000 Aug 30  2009 003.kinofx.dv
-rw------- 1 andrev homevideo 28800000 Aug 30  2009 004.kinofx.dv
-rw------- 1 andrev homevideo 28800000 Aug 30  2009 005.kinofx.dv
-rw------- 1 andrev homevideo 28800000 Aug 30  2009 006.kinofx.dv
-rw------- 1 andrev homevideo 28800000 Aug 30  2009 007.kinofx.dv
-rw------- 1 andrev homevideo 28800000 Aug 30  2009 008.kinofx.dv
-rw------- 1 andrev homevideo 28800000 Aug 30  2009 009.kinofx.dv
-rw------- 1 andrev homevideo 28800000 Aug 30  2009 010.kinofx.dv
drwx------ 2 andrev homevideo    12288 Sep 22 00:53 HV_AAvR_16IX-25IX2005_files
andrev@odhc1:/srv/dev-disk-by-label-Seagate_4TB/av$ ll HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/
total 14141764
-rw------- 1 andrev homevideo    5215147 Mar 17  2011 001.kinofx.mp4
-rw------- 1 andrev homevideo    3368948 Mar 17  2011 003.kinofx.mp4
-rw------- 1 andrev homevideo    2919452 Mar 17  2011 004.kinofx.mp4
-rw------- 1 andrev homevideo    1594791 Mar 17  2011 005.kinofx.mp4
-rw------- 1 andrev homevideo    1879636 Mar 17  2011 006.kinofx.mp4
-rw------- 1 andrev homevideo    1376011 Mar 17  2011 007.kinofx.mp4
-rw------- 1 andrev homevideo    1380579 Mar 17  2011 008.kinofx.mp4
-rw------- 1 andrev homevideo    4196494 Mar 17  2011 009.kinofx.mp4
-rw------- 1 andrev homevideo    1094933 Mar 17  2011 010.kinofx.mp4
-rwx--x--- 1 andrev homevideo       7143 May 18  2009 HV_AAvR_16IX-25IX2005.kino
-rw------- 1 andrev homevideo   39658592 May 18  2009 HV_AAvR_tape1_2005.09.16_14-33-58.avi
-rw------- 1 andrev homevideo    5043673 Mar 17  2011 HV_AAvR_tape1_2005.09.16_14-33-58.mp4
-rw------- 1 andrev homevideo   29217896 May 18  2009 HV_AAvR_tape1_2005.09.17_06-56-29.avi
-rw------- 1 andrev homevideo    1451963 Mar 17  2011 HV_AAvR_tape1_2005.09.17_06-56-29.mp4
-rw------- 1 andrev homevideo  304996280 May 18  2009 HV_AAvR_tape1_2005.09.17_07-14-53.avi
-rw------- 1 andrev homevideo   30245914 Mar 17  2011 HV_AAvR_tape1_2005.09.17_07-14-53.mp4
-rw------- 1 andrev homevideo  209589920 May 18  2009 HV_AAvR_tape1_2005.09.17_07-52-08.avi
-rw------- 1 andrev homevideo   31105048 Mar 17  2011 HV_AAvR_tape1_2005.09.17_07-52-08.mp4
-rw------- 1 andrev homevideo  128104488 May 18  2009 HV_AAvR_tape1_2005.09.17_07-53-22.avi
-rw------- 1 andrev homevideo   10639622 Mar 17  2011 HV_AAvR_tape1_2005.09.17_07-53-22.mp4
-rw------- 1 andrev homevideo  124024216 May 18  2009 HV_AAvR_tape1_2005.09.17_07-54-53.avi
-rw------- 1 andrev homevideo    8151629 Mar 17  2011 HV_AAvR_tape1_2005.09.17_07-54-53.mp4
-rw------- 1 andrev homevideo  178747864 May 18  2009 HV_AAvR_tape1_2005.09.17_07-59-35.avi
-rw------- 1 andrev homevideo   14724761 Mar 17  2011 HV_AAvR_tape1_2005.09.17_07-59-35.mp4
-rw------- 1 andrev homevideo  383601520 May 18  2009 HV_AAvR_tape1_2005.09.17_08-28-17.avi
-rw------- 1 andrev homevideo   35398657 Mar 17  2011 HV_AAvR_tape1_2005.09.17_08-28-17.mp4
-rw------- 1 andrev homevideo  473007480 May 18  2009 HV_AAvR_tape1_2005.09.17_08-43-38.avi
-rw------- 1 andrev homevideo   40853907 Mar 17  2011 HV_AAvR_tape1_2005.09.17_08-43-38.mp4
-rw------- 1 andrev homevideo  192668792 May 18  2009 HV_AAvR_tape1_2005.09.17_12-02-11.avi
-rw------- 1 andrev homevideo    8630113 Mar 17  2011 HV_AAvR_tape1_2005.09.17_12-02-11.mp4
-rw------- 1 andrev homevideo  322877472 May 18  2009 HV_AAvR_tape1_2005.09.17_12-03-24.avi
-rw------- 1 andrev homevideo   14262369 Mar 17  2011 HV_AAvR_tape1_2005.09.17_12-03-24.mp4
-rw------- 1 andrev homevideo  166867072 May 18  2009 HV_AAvR_tape1_2005.09.17_12-06-26.avi
-rw------- 1 andrev homevideo    6902890 Mar 17  2011 HV_AAvR_tape1_2005.09.17_12-06-26.mp4
-rw------- 1 andrev homevideo  243552184 May 18  2009 HV_AAvR_tape1_2005.09.17_12-14-33.avi
-rw------- 1 andrev homevideo   10856049 Mar 17  2011 HV_AAvR_tape1_2005.09.17_12-14-33.mp4
-rw------- 1 andrev homevideo  118743864 May 18  2009 HV_AAvR_tape1_2005.09.17_12-17-14.avi
-rw------- 1 andrev homevideo    5787563 Mar 17  2011 HV_AAvR_tape1_2005.09.17_12-17-14.mp4
-rw------- 1 andrev homevideo  234071552 May 18  2009 HV_AAvR_tape1_2005.09.17_12-18-07.avi
-rw------- 1 andrev homevideo   11788201 Mar 17  2011 HV_AAvR_tape1_2005.09.17_12-18-07.mp4
-rw------- 1 andrev homevideo  173827536 May 18  2009 HV_AAvR_tape1_2005.09.17_12-20-23.avi
-rw------- 1 andrev homevideo    9896396 Mar 17  2011 HV_AAvR_tape1_2005.09.17_12-20-23.mp4
-rw------- 1 andrev homevideo  141785400 May 18  2009 HV_AAvR_tape1_2005.09.18_02-07-35.avi
-rw------- 1 andrev homevideo    9897012 Mar 17  2011 HV_AAvR_tape1_2005.09.18_02-07-35.mp4
-rw------- 1 andrev homevideo   71580720 May 18  2009 HV_AAvR_tape1_2005.09.18_05-13-16.avi
-rw------- 1 andrev homevideo    5270517 Mar 17  2011 HV_AAvR_tape1_2005.09.18_05-13-16.mp4
-rw------- 1 andrev homevideo  302596120 May 18  2009 HV_AAvR_tape1_2005.09.18_10-10-36.avi
-rw------- 1 andrev homevideo   29140063 Mar 17  2011 HV_AAvR_tape1_2005.09.18_10-10-36.mp4
-rw------- 1 andrev homevideo  426804400 May 18  2009 HV_AAvR_tape1_2005.09.18_13-50-39.avi
-rw------- 1 andrev homevideo   21033733 Mar 17  2011 HV_AAvR_tape1_2005.09.18_13-50-39.mp4
-rw------- 1 andrev homevideo  553316016 May 18  2009 HV_AAvR_tape1_2005.09.18_13-53-02.avi
-rw------- 1 andrev homevideo   29048261 Mar 17  2011 HV_AAvR_tape1_2005.09.18_13-53-02.mp4
-rw------- 1 andrev homevideo  147185760 May 18  2009 HV_AAvR_tape1_2005.09.18_13-56-46.avi
-rw------- 1 andrev homevideo    6236011 Mar 17  2011 HV_AAvR_tape1_2005.09.18_13-56-46.mp4
-rw------- 1 andrev homevideo  457646456 May 18  2009 HV_AAvR_tape1_2005.09.18_15-32-07.avi
-rw------- 1 andrev homevideo   21474790 Mar 17  2011 HV_AAvR_tape1_2005.09.18_15-32-07.mp4
-rw------- 1 andrev homevideo  205509648 May 18  2009 HV_AAvR_tape1_2005.09.19_05-56-19.avi
-rw------- 1 andrev homevideo   11446198 Mar 17  2011 HV_AAvR_tape1_2005.09.19_05-56-19.mp4
-rw------- 1 andrev homevideo  181148024 May 18  2009 HV_AAvR_tape1_2005.09.19_06-06-47.avi
-rw------- 1 andrev homevideo    9955789 Mar 17  2011 HV_AAvR_tape1_2005.09.19_06-06-47.mp4
-rw------- 1 andrev homevideo  261793400 May 18  2009 HV_AAvR_tape1_2005.09.19_14-36-53.avi
-rw------- 1 andrev homevideo   24481660 Mar 17  2011 HV_AAvR_tape1_2005.09.19_14-36-53.mp4
-rw------- 1 andrev homevideo  281354704 May 18  2009 HV_AAvR_tape1_2005.09.19_22-58-07.avi
-rw------- 1 andrev homevideo   16857561 Mar 17  2011 HV_AAvR_tape1_2005.09.19_22-58-07.mp4
-rw------- 1 andrev homevideo  161826736 May 18  2009 HV_AAvR_tape1_2005.09.22_03-12-56.avi
-rw------- 1 andrev homevideo    6471264 Mar 17  2011 HV_AAvR_tape1_2005.09.22_03-12-56.mp4
-rw------- 1 andrev homevideo   64620256 May 18  2009 HV_AAvR_tape1_2005.09.22_03-14-33.avi
-rw------- 1 andrev homevideo    1688487 Mar 17  2011 HV_AAvR_tape1_2005.09.22_03-14-33.mp4
-rw------- 1 andrev homevideo  107823136 May 18  2009 HV_AAvR_tape1_2005.09.22_03-14-55.avi
-rw------- 1 andrev homevideo    4279832 Mar 17  2011 HV_AAvR_tape1_2005.09.22_03-14-55.mp4
-rw------- 1 andrev homevideo  188828536 May 18  2009 HV_AAvR_tape1_2005.09.22_15-27-59.avi
-rw------- 1 andrev homevideo    6739064 Mar 17  2011 HV_AAvR_tape1_2005.09.22_15-27-59.mp4
-rw------- 1 andrev homevideo  149345904 May 18  2009 HV_AAvR_tape1_2005.09.23_04-04-44.avi
-rw------- 1 andrev homevideo    6822252 Mar 17  2011 HV_AAvR_tape1_2005.09.23_04-04-44.mp4
-rw------- 1 andrev homevideo  182228096 May 18  2009 HV_AAvR_tape1_2005.09.23_04-05-46.avi
-rw------- 1 andrev homevideo    7142499 Mar 17  2011 HV_AAvR_tape1_2005.09.23_04-05-46.mp4
-rw------- 1 andrev homevideo   93542184 May 18  2009 HV_AAvR_tape1_2005.09.23_04-11-08.avi
-rw------- 1 andrev homevideo    3926618 Mar 17  2011 HV_AAvR_tape1_2005.09.23_04-11-08.mp4
-rw------- 1 andrev homevideo   73620856 May 18  2009 HV_AAvR_tape1_2005.09.23_04-12-22.avi
-rw------- 1 andrev homevideo    3437183 Mar 17  2011 HV_AAvR_tape1_2005.09.23_04-12-22.mp4
-rw------- 1 andrev homevideo  176467712 May 18  2009 HV_AAvR_tape1_2005.09.23_15-43-16.avi
-rw------- 1 andrev homevideo    7343042 Mar 17  2011 HV_AAvR_tape1_2005.09.23_15-43-16.mp4
-rw------- 1 andrev homevideo  259753264 May 18  2009 HV_AAvR_tape1_2005.09.24_17-44-23.avi
-rw------- 1 andrev homevideo   31391042 Mar 17  2011 HV_AAvR_tape1_2005.09.24_17-44-23.mp4
-rw------- 1 andrev homevideo  371360704 May 18  2009 HV_AAvR_tape1_2005.09.24_19-10-47.avi
-rw------- 1 andrev homevideo   24319601 Mar 17  2011 HV_AAvR_tape1_2005.09.24_19-10-47.mp4
-rw------- 1 andrev homevideo   32578120 May 18  2009 HV_AAvR_tape1_2005.09.24_19-12-44.avi
-rw------- 1 andrev homevideo    1839234 Mar 17  2011 HV_AAvR_tape1_2005.09.24_19-12-44.mp4
-rw------- 1 andrev homevideo  157386440 May 18  2009 HV_AAvR_tape1_2005.09.24_19-30-14.avi
-rw------- 1 andrev homevideo    9667439 Mar 17  2011 HV_AAvR_tape1_2005.09.24_19-30-14.mp4
-rw------- 1 andrev homevideo  284594920 May 18  2009 HV_AAvR_tape1_2005.09.24_19-46-44.avi
-rw------- 1 andrev homevideo   17558541 Mar 17  2011 HV_AAvR_tape1_2005.09.24_19-46-44.mp4
-rw------- 1 andrev homevideo  353959544 May 18  2009 HV_AAvR_tape1_2005.09.24_19-54-03.avi
-rw------- 1 andrev homevideo   23731023 Mar 17  2011 HV_AAvR_tape1_2005.09.24_19-54-03.mp4
-rw------- 1 andrev homevideo   83461512 May 18  2009 HV_AAvR_tape1_2005.09.24_20-05-35.avi
-rw------- 1 andrev homevideo    4833694 Mar 17  2011 HV_AAvR_tape1_2005.09.24_20-05-35.mp4
-rw------- 1 andrev homevideo   48179160 May 18  2009 HV_AAvR_tape1_2005.09.24_20-09-03.avi
-rw------- 1 andrev homevideo    2935048 Mar 17  2011 HV_AAvR_tape1_2005.09.24_20-09-03.mp4
-rw------- 1 andrev homevideo  115023616 May 18  2009 HV_AAvR_tape1_2005.09.25_07-45-45.avi
-rw------- 1 andrev homevideo    5546529 Mar 17  2011 HV_AAvR_tape1_2005.09.25_07-45-45.mp4
-rw------- 1 andrev homevideo   70380640 May 18  2009 HV_AAvR_tape1_2005.09.25_11-50-00.avi
-rw------- 1 andrev homevideo    6646758 Mar 17  2011 HV_AAvR_tape1_2005.09.25_11-50-00.mp4
-rw------- 1 andrev homevideo  113943544 May 18  2009 HV_AAvR_tape1_2005.09.25_11-50-41.avi
-rw------- 1 andrev homevideo    8622762 Mar 17  2011 HV_AAvR_tape1_2005.09.25_11-50-41.mp4
-rw------- 1 andrev homevideo  150665992 May 18  2009 HV_AAvR_tape1_2005.09.25_11-58-13.avi
-rw------- 1 andrev homevideo   12072661 Mar 17  2011 HV_AAvR_tape1_2005.09.25_11-58-13.mp4
-rw------- 1 andrev homevideo   81421376 May 18  2009 HV_AAvR_tape1_2005.09.25_12-00-03.avi
-rw------- 1 andrev homevideo    5884723 Mar 17  2011 HV_AAvR_tape1_2005.09.25_12-00-03.mp4
-rw------- 1 andrev homevideo  571797248 May 18  2009 HV_AAvR_tape1_2005.09.25_12-00-51.avi
-rw------- 1 andrev homevideo   38455862 Mar 17  2011 HV_AAvR_tape1_2005.09.25_12-00-51.mp4
-rw------- 1 andrev homevideo  148745864 May 18  2009 HV_AAvR_tape1_2005.09.25_13-21-13.avi
-rw------- 1 andrev homevideo   14229469 Mar 17  2011 HV_AAvR_tape1_2005.09.25_13-21-13.mp4
-rw------- 1 andrev homevideo  137825136 May 18  2009 HV_AAvR_tape1_2005.09.25_13-23-16.avi
-rw------- 1 andrev homevideo   15609552 Mar 17  2011 HV_AAvR_tape1_2005.09.25_13-23-16.mp4
-rw------- 1 andrev homevideo 1714440256 May 18  2009 HV_AAvR_tape1_2005.09.25_13-56-47.avi
-rw------- 1 andrev homevideo  152027789 Mar 17  2011 HV_AAvR_tape1_2005.09.25_13-56-47.mp4
-rw------- 1 andrev homevideo  499552432 May 18  2009 HV_AAvR_tape1_2005.09.25_14-04-49.avi
-rw------- 1 andrev homevideo   42478994 Mar 17  2011 HV_AAvR_tape1_2005.09.25_14-04-49.mp4
-rw------- 1 andrev homevideo  764290080 May 18  2009 HV_AAvR_tape1_2005.09.25_14-07-14.avi
-rw------- 1 andrev homevideo   49606051 Mar 17  2011 HV_AAvR_tape1_2005.09.25_14-07-14.mp4
-rwx--x--- 1 andrev homevideo       7823 Aug 30  2009 HV_AAvR_tape1_completeWithTitles_16IX-25IX2005.kino

and the peer directory

andrev@odhc1:/srv/dev-disk-by-label-Seagate_4TB/av$ ll Camcorder/HV_AAvR_tape1_16IX-25IX2005/
total 13236196
-rw-r--r-- 1 andrev users   39480000 Aug 30  2009 001.kinofx.dv
-rw-r--r-- 1 andrev users   28800000 Aug 30  2009 003.kinofx.dv
-rw-r--r-- 1 andrev users   28800000 Aug 30  2009 004.kinofx.dv
-rw-r--r-- 1 andrev users   28800000 Aug 30  2009 005.kinofx.dv
-rw-r--r-- 1 andrev users   28800000 Aug 30  2009 006.kinofx.dv
-rw-r--r-- 1 andrev users   28800000 Aug 30  2009 007.kinofx.dv
-rw-r--r-- 1 andrev users   28800000 Aug 30  2009 008.kinofx.dv
-rw-r--r-- 1 andrev users   28800000 Aug 30  2009 009.kinofx.dv
-rw-r--r-- 1 andrev users   28800000 Aug 30  2009 010.kinofx.dv
-rw-r--r-- 1 andrev users  304996280 May 18  2009 HV_AAvR_tape1_2005.09.17_07-14-53.avi
-rw-r--r-- 1 andrev users  209589920 May 18  2009 HV_AAvR_tape1_2005.09.17_07-52-08.avi
-rw-r--r-- 1 andrev users  128104488 May 18  2009 HV_AAvR_tape1_2005.09.17_07-53-22.avi
-rw-r--r-- 1 andrev users  124024216 May 18  2009 HV_AAvR_tape1_2005.09.17_07-54-53.avi
-rw-r--r-- 1 andrev users  178747864 May 18  2009 HV_AAvR_tape1_2005.09.17_07-59-35.avi
-rw-r--r-- 1 andrev users  383601520 May 18  2009 HV_AAvR_tape1_2005.09.17_08-28-17.avi
-rw-r--r-- 1 andrev users  473007480 May 18  2009 HV_AAvR_tape1_2005.09.17_08-43-38.avi
-rw-r--r-- 1 andrev users  192668792 May 18  2009 HV_AAvR_tape1_2005.09.17_12-02-11.avi
-rw-r--r-- 1 andrev users  322877472 May 18  2009 HV_AAvR_tape1_2005.09.17_12-03-24.avi
-rw-r--r-- 1 andrev users  166867072 May 18  2009 HV_AAvR_tape1_2005.09.17_12-06-26.avi
-rw-r--r-- 1 andrev users  243552184 May 18  2009 HV_AAvR_tape1_2005.09.17_12-14-33.avi
-rw-r--r-- 1 andrev users  118743864 May 18  2009 HV_AAvR_tape1_2005.09.17_12-17-14.avi
-rw-r--r-- 1 andrev users  234071552 May 18  2009 HV_AAvR_tape1_2005.09.17_12-18-07.avi
-rw-r--r-- 1 andrev users  173827536 May 18  2009 HV_AAvR_tape1_2005.09.17_12-20-23.avi
-rw-r--r-- 1 andrev users  141785400 May 18  2009 HV_AAvR_tape1_2005.09.18_02-07-35.avi
-rw-r--r-- 1 andrev users  302596120 May 18  2009 HV_AAvR_tape1_2005.09.18_10-10-36.avi
-rw-r--r-- 1 andrev users  426804400 May 18  2009 HV_AAvR_tape1_2005.09.18_13-50-39.avi
-rw-r--r-- 1 andrev users  553316016 May 18  2009 HV_AAvR_tape1_2005.09.18_13-53-02.avi
-rw-r--r-- 1 andrev users  147185760 May 18  2009 HV_AAvR_tape1_2005.09.18_13-56-46.avi
-rw-r--r-- 1 andrev users  457646456 May 18  2009 HV_AAvR_tape1_2005.09.18_15-32-07.avi
-rw-r--r-- 1 andrev users  205509648 May 18  2009 HV_AAvR_tape1_2005.09.19_05-56-19.avi
-rw-r--r-- 1 andrev users  181148024 May 18  2009 HV_AAvR_tape1_2005.09.19_06-06-47.avi
-rw-r--r-- 1 andrev users  261793400 May 18  2009 HV_AAvR_tape1_2005.09.19_14-36-53.avi
-rw-r--r-- 1 andrev users  281354704 May 18  2009 HV_AAvR_tape1_2005.09.19_22-58-07.avi
-rw-r--r-- 1 andrev users  161826736 May 18  2009 HV_AAvR_tape1_2005.09.22_03-12-56.avi
-rw-r--r-- 1 andrev users  107823136 May 18  2009 HV_AAvR_tape1_2005.09.22_03-14-55.avi
-rw-r--r-- 1 andrev users  188828536 May 18  2009 HV_AAvR_tape1_2005.09.22_15-27-59.avi
-rw-r--r-- 1 andrev users  149345904 May 18  2009 HV_AAvR_tape1_2005.09.23_04-04-44.avi
-rw-r--r-- 1 andrev users  182228096 May 18  2009 HV_AAvR_tape1_2005.09.23_04-05-46.avi
-rw-r--r-- 1 andrev users   93542184 May 18  2009 HV_AAvR_tape1_2005.09.23_04-11-08.avi
-rw-r--r-- 1 andrev users  176467712 May 18  2009 HV_AAvR_tape1_2005.09.23_15-43-16.avi
-rw-r--r-- 1 andrev users  259753264 May 18  2009 HV_AAvR_tape1_2005.09.24_17-44-23.avi
-rw-r--r-- 1 andrev users  371360704 May 18  2009 HV_AAvR_tape1_2005.09.24_19-10-47.avi
-rw-r--r-- 1 andrev users  157386440 May 18  2009 HV_AAvR_tape1_2005.09.24_19-30-14.avi
-rw-r--r-- 1 andrev users  284594920 May 18  2009 HV_AAvR_tape1_2005.09.24_19-46-44.avi
-rw-r--r-- 1 andrev users  353959544 May 18  2009 HV_AAvR_tape1_2005.09.24_19-54-03.avi
-rw-r--r-- 1 andrev users   83461512 May 18  2009 HV_AAvR_tape1_2005.09.24_20-05-35.avi
-rw-r--r-- 1 andrev users  115023616 May 18  2009 HV_AAvR_tape1_2005.09.25_07-45-45.avi
-rw-r--r-- 1 andrev users  113943544 May 18  2009 HV_AAvR_tape1_2005.09.25_11-50-41.avi
-rw-r--r-- 1 andrev users  150665992 May 18  2009 HV_AAvR_tape1_2005.09.25_11-58-13.avi
-rw-r--r-- 1 andrev users   81421376 May 18  2009 HV_AAvR_tape1_2005.09.25_12-00-03.avi
-rw-r--r-- 1 andrev users  571797248 May 18  2009 HV_AAvR_tape1_2005.09.25_12-00-51.avi
-rw-r--r-- 1 andrev users  148745864 May 18  2009 HV_AAvR_tape1_2005.09.25_13-21-13.avi
-rw-r--r-- 1 andrev users  137825136 May 18  2009 HV_AAvR_tape1_2005.09.25_13-23-16.avi
-rw-r--r-- 1 andrev users 1714440256 May 18  2009 HV_AAvR_tape1_2005.09.25_13-56-47.avi
-rw-r--r-- 1 andrev users  152027789 Mar 17  2011 HV_AAvR_tape1_2005.09.25_13-56-47.mp4
-rw-r--r-- 1 andrev users  499552432 May 18  2009 HV_AAvR_tape1_2005.09.25_14-04-49.avi
-rw-r--r-- 1 andrev users  764290080 May 18  2009 HV_AAvR_tape1_2005.09.25_14-07-14.avi
-rw-r--r-- 1 andrev users   49606051 Mar 17  2011 HV_AAvR_tape1_2005.09.25_14-07-14.mp4

andrev@odhc1:/srv/dev-disk-by-label-Seagate_4TB/av$ du -sh HomeVideo/HV_AAvR_tape1_16IX-25IX2005/
14G HomeVideo/HV_AAvR_tape1_16IX-25IX2005/
andrev@odhc1:/srv/dev-disk-by-label-Seagate_4TB/av$ du -sh Camcorder/HV_AAvR_tape1_16IX-25IX2005/
13G Camcorder/HV_AAvR_tape1_16IX-25IX2005/

My suspicion is that allocated memory is not released somewhere in the code.
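To confirm that resident memory really keeps growing, rather than relying on snapshots from htop, the process can be sampled through /proc. This is a minimal Python sketch, assuming Linux and assuming the `-u` value is meant in MiB; the helper names `rss_kib` and `exceeds_limit` are hypothetical.

```python
import re


def rss_kib(status_text):
    """Extract the resident set size in KiB from the text of
    /proc/<pid>/status (Linux). Returns None if no VmRSS line exists."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def exceeds_limit(rss_in_kib, limit_mib):
    """True if the resident set size is above a limit given in MiB."""
    return rss_in_kib > limit_mib * 1024
```

Reading `/proc/<pid>/status` every few seconds and passing its text to `rss_kib` gives a series that either levels off or keeps climbing; `exceeds_limit(value, 64)` then flags samples above the 64M passed to `-u` in the run above.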

Blindfreddy commented 5 years ago

Some more info: the problem does not occur in paranoid mode (-p), so it seems to be associated with the hashing. Maybe in some circumstances memory is not released after reading and hashing a file?

top - 11:48:22 up 2 days,  2:41,  5 users,  load average: 2.85, 1.91, 10.31
Tasks: 242 total,   1 running, 151 sleeping,   0 stopped,   1 zombie
%Cpu(s):  4.6 us,  3.3 sy,  0.0 ni, 81.6 id, 10.3 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem :  2044844 total,    43520 free,   699964 used,  1301360 buff/cache
KiB Swap:  1022400 total,   595740 free,   426660 used.  1281712 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                       
16250 andrev    20   0  302872 222924   4068 S  53.8 10.9   3:09.29 rmlint -p -u 128M HomeVideo/HV_AAvR_tape1_16IX-25IX2005/ Camcorder/HV_AAvR_tape1_16IX-25IX200+
   64 root      20   0       0      0      0 S   5.6  0.0 186:45.58 [kswapd0]                                                                                     
12921 andrev    20   0    5096   1916   1164 S   2.0  0.1   1:58.33 htop                                                                                          
14681 plex      20   0  423212 101704  19960 S   1.3  5.0 127:01.06 /usr/lib/plexmediaserver/Plex Media Server                                                    

Here is the output:

rmlint -p -u 128M HomeVideo/HV_AAvR_tape1_16IX-25IX2005/ Camcorder/HV_AAvR_tape1_16IX-25IX2005/

# Duplicate(s):
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_14-07-14.mp4'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_14-07-14.mp4'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_13-56-47.mp4'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_13-56-47.mp4'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_07-52-08.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_07-52-08.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.19_06-06-47.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.19_06-06-47.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_07-53-22.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_07-53-22.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_07-14-53.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_07-14-53.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_07-54-53.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_07-54-53.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.18_10-10-36.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.18_10-10-36.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_07-59-35.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_07-59-35.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.18_13-56-46.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.18_13-56-46.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_12-02-11.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_12-02-11.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_08-28-17.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_08-28-17.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_12-06-26.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_12-06-26.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.19_14-36-53.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.19_14-36-53.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_08-43-38.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_08-43-38.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_12-03-24.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_12-03-24.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_12-17-14.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_12-17-14.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_12-14-33.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_12-14-33.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_12-20-23.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_12-20-23.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.18_02-07-35.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.18_02-07-35.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.17_12-18-07.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.17_12-18-07.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.19_05-56-19.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.19_05-56-19.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.18_13-50-39.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.18_13-50-39.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.19_22-58-07.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.19_22-58-07.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.24_19-46-44.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.24_19-46-44.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.18_15-32-07.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.18_15-32-07.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.18_13-53-02.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.18_13-53-02.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.22_03-12-56.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.22_03-12-56.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.22_03-14-55.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.22_03-14-55.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.22_15-27-59.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.22_15-27-59.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.23_04-11-08.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.23_04-11-08.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.23_04-04-44.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.23_04-04-44.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_13-23-16.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_13-23-16.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.23_04-05-46.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.23_04-05-46.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.23_15-43-16.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.23_15-43-16.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.24_19-30-14.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.24_19-30-14.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.24_17-44-23.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.24_17-44-23.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.24_20-05-35.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.24_20-05-35.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.24_19-10-47.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.24_19-10-47.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_07-45-45.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_07-45-45.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.24_19-54-03.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.24_19-54-03.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_11-50-41.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_11-50-41.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_11-58-13.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_11-58-13.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_12-00-03.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_12-00-03.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_14-04-49.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_14-04-49.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/001.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/001.kinofx.dv'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_13-21-13.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_13-21-13.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_12-00-51.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_12-00-51.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_14-07-14.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_14-07-14.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_16IX-25IX2005_files/HV_AAvR_tape1_2005.09.25_13-56-47.avi'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/HV_AAvR_tape1_2005.09.25_13-56-47.avi'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/009.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/009.kinofx.dv'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/007.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/007.kinofx.dv'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/008.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/008.kinofx.dv'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/004.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/004.kinofx.dv'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/005.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/005.kinofx.dv'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/010.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/010.kinofx.dv'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/003.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/003.kinofx.dv'
    ls '/srv/dev-disk-by-label-Seagate_4TB/av/HomeVideo/HV_AAvR_tape1_16IX-25IX2005/006.kinofx.dv'
    rm '/srv/dev-disk-by-label-Seagate_4TB/av/Camcorder/HV_AAvR_tape1_16IX-25IX2005/006.kinofx.dv'

==> Note: Please use the saved script below for removal, not the above output.
==> In total 188 files, whereof 58 are duplicates in 58 groups.
==> This equals 12.62 GB of duplicates which could be removed.
==> Scanning took in total 6m 16.306s. Is that good enough?

Wrote a json file to: /srv/dev-disk-by-label-Seagate_4TB/av/rmlint.json
Wrote a sh file to: /srv/dev-disk-by-label-Seagate_4TB/av/rmlint.sh

BTW in paranoia mode it also uses less CPU - is that intended ?

Same results when repeating with -P, so hashing with highway256

top - 11:59:35 up 2 days,  2:52,  5 users,  load average: 4.43, 3.29, 6.47
Tasks: 244 total,   1 running, 152 sleeping,   0 stopped,   1 zombie
%Cpu(s): 10.2 us,  2.6 sy,  0.0 ni, 72.7 id, 14.2 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem :  2044844 total,    45524 free,   528524 used,  1470796 buff/cache
KiB Swap:  1022400 total,   609052 free,   413348 used.  1453056 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                       
17081 andrev    20   0  108312  25656   2400 S  88.8  1.3   9:15.50 rmlint                                                                                        
   64 root      20   0       0      0      0 S   2.6  0.0 187:09.69 kswapd0                                                                                       
14681 plex      20   0  455980 111948  19936 S   2.6  5.5 127:28.52 Plex Media Serv                                                                               
17686 plex      20   0   93960  39496  26036 D   2.6  1.9   0:03.75 Plex Media Scan               

==> Note: Please use the saved script below for removal, not the above output.
==> In total 188 files, whereof 58 are duplicates in 58 groups.
==> This equals 12.62 GB of duplicates which could be removed.
==> Scanning took in total 7m 27.390s. Is that good enough?

I'd say somewhere in the blake2b hashing section memory is not released - unfortunately I don't have a system with >2GB memory to confirm that it keeps growing....

Blindfreddy commented 5 years ago

More triaging:

Different directory with lots of files and duplicates, using blake2b:

1st run: killed by oom_reaper.
2nd run: 1.5GB memory consumption after 32 minutes, manually stopped.

==> Early shutdown, probably not all lint was found.
==> Note: Please use the saved script below for removal, not the above output.
==> In total 235921 files, whereof 66065 are duplicates in 46646 groups.
==> This equals 23.32 GB of duplicates which could be removed.
==> 2346 other suspicious item(s) found, which may vary in size.
==> Scanning took in total 32m 22.139s. Is that good enough?

3rd run: -P flag set, so a different hash algorithm was used. It ran to completion in less time; see the result below:

==> Note: Please use the saved script below for removal, not the above output.
==> In total 235921 files, whereof 66066 are duplicates in 46647 groups.
==> This equals 26.77 GB of duplicates which could be removed.
==> 2346 other suspicious item(s) found, which may vary in size.
==> Scanning took in total 22m 33.214s. Is that good enough?

The good news is that -P is a workaround for large memory consumption / out of memory conditions, the bad news is that the blake2b algorithm seems to use LOTS of memory.

Hope this helps to find the culprit in the code, and helps other poor souls who are encountering the same issue.

sahib commented 5 years ago

Thanks, that's very helpful. I'll try to get to it in the next days.

From a very quick look (I might be totally wrong) it seems to be related to 32 bit: I don't have that issue here and valgrind doesn't show anything suspicious. But definitely a serious issue nonetheless.

Blindfreddy commented 5 years ago

Ok, hope you find the bug quickly. FYI, I compiled the executable from source. I have a Raspberry Pi and also a 64-bit x86 laptop, so I could compile the code on them and run it against the same data (I kept a data set) to confirm or refute that it's platform dependent, if that would help.

sahib commented 5 years ago

That would help. I usually don't have a lot of time during the week, so it might take a bit until the fix.

Blindfreddy commented 5 years ago

No problem, take your time, the -P flag seems a good workaround.

sahib commented 5 years ago

Hmm, I have a hard time reproducing the issue.

I've set up a RaPi (32-bit, ARM architecture) and copied /usr to have a few files to munch upon:

So far, this fits with the intended use cases of each switch.

By the way, what version are you using (rmlint --version)? Also, please post scons config.

Blindfreddy commented 5 years ago

It doesn't occur straight away - have you got enough files to munch on? The pattern seems to be that it allocates around 100-200MB at a time, releases ca. 50MB of that after a while, stays steady for a short time, and then repeats until the oom_reaper kills it.
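A pattern like this is easier to compare between machines when it is logged rather than read off htop. The following is a minimal sketch, assuming a Linux system where `/proc/<pid>/status` is available; the function names `parse_vmrss_kib` and `sample_rss` are made up for illustration and are not part of rmlint.

```python
import re
import time
from pathlib import Path
from typing import List, Optional


def parse_vmrss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid: int, interval_s: float = 5.0, samples: int = 12) -> List[int]:
    """Poll the resident set size of `pid` and return the readings in KiB."""
    readings: List[int] = []
    for _ in range(samples):
        try:
            text = Path(f"/proc/{pid}/status").read_text()
        except (FileNotFoundError, ProcessLookupError):
            break  # the process exited (or was killed) in the meantime
        rss = parse_vmrss_kib(text)
        if rss is not None:
            readings.append(rss)
            print(f"{time.strftime('%H:%M:%S')}  RSS = {rss / 1024:.1f} MiB")
        time.sleep(interval_s)
    return readings
```

Calling `sample_rss(pid)` with the PID of a running rmlint process prints one reading per interval, which should make a grow, release, grow pattern visible as a staircase.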

version:

rmlint --version
version 2.7.0 compiled: Sep 21 2018 at [09:54:54] "Toothless Taipan" (rev 017b007e)
compiled with: +mounts +nonstripped +fiemap +sha512 +bigfiles +intl +replay +xattr +btrfs-support

scons output

scons: Reading SConscript files ...
Checking whether the C compiler works... (cached) yes
Checking for git revision... (cached) 017b007e
Checking for pkg-config... (cached) yes
Checking for glib-2.0 >= 2.32... (cached) yes
Checking for gio-unix-2.0... (cached) yes
Checking for blkid... (cached) yes
Checking for json-glib-1.0... (cached) yes
Checking for -std=c11 support...(cached) yes
Checking for cygwin environment...(cached) Linux/odhc1/4.14.69-odroidxu4/#34 SMP PREEMPT Wed Sep 19 12:45:24 CEST 2018/armv7l/(cached) no
Checking whether _mm_crc32_u64 is declared... (cached) no
Checking whether __builtin_cpu_supports is declared... (cached) no
Checking whether blkid_devno_to_wholedisk is declared... (cached) yes
Checking for existence of /sys/block... (cached) yes
Checking for C header file libelf.h... (cached) yes
Checking for C library libelf... (cached) yes
Checking for C type struct fiemap... (cached) yes
Checking for C function getxattr()... (cached) yes
Checking for C function setxattr()... (cached) yes
Checking for C function removexattr()... (cached) yes
Checking size of off_t ... (cached) yes
Checking for C function stat64()... (cached) yes
Checking whether G_CHECKSUM_SHA512 is declared... (cached) yes
Checking for C header file locale.h... (cached) yes
Checking for C header file linux/limits.h... (cached) yes
Checking whether posix_fadvise is declared... (cached) yes
Checking whether faccessat is declared... (cached) yes
Checking whether AT_FDCWD is declared... (cached) yes
Checking for C header file linux/btrfs.h... (cached) yes
Checking for C header file linux/fs.h... (cached) yes
Checking for C header file sys/utsname.h... (cached) yes
Checking for C header file sys/sysmacros.h... (cached) yes
Using compiler optimisation -Os (to change, run scons with O=[0|1|2|3|s|fast])
Running with --jobs=8
Building rmlint
scons: done reading SConscript files.
scons: Building targets ...
build_config_template(["lib/config.h"], ["lib/config.h.in"])
Compiling ==> lib/checksums/highwayhash.c
Compiling ==> lib/checksums/murmur3.c
Compiling ==> lib/checksums/xxhash/xxhash.c
Compiling ==> lib/checksums/blake2/blake2b-ref.c
Compiling ==> lib/checksums/blake2/blake2bp-ref.c
Compiling ==> lib/checksums/blake2/blake2s-ref.c
Compiling ==> lib/checksums/blake2/blake2sp-ref.c
Compiling ==> lib/checksum.c
Compiling ==> lib/pathtricia.c
Compiling ==> lib/shredder.c
Compiling ==> lib/traverse.c
Compiling ==> lib/xattr.c
Compiling ==> lib/session.c
Compiling ==> lib/file.c
Compiling ==> lib/hasher.c
Compiling ==> lib/checksums/metrohash128.c
Compiling ==> lib/treemerge.c
In file included from lib/session.c:30:0:
lib/session.c: In function 'rm_session_dedupe_main':
lib/session.c:302:35: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'gint64 {aka long long int}' [-Wformat=]
                 rm_log_debug_line("Dropping to %lu byte chunks after %lu bytes",
                                   ^
lib/config.h:65:40: note: in definition of macro 'rm_log_debug'
     g_log("rmlint", G_LOG_LEVEL_DEBUG, __VA_ARGS__)
                                        ^~~~~~~~~~~
lib/session.c:302:17: note: in expansion of macro 'rm_log_debug_line'
                 rm_log_debug_line("Dropping to %lu byte chunks after %lu bytes",
                 ^~~~~~~~~~~~~~~~~
lib/session.c:302:35: warning: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'gint64 {aka long long int}' [-Wformat=]
                 rm_log_debug_line("Dropping to %lu byte chunks after %lu bytes",
                                   ^
lib/config.h:65:40: note: in definition of macro 'rm_log_debug'
     g_log("rmlint", G_LOG_LEVEL_DEBUG, __VA_ARGS__)
                                        ^~~~~~~~~~~
lib/session.c:302:17: note: in expansion of macro 'rm_log_debug_line'
                 rm_log_debug_line("Dropping to %lu byte chunks after %lu bytes",
                 ^~~~~~~~~~~~~~~~~
lib/session.c:321:28: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'gint64 {aka long long int}' [-Wformat=]
         rm_log_info_line(_("Only first %lu bytes deduped - files not fully identical"),
                            ^
lib/config.h:69:42: note: in definition of macro 'rm_log_warning'
     g_log("rmlint", G_LOG_LEVEL_WARNING, __VA_ARGS__)
                                          ^~~~~~~~~~~
lib/session.c:321:9: note: in expansion of macro 'rm_log_info_line'
         rm_log_info_line(_("Only first %lu bytes deduped - files not fully identical"),
         ^~~~~~~~~~~~~~~~
lib/session.c:321:26: note: in expansion of macro '_'
         rm_log_info_line(_("Only first %lu bytes deduped - files not fully identical"),
                          ^
Compiling ==> lib/preprocess.c
Compiling ==> lib/formats.c
Compiling ==> src/rmlint.c
Compiling ==> lib/md-scheduler.c
lib/session.c: At top level:
cc1: warning: unrecognized command line option '-Wno-implicit-fallthrough'
Compiling ==> lib/cfg.c
Compiling ==> lib/cmdline.c
Compiling ==> lib/utilities.c
Compiling ==> lib/replay.c
Compiling ==> lib/hash-utility.c
Compiling ==> lib/checksums/blake2/blake2xb-ref.c
Compiling ==> lib/checksums/blake2/blake2xs-ref.c
Compiling ==> lib/checksums/sha3/sha3.c
Compiling ==> lib/formats/_equal.c
Compiling ==> lib/formats/csv.c
Compiling ==> lib/formats/fdupes.c
Compiling ==> lib/formats/json.c
Compiling ==> lib/formats/null.c
Compiling ==> lib/formats/pretty.c
Compiling ==> lib/formats/progressbar.c
build_python_formatter(["lib/formats/py.c"], ["lib/formats/py.c.in"])
Compiling ==> lib/formats/py.c
build_sh_formatter(["lib/formats/sh.c"], ["lib/formats/sh.c.in"])
Compiling ==> lib/formats/stats.c
Compiling ==> lib/formats/summary.c
Compiling ==> lib/formats/timestamp.c
Compiling ==> lib/formats/uniques.c
Compiling ==> lib/fts/fts.c
scons: `docs/_build/man/rmlint.1' is up to date.
scons: `docs/rmlint.1.gz' is up to date.
Compiling ==> lib/formats/sh.c
Linking Static Library ==> librmlint.a
Ranlib Library ==> librmlint.a
Linking Program ==> rmlint
scons: done building targets.

sahib commented 5 years ago

It doesn't occur straight away - have you got enough files to munch on? The pattern seems to be that it allocates around 100-200MB at a time, releases ca. 50MB of that after a while, stays steady for a short time, and then repeats until the oom_reaper kills it.

Probably not enough. I'll do some more testing.

scons output

I actually wanted the output of scons config :smirk:

sahib commented 5 years ago

I have now completed a run with a bigger dataset and the memory consumption never exceeded 200MB with blake2b. Maybe I still need a more heterogeneous data set.

In the meantime I have a suspect: Can you run your test again in the following way:

G_DEBUG=all G_SLICE=always-malloc rmlint [your_options_here]

Also, from your --version output: you're not using the latest develop branch, which is advisable for bug-hunting. I don't think it makes a huge difference here, but it would be nice if you could update.

Blindfreddy commented 5 years ago

So running it that way immediately returns

Trace/breakpoint trap

sahib commented 5 years ago

So running it that way immediately returns Trace/breakpoint trap

Okay, that's pretty weird. Does it do the same when you run only this?

G_SLICE=always-malloc rmlint [your_options_here]

If that yields the same result, can you run with sha3? It uses a very similar memory allocation as blake2b:

rmlint -a sha3-256 [your_options_here_without_minus_a]

Blindfreddy commented 5 years ago

G_SLICE=always-malloc rmlint runs to completion without errors. Memory consumption peaks at 1.65GB with the file set I have - that's just below the limit where the oom reaper becomes active. I could add some more files to cause it to be killed by oom reaper if need be.

sahib commented 5 years ago

No need. I guess 1.65GB is not what the run with highway256 was taking (which is something around 25MB judging from your output?). The G_SLICE flag was supposed to override the special memory allocator that rmlint uses to allocate the many checksums it needs. Since it yielded a very similar result we can probably forget about this as the culprit.

The results you're getting are still very confusing to me though. On all systems I tried, it works fine. Probably means I'm missing something. It would be helpful if you could try out a few other checksums and see if they also have this problem:

rmlint -a sha3-256 [your_options_here_without_minus_a]

Thanks for your help.
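To get a rough feel for how much the choice of checksum matters on a given CPU, the algorithms can also be timed in isolation. This is only a sketch: it times Python's `hashlib` implementations, not the ones bundled with rmlint, so the absolute numbers will differ and only the relative ordering is of interest. The function name and its parameters are assumptions made for this example, and highway256 is left out because `hashlib` does not provide it.

```python
import hashlib
import time
from typing import Dict, Sequence


def hash_throughput_mib_s(algorithms: Sequence[str] = ("blake2b", "sha3_256", "sha1"),
                          buf_mib: int = 4, rounds: int = 8) -> Dict[str, float]:
    """Hash `rounds` buffers of `buf_mib` MiB per algorithm and return MiB/s."""
    buf = bytes(buf_mib * 1024 * 1024)  # zero-filled buffer; content does not affect speed
    results: Dict[str, float] = {}
    for name in algorithms:
        hasher = hashlib.new(name)
        start = time.perf_counter()
        for _ in range(rounds):
            hasher.update(buf)
        hasher.digest()
        elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
        results[name] = (buf_mib * rounds) / elapsed
    return results


if __name__ == "__main__":
    for name, speed in hash_throughput_mib_s().items():
        print(f"{name:10s} {speed:8.1f} MiB/s")
```

If the throughput of an algorithm comes out well below what the disk can deliver, hashing rather than I/O is the bottleneck for that algorithm on that machine.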

Blindfreddy commented 5 years ago

Maybe something strange with my build ? With sha3-256 it ran to completion using 40% of 2GB memory. Perhaps it might be easier if I set up an account on my machine ? It's reachable from the internet via ssh. I could copy the files to your home directory then you could run the tests yourself.

sahib commented 5 years ago

This would indeed be very helpful. Here is my public ssh key:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQChf6YvBhcTkLDNYil0zR9ojRD22ZUPOhNv06lS/mNjB8lJaB3R0TQmibkaqkeLn5WIAipMiJKGq7P2N25VTmvbnPbwAAxuktu7Pfrglujmj9Kk3M7I+fKAXKJvWdh5/OSfNj3vM12OpbehvB0jsCi4atwpx7csN9kZ8+SmV1QppwoRoRqKaq99t0YuDWRzfHOPEH9s7PzUstTpy1oQbJM6Ih3NTaC8H74Y2oebgSV90B61fVe/yoAmNdRfJl1iZ9LvdKsRQ2fa9J6prCyh5JXVBQ9dZk+WSWdiFgFmZQ2NrKy30s/4yToGoOL4ITzdkZCCBslUH3rcQWIEv2W8Iak/ sahib@online.de
Blindfreddy commented 5 years ago

Ok so I have created the user chrispahl on blindfreddy.no-ip.biz and put your public key in the authorized_keys file, please try to log in. To munch on some files please use the directory /srv/dev-disk-by-label-Seagate_4TB/rmlint, where you'll find two folders, Camcorder and tmp.
Then run rmlint -u 64M Camcorder tmp and you should be able to observe the large memory consumption.

sahib commented 5 years ago

Hi @Blindfreddy, the login worked perfectly. Thank you. I have one request: Can you install the valgrind memory profiler and gdb debugger for me?

Blindfreddy commented 5 years ago

Ok great. valgrind and gdb are now installed.

sahib commented 5 years ago

After some misguided debugging during my vacation I finally have a somewhat plausible explanation of what happens.

Your system triggers an edge case where the filesystem reads in data faster than your CPU can hash it. Normally this is no problem, since most users can survive such short-lived memory peaks. Your system has low memory, no swap and is also ARM, which has abysmal performance when it comes to computing blake2 hashes.

This also explains why highway256 works for you: it's simply faster and hashes buffers quickly enough that they do not build up a backlog. rmlint should be changed to block reading when the backlog is getting too big. This is more of a design change though and might take a bit of time. In the meantime you can do the following:

Edit: This also means that this issue is not related to ARM itself. It could happen anywhere a slow CPU is combined with fast I/O.
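The idea of blocking the reader once the backlog gets too big can be illustrated with a bounded queue. This is a minimal sketch of the general technique, not rmlint's code (rmlint is written in C); the function name `hash_with_backpressure` and the `max_backlog` parameter are made up for this example.

```python
import hashlib
import queue
import threading
from typing import Iterable, Tuple

_DONE = object()  # sentinel marking the end of the stream


def hash_with_backpressure(chunks: Iterable[bytes], max_backlog: int = 8) -> Tuple[str, int]:
    """Hash `chunks` on a worker thread while queueing at most `max_backlog` buffers.

    Returns the hex digest and the largest backlog that was observed.
    """
    backlog: "queue.Queue" = queue.Queue(maxsize=max_backlog)
    hasher = hashlib.blake2b()

    def consume() -> None:
        # Slow side: take buffers off the queue and hash them one by one.
        while True:
            item = backlog.get()
            if item is _DONE:
                return
            hasher.update(item)

    worker = threading.Thread(target=consume)
    worker.start()

    peak = 0
    for chunk in chunks:
        # Fast side: put() blocks while the queue is full, so the reader
        # waits for the hasher instead of piling up buffers in memory.
        backlog.put(chunk)
        peak = max(peak, backlog.qsize())
    backlog.put(_DONE)
    worker.join()
    return hasher.hexdigest(), peak
```

With an unbounded queue the memory held in unhashed buffers is limited only by how far the reader can get ahead. With `maxsize` set, it is capped at roughly `max_backlog` times the buffer size, at the cost of the reader occasionally stalling.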

Blindfreddy commented 5 years ago

Ok that makes sense. I'll try out the --buffered-read next time - unless a new version with the fix is released first :-) .

Many thanks for finding the fault and hopefully this thread helps someone else.

sahib commented 5 years ago

I actually implemented a fix as of c19db0ab8fcec8d8196be7312a7d4fb6309184ce (see the commit description for some details). I already tested it on the system in question: the memory consumption dropped quite a bit and, more importantly, it remained constant.

@Blindfreddy: You're welcome. Also thanks for taking the time to report and letting me play around on your system. You can remove my ssh key again now.