sahib / rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem
http://rmlint.rtfd.org
GNU General Public License v3.0

High CPU usage and cannot complete #299

Closed saintger closed 6 years ago

saintger commented 6 years ago

Hello,

I am running the develop version (version 2.6.2 compiled: Aug 14 2018 at [06:54:35] "Penetrating Pineapple" (rev 888b8e2)) on a Btrfs filesystem with over 5 million files on 2TB. I used the following command line:

rmlint --types="duplicates" --hidden --config=sh:handler=clone --no-hardlinked --algorithm=xxhash --progress /home

After 6-8 hours of progress, and with only 40GB remaining to scan, progress stops completely and "top" reports that rmlint is consuming 100% of the CPU. After 8 hours at 100% CPU usage, I killed rmlint.

Unfortunately I cannot use the timestamp filtering (I cannot rely on mtime because some duplicates are created with mtime in the past) and I would prefer not to store the checksum in the xattr.

I was wondering whether the checksumming could be avoided entirely on Btrfs, because Btrfs checks whether the files are identical anyway before cloning/reflinking. So instead of calculating a checksum, in theory we could treat files with the same size as duplicates and try to clone them. Btrfs would automatically ignore files which are not real duplicates.
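The size-only idea above can be sketched roughly as follows. This is a hypothetical illustration, not an rmlint feature: `group_by_size` is an invented helper that only builds the candidate sets and does not clone anything. One caveat worth stating: whether the filesystem verifies content depends on the call that is used. The Linux dedupe ioctl (FIDEDUPERANGE) compares the byte ranges before sharing them, whereas a plain reflink copy does not compare anything, so only the former could stand in for a checksum.

```python
import os
import stat
from collections import defaultdict


def group_by_size(root, min_size=1):
    """Return lists of regular files under `root` that share the same size.

    Hypothetical helper: equal size only makes files *candidates*; the
    content still has to be verified by whatever performs the dedupe.
    """
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(info.st_mode) and info.st_size >= min_size:
                by_size[info.st_size].append(path)
    # Only sizes that occur more than once can contain duplicates.
    return [sorted(paths) for paths in by_size.values() if len(paths) > 1]
```

Every candidate pair would still have to be read in full by the kernel to be compared, so this trades hashing time for comparison time rather than removing the I/O.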

Is it possible to do that with rmlint? Is there any other way to make rmlint work with these 5 million files?

Thanks

SeeSpotRun commented 6 years ago

5 million files should be doable depending how much RAM you have. What does free show? If you run out of RAM then performance drops off drastically due to swap. Maybe consider limiting the run to files larger than 2048 bytes (-s 2048). Files smaller than this are generally stored as inline extents on btrfs, which means there is no benefit in cloning duplicates.

saintger commented 6 years ago

There must be something going on. I tried again but this time with only a subset (90 000 files, 40 GB) and just before completion (400 MB remaining) it is stuck again. free shows:

             total       used       free     shared    buffers     cached
Mem:       3731512    3599784     131728      40888         48    3195908
-/+ buffers/cache:     403828    3327684
Swap:      3898364     425388    3472976
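As a small worked check of the free output above: most of the "used" memory is page cache, which the kernel can drop on demand. The sketch below recomputes the "-/+ buffers/cache" row from the "Mem:" row (all values in KiB). The function name is made up for illustration.

```python
def buffers_cache_row(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row of old-style `free` output (KiB)."""
    used_by_programs = used - buffers - cached   # memory that cannot simply be dropped
    effectively_free = free + buffers + cached   # free plus reclaimable cache
    return used_by_programs, effectively_free


# Values copied from the "Mem:" row above.
print(buffers_cache_row(used=3599784, free=131728, buffers=48, cached=3195908))
# prints (403828, 3327684)
```

Roughly 394 MiB is in use by programs, so the figures do not suggest that the machine has run out of RAM.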

And top shows:

top - 06:47:17 up 3 days, 23:02,  3 users,  load average: 1,10, 1,06, 1,01
Tasks: 256 total,   1 running, 255 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4,2 us, 29,6 sy,  3,8 ni, 52,7 id,  9,5 wa,  0,0 hi,  0,2 si,  0,0 st
KiB Mem:   3731512 total,  3599900 used,   131612 free,       48 buffers
KiB Swap:  3898364 total,   425388 used,  3472976 free.  3195904 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
23257 root      20   0  842356  62644   5148 S  98,1  1,7 422:44.47 rmlint

Is there something I can do to debug this?

Thanks a lot in advance,

SeeSpotRun commented 6 years ago

Hmmm annoying, haven't seen anything like that for a while. Can you try with a different hash, to at least rule out xxhash as the cause?

saintger commented 6 years ago

I have tried with murmur and I have the exact same problem. But I made a typo, the subset is roughly 1TB, not 400GB (I don't know if it makes a difference). The process gets stuck with 235MB remaining (and ETA 44 seconds !).

SeeSpotRun commented 6 years ago

Sorry to ask the dumb question, but you're sure you're running the rmlint version referenced in your first post (rmlint --version)? I just ask because I've been caught out before, and the behaviour you are seeing is a bit like an earlier bug that I thought we had ironed out. My other thought, in the interest of excluding a filesystem / OS problem, would be to check that all your files can be read without causing a similar lockup. Maybe something like:

find </mnt/target> -type f -exec dd status=progress if={} of=/dev/null bs=64k \;
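The same read-everything check can be sketched in a single process, which avoids starting one dd per file. This is only an illustrative equivalent of the find/dd command above; `read_all_files` is a made-up name, and the 64 KiB chunk size simply mirrors bs=64k.

```python
import os


def read_all_files(root, chunk_size=64 * 1024):
    """Read every regular file under `root` and discard the data.

    Returns (total_bytes_read, failures), where failures is a list of
    (path, error message) pairs for files that could not be read.
    """
    total = 0
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Mirror `find -type f`: regular files only, no symlinks.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError as exc:
                failures.append((path, str(exc)))
    return total, failures
```

As with dd, a file that makes the read hang would show up as the script stalling rather than as an entry in the failure list.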

saintger commented 6 years ago

Yes, I have deleted everything, made a fresh clone of the development version, compiled it and installed it in my local directory (/root/local). I am launching it as /root/local/bin/rmlint in order to avoid any doubt.

There was no problem reading the files with your command either (though it took close to 8-10 hours to read all of them, probably because 'find' starts a new dd process for each file).

Anything I can do to be sure that it is a bug ?

Thanks

SeeSpotRun commented 6 years ago

If you're willing to put up with having your console spammed, you could compile with an extra debugging option. In lib/shredder.c, change

#define _RM_SHRED_DEBUG 0

to

#define _RM_SHRED_DEBUG 1

Then compile and re-run with debug logging on (-vv) and once it finally hangs the last few debug messages might give us a clue.

saintger commented 6 years ago

I let it run with the debug log and it got stuck as planned. The lines are displaying very fast but it seems to be stuck in a loop. Unfortunately as soon as I Ctrl-C, a lot of new lines are being dumped on screen and I don't have the lines related to the loop. I am trying to redirect stdout and stderr in a file to show what is happening.

SeeSpotRun commented 6 years ago

Ctrl-S / Ctrl-Q (pause and resume terminal output) might be easier if the dump file gets too long.

saintger commented 6 years ago

Indeed it is stuck in a loop. The following lines keep repeating:

DEBUG: Offsets match at logical=34865152, physical=3132931919872
DEBUG: Offsets match at logical=107216896, physical=3129837535232
DEBUG: Offsets match at logical=192413696, physical=3129978388480
DEBUG: Offsets match at logical=485752832, physical=3130272264192
DEBUG: Offsets match at logical=956825600, physical=3128505163776
DEBUG: Offsets match at logical=1225261056, physical=3128904646656
DEBUG: Offsets match at logical=1966342144, physical=3127377211392
DEBUG: Offsets match at logical=0, physical=3131874770944
DEBUG: Offsets match at logical=34865152, physical=3132931919872
DEBUG: Offsets match at logical=107216896, physical=3129837535232
DEBUG: Offsets match at logical=192413696, physical=3129978388480
DEBUG: Offsets match at logical=485752832, physical=3130272264192
DEBUG: Offsets match at logical=956825600, physical=3128505163776
DEBUG: Offsets match at logical=1225261056, physical=3128904646656
DEBUG: Offsets match at logical=1966342144, physical=3127377211392
DEBUG: Offsets match at logical=0, physical=3131874770944

SeeSpotRun commented 6 years ago

That's actually really helpful. It looks like a potential infinite loop introduced in this commit (https://github.com/sahib/rmlint/commit/ee054972d3c7fa77c75c882b1e28c6f8b20f6466#diff-dea5d1c49fef1a6a3d5fd0912654d1d8R1101). Please try compiling from https://github.com/SeeSpotRun/rmlint/tree/debug_%23299 and re-running. Even if you run without -v | --verbose, you should still get some sort of message about the file that was causing the loop.
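For long debug dumps, a loop like the one posted above can also be confirmed mechanically. The following is a minimal sketch (not part of rmlint; `shortest_period` is a hypothetical helper) that finds the shortest repeating period in a list of log lines. It is fed the logical offsets from the repeated "Offsets match" lines quoted earlier in the thread.

```python
def shortest_period(lines):
    """Return the smallest p with lines[i] == lines[i + p] for every valid i.

    Only periods that repeat at least twice within `lines` are considered;
    returns None when no such period exists.
    """
    count = len(lines)
    for period in range(1, count // 2 + 1):
        if all(lines[i] == lines[i + period] for i in range(count - period)):
            return period
    return None


# Logical offsets from the 16 repeated debug lines in the thread.
offsets = [34865152, 107216896, 192413696, 485752832,
           956825600, 1225261056, 1966342144, 0] * 2
print(shortest_period(offsets))  # prints 8
```

The period of eight lines ends with the logical offset returning to 0, which is consistent with a scan that starts over instead of advancing.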

saintger commented 6 years ago

Indeed, now it is finishing with the following message:

DEBUG: Offsets match at logical=1937219584, physical=2468772564992
DEBUG: Offsets match at logical=1977241600, physical=2479638458368
DEBUG: Offsets match at logical=2017263616, physical=2483899695104
DEBUG: Offsets match at logical=2044760064, physical=2484128587776
DEBUG: Offsets match at logical=2071437312, physical=2495333900288
DEBUG: Offsets match at logical=2111459328, physical=2573125619712
WARNING: rm_util_link_type() giving up: logical_next_1 <= logical_current for /home/user/backups/home/user/Images/general/video.MP4 vs /home/user/Images/general/video.MP4
WARNING: Unexpected return code 11 from rm_util_link_type()
DEBUG: Checking link type for /home/user/backups/home/user/Images/record.MP4 vs /home/user/Images/record.MP4
DEBUG: Files differ at offset 0: 4432691789824<> 4433232699392
DEBUG: Freeing device 64769 (pointer 0x1a254f0)
Waiting for progress counters to catch up...Done
DEBUG: Remaining 0 bytes in 0 files

SeeSpotRun commented 6 years ago

Great, thanks. That bit of code is trying to determine whether duplicate files are already clones by looking at their fiemap data, to save time trying to do unnecessary clones during rmlint.sh. In this case the video.MP4 files are returning some fiemap data which is confusing my matching algorithm. Worst case this will mean we might fail to detect some existing clones, in which case rmlint.sh will try to clone them again. But it would be nice to understand what's going on. Would you mind posting the results of the following two commands?

filefrag -v /home/user/backups/home/user/Images/general/video.MP4
filefrag -v /home/user/Images/general/video.MP4
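To illustrate the kind of check described above, here is a simplified sketch of comparing two extent maps, including a guard of the sort that the "giving up" message suggests. It is not rmlint's implementation: `make_lookup` and `link_type` are hypothetical names, extents are modelled as plain (logical, physical, length) byte tuples, and the lookup callables stand in for whatever the fiemap query returns.

```python
def make_lookup(extents):
    """Build a lookup from (logical, physical, length) tuples, all in bytes."""
    def lookup(offset):
        for logical, physical, length in extents:
            if logical <= offset < logical + length:
                # Physical address of `offset`, and where this extent ends.
                return physical + (offset - logical), logical + length
        return None, offset  # nothing mapped at this offset
    return lookup


def link_type(lookup_a, lookup_b, size):
    """Walk both files in step and compare physical addresses."""
    offset = 0
    while offset < size:
        physical_a, next_a = lookup_a(offset)
        physical_b, next_b = lookup_b(offset)
        if physical_a is None or physical_a != physical_b:
            return "differ"
        next_offset = min(next_a, next_b)
        if next_offset <= offset:
            # Guard: without it, data that never advances would loop forever.
            return "gave-up"
        offset = next_offset
    return "same-extents"
```

For example, a lookup that always reports the next logical offset as 0, such as `lambda offset: (4096, 0)`, makes `link_type` return "gave-up" instead of spinning at 100% CPU.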

saintger commented 6 years ago

filefrag -v "/home/user/backups/home/user/Images/general/video.MP4"
Filesystem type is: 9123683e
File size of  /home/user/backups/home/user/Images/general/video.MP4 is 2724300086 (665113 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..    4008: 1067033652..1067037660:   4009:             shared
   1:     4009..    8259: 1067239186..1067243436:   4251: 1067037661: shared
   2:     8260..   12754: 1067362352..1067366846:   4495: 1067243437: shared
   3:    12755..   17490:  496547057.. 496551792:   4736: 1067366847: shared
   4:    17491..   21309:  497546482.. 497550300:   3819:  496551793: shared
   5:    21310..   27038:  501533251.. 501538979:   5729:  497550301: shared
   6:    27039..   32767:  503566283.. 503572011:   5729:  501538980: shared
   7:    32768..   36828:  503830144.. 503834204:   4061:  503572012: shared
   8:    36829..   41140:  504047134.. 504051445:   4312:  503834205: shared
   9:    41141..   45696:  504497651.. 504502206:   4556:  504051446: shared
  10:    45697..   50472:  505412084.. 505416859:   4776:  504502207: shared
  11:    50473..   54237:  505556250.. 505560014:   3765:  505416860: shared
  12:    54238..   59886:  506079480.. 506085128:   5649:  505560015: shared
  13:    59887..   65535:  508960720.. 508966368:   5649:  506085129: shared
  14:    65536..   69544:  509044651.. 509048659:   4009:  508966369: shared
  15:    69545..   73795:  511698929.. 511703179:   4251:  509048660: shared
  16:    73796..   78290:  517732924.. 517737418:   4495:  511703180: shared
  17:    78291..   83026:  549108733.. 549113468:   4736:  517737419: shared
  18:    83027..   86845:  549447765.. 549451583:   3819:  549113469: shared
  19:    86846..   92574:  553184390.. 553190118:   5729:  549451584: shared
  20:    92575..   98303:  553480448.. 553486176:   5729:  553190119: shared
  21:    98304..  102364:  553472640.. 553476700:   4061:  553486177: shared
  22:   102365..  109525:  558838217.. 558845377:   7161:  553476701: shared
  23:   109526..  116687:  602642352.. 602649513:   7162:  558845378: shared
  24:   116688..  120999:  602798096.. 602802407:   4312:  602649514: shared
  25:   121000..  125555:  602843920.. 602848475:   4556:  602802408: shared
  26:   125556..  130331:  603133231.. 603138006:   4776:  602848476: shared
  27:   130332..  135112:  605374482.. 605379262:   4781:  603138007: shared
  28:   135113..  142283:  725263166.. 725270336:   7171:  605379263: shared
  29:   142284..  149455:  929776853.. 929784024:   7172:  725270337: shared
  30:   149456..  154473:  929791968.. 929796985:   5018:  929784025: shared
  31:   154474..  159723:  929988225.. 929993474:   5250:  929796986: shared
  32:   159724..  165228:  930179927.. 930185431:   5505:  929993475: shared
  33:   165229..  169476:  930186569.. 930190816:   4248:  930185432: shared
  34:   169477..  175849: 1056445626..1056451998:   6373:  930190817: shared
  35:   175850..  182223:  995467795.. 995474168:   6374: 1056451999: shared
  36:   182224..  186718:  995609689.. 995614183:   4495:  995474169: shared
  37:   186719..  191454:  998506023.. 998510758:   4736:  995614184: shared
  38:   191455..  196424: 1001560036..1001565005:   4970:  998510759: shared
  39:   196425..  201065: 1001739956..1001744596:   4641: 1001565006: shared
  40:   201066..  208028:  558349465.. 558356427:   6963: 1001744597: shared
  41:   208029..  214991:  559342433.. 559349395:   6963:  558356428: shared
  42:   214992..  219830:  559352473.. 559357311:   4839:  559349396: shared
  43:   219831..  224919:  559435417.. 559440505:   5089:  559357312: shared
  44:   224920..  229861:  559470307.. 559475248:   4942:  559440506: shared
  45:   229862..  233567:  559484558.. 559488263:   3706:  559475249: shared
  46:   233568..  239127:  559538848.. 559544407:   5560:  559488264: shared
  47:   239128..  244687:  559696449.. 559702008:   5560:  559544408: shared
  48:   244688..  252879:  562994081.. 563002272:   8192:  559702009: shared
  49:   252880..  265167:  873573198.. 873585485:  12288:  563002273: shared
  50:   265168..  277455:  909563088.. 909575375:  12288:  873585486: shared
  51:   277456..  285647:  916345199.. 916353390:   8192:  909575376: shared
  52:   285648..  297935:  928939281.. 928951568:  12288:  916353391: shared
  53:   297936..  310223: 1058731595..1058743882:  12288:  928951569: shared
  54:   310224..  317903: 1055635806..1055643485:   7680: 1058743883: shared
  55:   317904..  329423:  999532801.. 999544320:  11520: 1055643486: shared
  56:   329424..  340943: 1000073292..1000084811:  11520:  999544321: shared
  57:   340944..  348860: 1023649095..1023657011:   7917: 1000084812: shared
  58:   348861..  355072: 1027520531..1027526742:   6212: 1023657012: shared
  59:   355073..  364391: 1029848181..1029857499:   9319: 1027526743: shared
  60:   364392..  373711: 1032270425..1032279744:   9320: 1029857500: shared
  61:   373712..  380113: 1034793984..1034800385:   6402: 1032279745: shared
  62:   380114..  386704: 1035799834..1035806424:   6591: 1034800386: shared
  63:   386705..  396591: 1045668147..1045678033:   9887: 1035806425: shared
  64:   396592..  406479: 1048292951..1048302838:   9888: 1045678034: shared
  65:   406480..  413295: 1049763314..1049770129:   6816: 1048302839: shared
  66:   413296..  419783: 1059431023..1059437510:   6488: 1049770130: shared
  67:   419784..  429515: 1067017873..1067027604:   9732: 1059437511: shared
  68:   429516..  439247:  553453696.. 553463427:   9732: 1067027605: shared
  69:   439248..  449487:  554902864.. 554913103:  10240:  553463428: shared
  70:   449488..  459727:  574666118.. 574676357:  10240:  554913104: shared
  71:   459728..  466440:  575228122.. 575234834:   6713:  574676358: shared
  72:   466441..  472953:  575536781.. 575543293:   6513:  575234835: shared
  73:   472954..  482724:  602727677.. 602737447:   9771:  575543294: shared
  74:   482725..  492495:  605380483.. 605390253:   9771:  602737448: shared
  75:   492496..  499208:  606420824.. 606427536:   6713:  605390254: shared
  76:   499209..  505721:  606476706.. 606483218:   6513:  606427537: shared
  77:   505722..  515492:  609212378.. 609222148:   9771:  606483219: shared
  78:   515493..  525263:  628204497.. 628214267:   9771:  609222149: shared
  79:   525264..  528383:  633099327.. 633102446:   3120:  628214268: shared
  80:   528384..  531976:  633102447.. 633106039:   3593:             shared
  81:   531977..  532479:  633231038.. 633231540:    503:  633106040: shared
  82:   532480..  536575:  633231541.. 633235636:   4096:             shared
  83:   536576..  537477:  633235637.. 633236538:    902:             shared
  84:   537478..  540671:  634162521.. 634165714:   3194:  633236539: shared
  85:   540672..  544767:  634165715.. 634169810:   4096:             shared
  86:   544768..  545730:  634169811.. 634170773:    963:             shared
  87:   545731..  548863:  634488091.. 634491223:   3133:  634170774: shared
  88:   548864..  552959:  634491224.. 634495319:   4096:             shared
  89:   552960..  553983:  634495320.. 634496343:   1024:             shared
  90:   553984..  554159:  634496344.. 634496519:    176:             shared
  91:   554160..  557055:  634676335.. 634679230:   2896:  634496520: shared
  92:   557056..  559872:  634679231.. 634682047:   2817:             shared
  93:   559873..  561151:  635689972.. 635691250:   1279:  634682048: shared
  94:   561152..  565247:  635691251.. 635695346:   4096:             shared
  95:   565248..  565743:  635695347.. 635695842:    496:             shared
  96:   565744..  569343:  872441080.. 872444679:   3600:  635695843: shared
  97:   569344..  573439:  872444680.. 872448775:   4096:             shared
  98:   573440..  576335:  872448776.. 872451671:   2896:             shared
  99:   576336..  577535:  929457967.. 929459166:   1200:  872451672: shared
 100:   577536..  581631:  929459167.. 929463262:   4096:             shared
 101:   581632..  585727:  929463263.. 929467358:   4096:             shared
 102:   585728..  586927:  929467359.. 929468558:   1200:             shared
 103:   586928..  589823:  930210626.. 930213521:   2896:  929468559: shared
 104:   589824..  593019:  930213522.. 930216717:   3196:             shared
 105:   593020..  593919:  930526145.. 930527044:    900:  930216718: shared
 106:   593920..  598015:  930527045.. 930531140:   4096:             shared
 107:   598016..  599316:  930531141.. 930532441:   1301:             shared
 108:   599317..  602111:  790031272.. 790034066:   2795:  930532442: shared
 109:   602112..  606207:  790034067.. 790038162:   4096:             shared
 110:   606208..  609505:  790038163.. 790041460:   3298:             shared
 111:   609506..  610303:  872728350.. 872729147:    798:  790041461: shared
 112:   610304..  614399:  872729148.. 872733243:   4096:             shared
 113:   614400..  614600:  872733244.. 872733444:    201:             shared
 114:   614601..  618495:  872735523.. 872739417:   3895:  872733445: shared
 115:   618496..  619695:  872739418.. 872740617:   1200:             shared
 116:   619696..  622591:  873435929.. 873438824:   2896:  872740618: shared
 117:   622592..  624999:  873438825.. 873441232:   2408:             shared
 118:   625000..  626687:  874723310.. 874724997:   1688:  873441233: shared
 119:   626688..  630783:  874724998.. 874729093:   4096:             shared
 120:   630784..  632955:  874729094.. 874731265:   2172:             shared
 121:   632956..  634879:  874745838.. 874747761:   1924:  874731266: shared
 122:   634880..  638975:  874747762.. 874751857:   4096:             shared
 123:   638976..  640911:  874751858.. 874753793:   1936:             shared
 124:   640912..  643071:  874734766.. 874736925:   2160:  874753794: shared
 125:   643072..  646416:  874736926.. 874740270:   3345:             shared
 126:   646417..  647167:  883143018.. 883143768:    751:  874740271: shared
 127:   647168..  651263:  883143769.. 883147864:   4096:             shared
 128:   651264..  655359:  883147865.. 883151960:   4096:             shared
 129:   655360..  655764:  883151961.. 883152365:    405:             shared
 130:   655765..  659455:  898588849.. 898592539:   3691:  883152366: shared
 131:   659456..  663551:  898592540.. 898596635:   4096:             shared
 132:   663552..  665112:  898596636.. 898598196:   1561:             last,shared,eof
/home/user/backups/home/user/Images/general/video.MP4: 98 extents found
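The listing above has 133 rows but the summary reports 98 extents. This appears to be because rows with an empty "expected" column start exactly where the previous row ends on disk, so they are not counted as a new extent. A small sketch for checking this on saved filefrag -v output is below. The regular expression and function names are assumptions written for this output format, not part of filefrag or rmlint.

```python
import re

# ext: logical_start..logical_end: physical_start..physical_end: length: [expected:] [flags]
ROW = re.compile(
    r"^\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+):"
    r"(?:\s*(\d+):)?\s*(\S*)\s*$"
)


def parse_extents(text):
    """Return (logical_start, logical_end, physical_start, physical_end) per row, in blocks."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header and summary lines do not match and are skipped
            rows.append(tuple(int(match.group(i)) for i in range(2, 6)))
    return rows


def count_physical_runs(rows):
    """Count rows that do not start right after the previous row's physical end."""
    runs = 0
    previous_end = None
    for _lstart, _lend, physical_start, physical_end in rows:
        if previous_end is None or physical_start != previous_end + 1:
            runs += 1
        previous_end = physical_end
    return runs


# Rows 79 to 82 of the listing above.
SAMPLE = """\
  79:   525264..  528383:  633099327.. 633102446:   3120:  628214268: shared
  80:   528384..  531976:  633102447.. 633106039:   3593:             shared
  81:   531977..  532479:  633231038.. 633231540:    503:  633106040: shared
  82:   532480..  536575:  633231541.. 633235636:   4096:             shared
"""
rows = parse_extents(SAMPLE)
print(len(rows), count_physical_runs(rows))  # prints 4 2
```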

filefrag -v "/home/user/Images/general/video.MP4"
Filesystem type is: 9123683e
File size of /home/user/Images/general/video.MP4 is 18039872299 (4404266 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..   65535: 1082332202..1082397737:  65536:            
   1:    65536..   98303: 1075336192..1075368959:  32768: 1082397738:
   2:    98304..  233855: 1079268352..1079403903: 135552: 1075368960:
   3:   233856..  266623: 1079431872..1079464639:  32768: 1079403904:
   4:   266624..  299391: 1079464640..1079497407:  32768:            
   5:   299392..  332159: 1079792640..1079825407:  32768: 1079497408:
   6:   332160..  336063: 1079825408..1079829311:   3904:            
   7:   336064..  336127: 1079829312..1079829375:     64:            
   8:   336128..  368895: 1079829376..1079862143:  32768:            
   9:   368896..  602367: 1082414080..1082647551: 233472: 1079862144:
  10:   602368..  635135: 1079862144..1079894911:  32768: 1082647552:
  11:   635136..  667903: 1079894912..1079927679:  32768:            
  12:   667904..  686463: 1079927680..1079946239:  18560:            
  13:   686464..  687423: 1079946240..1079947199:    960:            
  14:   687424..  687615: 1079947200..1079947391:    192:            
  15:   687616..  720383: 1079947392..1079980159:  32768:            
  16:   720384..  753151: 1079980160..1080012927:  32768:            
  17:   753152..  785919: 1080012928..1080045695:  32768:            
  18:   785920..  818687: 1080316928..1080349695:  32768: 1080045696:
  19:   818688..  851455: 1080349696..1080382463:  32768:            
  20:   851456..  854335: 1080382464..1080385343:   2880:            
  21:   854336..  887103: 1080385344..1080418111:  32768:            
  22:   887104..  919871: 1082676224..1082708991:  32768: 1080418112:
  23:   919872..  952639: 1082708992..1082741759:  32768:            
  24:   952640..  985407: 1082741760..1082774527:  32768:            
  25:   985408.. 1018175: 1082774528..1082807295:  32768:            
  26:  1018176.. 1021311: 1082807296..1082810431:   3136:            
  27:  1021312.. 1021375: 1082810432..1082810495:     64:            
  28:  1021376.. 1054143: 1082810496..1082843263:  32768:            
  29:  1054144.. 1086911: 1082843264..1082876031:  32768:            
  30:  1086912.. 1119679: 1082876032..1082908799:  32768:            
  31:  1119680.. 1152447: 1082938368..1082971135:  32768: 1082908800:
  32:  1152448.. 1185215: 1082971136..1083003903:  32768:            
  33:  1185216.. 1197567: 1083003904..1083016255:  12352:            
  34:  1197568.. 1198143: 1083016256..1083016831:    576:            
  35:  1198144.. 1198655: 1083016832..1083017343:    512:            
  36:  1198656.. 1198783: 1083017344..1083017471:    128:            
  37:  1198784.. 1231551: 1083018368..1083051135:  32768: 1083017472:
  38:  1231552.. 1264319: 1083051136..1083083903:  32768:            
  39:  1264320.. 1297087: 1083083904..1083116671:  32768:            
  40:  1297088.. 1329855: 1083116672..1083149439:  32768:            
  41:  1329856.. 1362623: 1083149440..1083182207:  32768:            
  42:  1362624.. 1367743: 1083182208..1083187327:   5120:            
  43:  1367744.. 1367807: 1083187328..1083187391:     64:            
  44:  1367808.. 1400575: 1075368960..1075401727:  32768: 1083187392:
  45:  1400576.. 1433343: 1075401728..1075434495:  32768:            
  46:  1433344.. 1466111: 1075434496..1075467263:  32768:            
  47:  1466112.. 1498879: 1075467264..1075500031:  32768:            
  48:  1498880.. 1531647: 1075500032..1075532799:  32768:            
  49:  1531648.. 1564415: 1075555104..1075587871:  32768: 1075532800:
  50:  1564416.. 1597183: 1075598336..1075631103:  32768: 1075587872:
  51:  1597184.. 1623935: 1075631104..1075657855:  26752:            
  52:  1623936.. 1625215: 1075657856..1075659135:   1280:            
  53:  1625216.. 1657983: 1075659136..1075691903:  32768:            
  54:  1657984.. 1690751: 1076122624..1076155391:  32768: 1075691904:
  55:  1690752.. 1723519: 1076155392..1076188159:  32768:            
  56:  1723520.. 1756287: 1076212680..1076245447:  32768: 1076188160:
  57:  1756288.. 1789055: 1076245448..1076278215:  32768:            
  58:  1789056.. 1821823: 1080418112..1080450879:  32768: 1076278216:
  59:  1821824.. 1854591: 1080451776..1080484543:  32768: 1080450880:
  60:  1854592.. 1887359: 1080484672..1080517439:  32768: 1080484544:
  61:  1887360.. 1907711: 1080517440..1080537791:  20352:            
  62:  1907712.. 1940479: 1080537792..1080570559:  32768:            
  63:  1940480.. 1973247:  651276330.. 651309097:  32768: 1080570560:
  64:  1973248.. 2006015: 1076278216..1076310983:  32768:  651309098:
  65:  2006016.. 2038783: 1076310984..1076343751:  32768:            
  66:  2038784.. 2071551: 1076343752..1076376519:  32768:            
  67:  2071552.. 2326847: 1083200512..1083455807: 255296: 1076376520:
  68:  2326848.. 2359615: 1083462656..1083495423:  32768: 1083455808:
  69:  2359616.. 2392383: 1083495424..1083528191:  32768:            
  70:  2392384.. 2425151: 1083528192..1083560959:  32768:            
  71:  2425152.. 2457919: 1083560960..1083593727:  32768:            
  72:  2457920.. 2490111: 1083593728..1083625919:  32192:            
  73:  2490112.. 2522879: 1083626560..1083659327:  32768: 1083625920:
  74:  2522880.. 2555647: 1083659328..1083692095:  32768:            
  75:  2555648.. 2564863: 1083692096..1083701311:   9216:            
  76:  2564864.. 2565759: 1083701312..1083702207:    896:            
  77:  2565760.. 2815743: 1083724800..1083974783: 249984: 1083702208:
  78:  2815744.. 2829695:  479907141.. 479921092:  13952: 1083974784:
  79:  2829696.. 2830143:  479728642.. 479729089:    448:  479921093:
  80:  2830144.. 2831295:  479730085.. 479731236:   1152:  479729090:
  81:  2831296.. 2864063: 1072190464..1072223231:  32768:  479731237:
  82:  2864064.. 2896831: 1072223232..1072255999:  32768:            
  83:  2896832.. 2929599: 1072256000..1072288767:  32768:            
  84:  2929600.. 2962367: 1072299937..1072332704:  32768: 1072288768:
  85:  2962368.. 2995135: 1072332836..1072365603:  32768: 1072332705:
  86:  2995136.. 3027903: 1081103360..1081136127:  32768: 1072365604:
  87:  3027904.. 3060671: 1081136128..1081168895:  32768:            
  88:  3060672.. 3093439: 1081168896..1081201663:  32768:            
  89:  3093440.. 3113535: 1081201664..1081221759:  20096:            
  90:  3113536.. 3114175: 1081221760..1081222399:    640:            
  91:  3114176.. 3114303: 1081222400..1081222527:    128:            
  92:  3114304.. 3147071: 1081235008..1081267775:  32768: 1081222528:
  93:  3147072.. 3179839: 1081267776..1081300543:  32768:            
  94:  3179840.. 3211839: 1081300544..1081332543:  32000:            
  95:  3211840.. 3244607: 1081332544..1081365311:  32768:            
  96:  3244608.. 3277375: 1083986944..1084019711:  32768: 1081365312:
  97:  3277376.. 3287359: 1084019712..1084029695:   9984:            
  98:  3287360.. 3288063: 1084029696..1084030399:    704:            
  99:  3288064.. 3288447: 1084030400..1084030783:    384:            
 100:  3288448.. 3288575: 1084030784..1084030911:    128:            
 101:  3288576.. 3289983: 1084030912..1084032319:   1408:            
 102:  3289984.. 3322751: 1084033152..1084065919:  32768: 1084032320:
 103:  3322752.. 3355519: 1084065920..1084098687:  32768:            
 104:  3355520.. 3368127: 1084098688..1084111295:  12608:            
 105:  3368128.. 3368255: 1084111296..1084111423:    128:            
 106:  3368256.. 3368383: 1084111424..1084111551:    128:            
 107:  3368384.. 3368447: 1084111552..1084111615:     64:            
 108:  3368448.. 3368703: 1084111616..1084111871:    256:            
 109:  3368704.. 3368895: 1084111872..1084112063:    192:            
 110:  3368896.. 3369087: 1084112064..1084112255:    192:            
 111:  3369088.. 3369343: 1084112256..1084112511:    256:            
 112:  3369344.. 3369407: 1084112512..1084112575:     64:            
 113:  3369408.. 3369727: 1084112576..1084112895:    320:            
 114:  3369728.. 3369791: 1084112896..1084112959:     64:            
 115:  3369792.. 3370111: 1084112960..1084113279:    320:            
 116:  3370112.. 3370175: 1084113280..1084113343:     64:            
 117:  3370176.. 3402943: 1084113472..1084146239:  32768: 1084113344:
 118:  3402944.. 3435711: 1084146240..1084179007:  32768:            
 119:  3435712.. 3468479: 1084179008..1084211775:  32768:            
 120:  3468480.. 3501247: 1084211776..1084244543:  32768:            
 121:  3501248.. 3520703:  491033978.. 491053433:  19456: 1084244544:
 122:  3520704.. 3521343:  491053434.. 491054073:    640:            
 123:  3521344.. 3521407:  491054074.. 491054137:     64:            
 124:  3521408.. 3521535:  491054138.. 491054265:    128:            
 125:  3521536.. 3521663:  491054266.. 491054393:    128:            
 126:  3521664.. 3521727:  491054394.. 491054457:     64:            
 127:  3521728.. 3521851:  491054458.. 491054581:    124:            
 128:  3521852.. 3521983:  491054582.. 491054713:    132:            
 129:  3521984.. 3522047:  491054714.. 491054777:     64:            
 130:  3522048.. 3554815: 1072365604..1072398371:  32768:  491054778:
 131:  3554816.. 3587583: 1072398372..1072431139:  32768:            
 132:  3587584.. 3620351: 1077433344..1077466111:  32768: 1072431140:
 133:  3620352.. 3653119: 1077466112..1077498879:  32768:            
 134:  3653120.. 3685887: 1077498880..1077531647:  32768:            
 135:  3685888.. 3690367: 1077531648..1077536127:   4480:            
 136:  3690368.. 3723135: 1077536128..1077568895:  32768:            
 137:  3723136.. 3755903: 1077568896..1077601663:  32768:            
 138:  3755904.. 3788671: 1077622700..1077655467:  32768: 1077601664:
 139:  3788672.. 3821439: 1077655468..1077688235:  32768:            
 140:  3821440.. 3854207: 1084249088..1084281855:  32768: 1077688236:
 141:  3854208.. 3886975: 1084281856..1084314623:  32768:            
 142:  3886976.. 3919743: 1084314624..1084347391:  32768:            
 143:  3919744.. 3952511: 1084347392..1084380159:  32768:            
 144:  3952512.. 3958079: 1084380160..1084385727:   5568:            
 145:  3958080.. 3958335: 1084385728..1084385983:    256:            
 146:  3958336.. 3958399: 1084385984..1084386047:     64:            
 147:  3958400.. 3958463: 1084386048..1084386111:     64:            
 148:  3958464.. 3958847: 1084386112..1084386495:    384:            
 149:  3958848.. 3959039: 1084386496..1084386687:    192:            
 150:  3959040.. 3959463: 1084386688..1084387111:    424:            
 151:  3959464.. 3959487: 1084387112..1084387135:     24:            
 152:  3959488.. 3959679: 1084387136..1084387327:    192:            
 153:  3959680.. 3960191: 1084387328..1084387839:    512:            
 154:  3960192.. 3960255: 1084387840..1084387903:     64:            
 155:  3960256.. 3960319: 1084387904..1084387967:     64:            
 156:  3960320.. 3960383: 1084387968..1084388031:     64:            
 157:  3960384.. 3960575: 1084388032..1084388223:    192:            
 158:  3960576.. 3960639: 1084388224..1084388287:     64:            
 159:  3960640.. 3960703: 1084388288..1084388351:     64:            
 160:  3960704.. 3960831: 1084388352..1084388479:    128:            
 161:  3960832.. 3961343: 1084388480..1084388991:    512:            
 162:  3961344.. 3961471: 1084388992..1084389119:    128:            
 163:  3961472.. 3962111: 1084389120..1084389759:    640:            
 164:  3962112.. 3962175: 1084389760..1084389823:     64:            
 165:  3962176.. 3962239: 1084389824..1084389887:     64:            
 166:  3962240.. 3962303: 1084389888..1084389951:     64:            
 167:  3962304.. 3962367: 1084389952..1084390015:     64:            
 168:  3962368.. 3962623: 1084390016..1084390271:    256:            
 169:  3962624.. 3962687: 1084390272..1084390335:     64:            
 170:  3962688.. 3962815: 1084390336..1084390463:    128:            
 171:  3962816.. 3963007: 1084390464..1084390655:    192:            
 172:  3963008.. 3963071: 1084390656..1084390719:     64:            
 173:  3963072.. 3963199: 1084390720..1084390847:    128:            
 174:  3963200.. 3963263: 1084390848..1084390911:     64:            
 175:  3963264.. 3963327: 1084390912..1084390975:     64:            
 176:  3963328.. 3964159: 1084390976..1084391807:    832:            
 177:  3964160.. 3964351: 1084391808..1084391999:    192:            
 178:  3964352.. 3964415: 1084392000..1084392063:     64:            
 179:  3964416.. 3964479: 1084392064..1084392127:     64:            
 180:  3964480.. 3964543: 1084392128..1084392191:     64:            
 181:  3964544.. 3964607: 1084392192..1084392255:     64:            
 182:  3964608.. 3964671: 1084392256..1084392319:     64:            
 183:  3964672.. 3964735: 1084392320..1084392383:     64:            
 184:  3964736.. 3964799: 1084392384..1084392447:     64:            
 185:  3964800.. 3964863: 1084392448..1084392511:     64:            
 186:  3964864.. 3964991: 1084392512..1084392639:    128:            
 187:  3964992.. 3965055: 1084392640..1084392703:     64:            
 188:  3965056.. 3965119: 1084392704..1084392767:     64:            
 189:  3965120.. 3966399: 1084392768..1084394047:   1280:            
 190:  3966400.. 3966527: 1084394048..1084394175:    128:            
 191:  3966528.. 3966591: 1084394176..1084394239:     64:            
 192:  3966592.. 3967231: 1084394240..1084394879:    640:            
 193:  3967232.. 3967295: 1084394880..1084394943:     64:            
 194:  3967296.. 3967871: 1084394944..1084395519:    576:            
 195:  3967872.. 4000639: 1084396608..1084429375:  32768: 1084395520:
 196:  4000640.. 4033407: 1084429376..1084462143:  32768:            
 197:  4033408.. 4066175: 1084462144..1084494911:  32768:            
 198:  4066176.. 4098943: 1072976896..1073009663:  32768: 1084494912:
 199:  4098944.. 4131711: 1073010176..1073042943:  32768: 1073009664:
 200:  4131712.. 4164479: 1073042944..1073075711:  32768:            
 201:  4164480.. 4197247: 1073075712..1073108479:  32768:            
 202:  4197248.. 4230015: 1073108480..1073141247:  32768:            
 203:  4230016.. 4260543: 1073141248..1073171775:  30528:            
 204:  4260544.. 4261823: 1073171776..1073173055:   1280:            
 205:  4261824.. 4261887: 1073009664..1073009727:     64: 1073173056:
 206:  4261888.. 4294655: 1084511232..1084543999:  32768: 1073009728:
 207:  4294656.. 4327423: 1084544000..1084576767:  32768:            
 208:  4327424.. 4355391: 1084576768..1084604735:  27968:            
 209:  4355392.. 4356927: 1084604736..1084606271:   1536:            
 210:  4356928.. 4357183: 1084606272..1084606527:    256:            
 211:  4357184.. 4389951: 1084607169..1084639936:  32768: 1084606528:
 212:  4389952.. 4404264: 1084639937..1084654249:  14313:            
 213:  4404265.. 4404265: 1084606528..1084606528:      1: 1084654250: last,eof
/home/user/Images/general/video.MP4: 48 extents found
SeeSpotRun commented 6 years ago

Wow, that's an ugly fiemap for the first file: all the extents are odd sizes. Anyway, it's not a clone of the second file, so we are getting the correct result. I'll tidy up the code a bit and submit a PR; meanwhile the version linked above should be safe to use.

saintger commented 6 years ago

Normally the files should be identical (and so clones of each other). As the name suggests, one is the backup/copy of the other (made with the Burp backup software). I'll check later today whether the files are really different (binary diff). I am not familiar with fiemap and extents, so I don't understand the implications or why it means that the files are not clones of each other. However, I can check with the author of Burp what exactly his software does when it copies or backs up files.

Thanks

saintger commented 6 years ago

Sorry, I was mistaken: the files are completely different (as their different sizes suggest). What could be the reason for the odd extent sizes? The file is generated by a Samsung camera.

Thanks !

SeeSpotRun commented 6 years ago

Files can be identical (referencing identical data stored in different disk locations) without being clones (referencing the same disk locations).
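To make the clone/identical distinction concrete, here is a minimal Python sketch that compares the physical block ranges of two files. It assumes its inputs are the text of `filefrag -v` output in the column layout shown earlier in this thread; the function names are hypothetical, and this is not how rmlint itself detects clones.

```python
import re

# Matches one extent row of `filefrag -v` output, for example:
#  195:  3967872.. 4000639: 1084396608..1084429375:  32768: 1084395520:
# Groups: logical start, logical end, physical start, physical end, length.
EXTENT_ROW = re.compile(
    r"^\s*\d+:\s*(\d+)\.\.\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+):"
)


def physical_extents(filefrag_text):
    """Return a set of (physical_start, length) tuples parsed from the text."""
    extents = set()
    for line in filefrag_text.splitlines():
        match = EXTENT_ROW.match(line)
        if match:
            extents.add((int(match.group(3)), int(match.group(5))))
    return extents


def share_storage(text_a, text_b):
    """True if any physical block range of one file overlaps one of the other."""
    for start_a, len_a in physical_extents(text_a):
        for start_b, len_b in physical_extents(text_b):
            if start_a < start_b + len_b and start_b < start_a + len_a:
                return True
    return False
```

Two byte-identical files that were copied normally would return False here because their data lives in different disk locations, while a reflinked pair would return True.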

The fragmentation and odd extent sizes probably mean the data was written to disk with small, inefficient buffer sizes, or possibly that the file was edited in situ, rewriting odd-sized blocks of data.

You can 'fix' this by defragmenting the file.

sahib commented 6 years ago

The fix is now in develop. If you can test it there again, it would be nice. I will close the ticket otherwise in a few days.

saintger commented 6 years ago

It works without any problem now. Thanks a lot.

magnetophon commented 3 years ago

Not sure if it's best to open a new issue or to necro this one, but I seem to have the exact same thing with 2.10.1.

It has been in swap for about a day. For the first few hours of that it was still making progress, albeit slower than before. Now it has been totally stuck for a few hours:

rmlint /mnt/bu -p -g -T minimaldirs
▕░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒░░▒▏                                                                                                                                                              Traversing (3314736 usable files / 0 + 0 ignored files / folders)

▕░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▏                                                                       Traversing (3315440 usable files / 0 + 0 ignored files / folders)
▕░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▏                                                                                                                                                            Preprocessing (reduces files to 1863413 / found 1402576 other lint)
▕▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓[3^[i▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▒░░░░░░░░░▏                                                                                                                                     
Matching (732729 dupes of 379340 originals; 183.60 GB to scan in 2224 files, ETA:  6m 20s)

Should I kill it and try find </mnt/target> -type f -exec dd status=progress if={} of=/dev/null bs=64k \;?

Further details:

This is on a server with 64G of RAM, scanning a zfs filesystem with 4.20T used and 2.63T free. rmlint is using 93.9G VIRT and 59.3G RES, according to htop. cpu usage has been around 157% for hours. Total swap usage is 26.2G/128G. It was still making progress when swap was at 12G.

SeeSpotRun commented 3 years ago

Not sure if it's best to open a new issue or to necro this one

Looks like a different issue (different root cause) but I'm happy to discuss it here.

rmlint is using 93.9G VIRT and 59.3G RES, according to htop

The 'paranoid' (-p) hashing algorithm can be very memory-intensive. We try to hash files in an order that minimises disk thrash on spinning media, but that can increase the number of files that are being hashed simultaneously. To manage this, there is a crude memory-limiting algorithm in rmlint here. It's supposed to delay starting too many groups of dupe candidates at the same time, waiting until other groups have finished so that their mem allocation is freed up.
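As an illustration of the idea only (this is not rmlint's actual code, and the class and method names are hypothetical), a memory limiter of this kind can be sketched in Python as a byte budget that makes a new group wait until earlier groups release their share:

```python
import threading


class MemoryBudget:
    """Blocks callers until enough of a fixed byte budget is free."""

    def __init__(self, total_bytes):
        self._total = total_bytes
        self._free = total_bytes
        self._cond = threading.Condition()

    def acquire(self, nbytes):
        """Reserve nbytes, waiting if needed; returns the amount reserved."""
        # Clamp so that a single oversized group can still run on its own.
        nbytes = min(nbytes, self._total)
        with self._cond:
            while self._free < nbytes:
                self._cond.wait()
            self._free -= nbytes
        return nbytes

    def release(self, nbytes):
        """Return nbytes to the budget and wake any waiting groups."""
        with self._cond:
            self._free += nbytes
            self._cond.notify_all()
```

A worker would call `acquire()` before it starts hashing a group of candidates and `release()` with the returned amount once the group is finished. A limiter like this only bounds what it is told about; memory held elsewhere (allocator caches, retained metadata) is outside its view.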

It's possible that either that memory management is not working, or that the earlier mem allocations are not being released effectively here or here. But the 183.60 GB to scan in 2224 files suggests that the rmlint mem limiter is probably not the issue (there are at most 2224 files still being actively hashed; rmlint is not trying to hash hundreds of thousands of files at the same time).

I have a suspicion that glib's GSlice allocator (that we rely on) struggles with large numbers of multi-threaded alloc and free once you get into disk swaps. So even if rmlint is freeing its buffers appropriately, there's a bottleneck in the GSlice system somewhere. I've previously had my system gum right up, similarly to what you're seeing, when doing tests on large synthetic datasets.

TLDR: The -p option you are using on a rather large dataset is pushing the limits of rmlint or GSlice or both, and it's probably not something that will be fixed in the short term.

Alternative: re-run without -p (a 512-bit hash has vanishingly small chance of hash collisions). You can later use the -p option if/when you run the generated shell script.
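For a rough sense of "vanishingly small", the birthday bound can be evaluated directly. This is a back-of-the-envelope Python sketch that assumes the hash behaves like a uniformly random 512-bit function and that nobody is crafting collisions on purpose; the function name is hypothetical.

```python
import math


def log2_collision_bound(num_files, hash_bits):
    """log2 of the birthday upper bound: (n * (n - 1) / 2) / 2**hash_bits."""
    pairs = num_files * (num_files - 1) // 2
    return math.log2(pairs) - hash_bits


# 1863413 is the post-preprocessing file count reported earlier in this thread.
print(log2_collision_bound(1_863_413, 512))
```

The result is about -471, meaning the chance of any accidental collision among those files is at most roughly 2^-471 under these assumptions.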

magnetophon commented 3 years ago

Thanks! I'll try without -p then. Would -s 2048 also make sense for me? Most space is taken by large .wav files.

You can later use the -p option if/when you run the generated shell script.

I don't understand, please explain.

SeeSpotRun commented 3 years ago

Would -s 2048 also make sense for me? Most space is taken by large .wav files

It's probably going to help.

You can later use the -p option if/when you run the generated shell script.

By this I was referring to the -p option for rmlint.sh (the auto-generated cleanup script output from rmlint). If you run ./rmlint.sh -p it will do a bytewise compare of each duplicate before deleting:

usage: ./rmlint.sh OPTIONS

OPTIONS:

  -h   Show this message.
  -d   Do not ask before running.
  -x   Keep rmlint.sh; do not autodelete it.
  -p   Recheck that files are still identical before removing duplicates.
  -r   Allow deduplication of files on read-only btrfs snapshots. (requires sudo)
  -n   Do not perform any modifications, just print what would be done. (implies -d and -x)
  -c   Clean up empty directories while deleting duplicates.
  -q   Do not show progress.
  -k   Keep the timestamp of directories when removing duplicates.
  -i   Ask before deleting each file
SeeSpotRun commented 3 years ago

You could also try the following tweak rmlint --read-buffer-len 127k ...

This will use 127 kbyte slices instead of the default 16k, so there is less mem fragmentation. I suggest 127k rather than 128k because GSlice has some overhead bytes per slice, so 127k slices might align better than 128k.

Then again, it seems GSlice has issues so maybe try

$ G_SLICE=always-malloc rmlint /mnt/bu -p -g -T minimaldirs --read-buffer-len 127k
SeeSpotRun commented 3 years ago

Would -s 2048 also make sense for me?

Actually looking again, you're using -T minimaldirs which means you're looking for identical folders but not individual identical files. With -s 2048, even one small file in a folder will prevent identical folder detection.
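A toy Python model of that interaction, under the simplifying assumption that a folder can only be confirmed as identical when none of its files were excluded by the size filter (the function, folder names, and sizes below are made up for illustration and say nothing about rmlint's internals):

```python
def matchable_dirs(dirs, min_size):
    """Return the names of folders in which no file is below min_size bytes.

    dirs maps a folder name to a {filename: size_in_bytes} dict.
    """
    return sorted(
        name
        for name, files in dirs.items()
        if all(size >= min_size for size in files.values())
    )


example = {
    "project_a": {"take1.wav": 50_000_000, "session.xml": 1500},
    "samples": {"kick.wav": 400_000},
}
print(matchable_dirs(example, 2048))
print(matchable_dirs(example, 0))
```

With the 2048-byte threshold only `samples` remains a candidate, because `project_a` contains one small file; without a threshold both folders remain candidates.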

magnetophon commented 3 years ago

Thanks for the heads up!

magnetophon commented 3 years ago

I'm getting very similar results:

G_SLICE=always-malloc rmlint /mnt/bu -p -g -T minimaldirs --read-buffer-len 127k
▕░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▏                                                                                                                                                              Traversing (3315440 usable files / 0 + 0 ignored files / folders)
▕░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▏                                                                                                                                                            Preprocessing (reduces files to 1863414 / found 1402575 other lint)
▕▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▒░░░░░░░░░▏                                                                                                                                     Matching (732728 dupes of 379340 originals; 183.59 GB to scan in 2225 files, ETA: 14m 45s)

I find it curious that both scans get stuck at 183.6 GB.

SeeSpotRun commented 3 years ago

Is there a reason you are looking for matching dirs? There are a few issues with dir matching (minimaldirs / --merge-directories / -T dd) including the fact that we have to keep all file metadata, paths, checksums etc in ram until after all hashing is finished. If you can run with default lint types or -T minimal that will identify whether it's a dir matching issue. Alternatively try your dd idea to see if there's a corrupt file that's not reading (although the high mem allocation doesn't point to that).
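The read-everything check can also be sketched in Python, which makes it easier to collect the failing paths in one place. This is a minimal sketch with a hypothetical function name; it reads regular files in 64 KiB chunks, skips symlinks and special files, and records any file whose read raises an I/O error.

```python
import os


def find_unreadable(root, chunk_size=64 * 1024):
    """Read every regular file under root; return (path, error) for failures."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets, FIFOs and device nodes.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while handle.read(chunk_size):
                        pass  # Discard the data; only read errors matter.
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures
```

Note that a read which hangs rather than fails will not show up in the returned list; it shows up as the loop never finishing, so printing each path as it is opened may be useful on a suspect disk.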

magnetophon commented 3 years ago

Is there a reason you are looking for matching dirs?

Yes: Over the years, some dirs got copied a couple of times to a couple of harddisks, and I want to:

I expect the last case to happen a lot, because the dirs are mostly ardour projects, which have a lot of big wav files that were all recorded on the same day (later copies of those dirs will have the same wav files), in addition to a few small xml files containing the many hours of mixing work I did on them. Ideally I want to end up with just the latest version of each project, and it's a bit much to do with rmlint by comparing files.

I'll run the file compare anyway and will keep you in the loop.

Thanks!

magnetophon commented 3 years ago

As I went to run the command, I noticed my last incarnation still had -p in it! :facepalm:

I'm now running G_SLICE=always-malloc rmlint /mnt/bu -g -T minimaldirs --read-buffer-len 127k and if that doesn't work, I'll try G_SLICE=always-malloc rmlint /mnt/bu -g -T minimal --read-buffer-len 127k.

magnetophon commented 3 years ago

Sorry for the late update. It now ran and finished in just 7 hours.

SeeSpotRun commented 3 years ago

So minimaldirs without -p was ok?

I'm curious if the inverse is also ok, in which case the likely cause is the cumulative memory drain of paranoid read buffers and treemerge's retained file metadata. If so, we could consider alternatives such as json file caching of the metadata for treemerge.

magnetophon commented 3 years ago

I haven't merged the files yet, so I can try out another command. What exactly would you like me to run?

SeeSpotRun commented 3 years ago

Hi @magnetophon , sorry this slipped under my radar, I think I was off fishing. I presume you have merged the files in the meantime. If yes then please close the issue (Edit: oops it's already closed).

If not, then I was interested to try:

$ rmlint /mnt/bu -g -p

... to determine if the problem was -p alone, or (as I suspect) the combination of -p and -T minimaldirs

magnetophon commented 3 years ago

No problem.

I have not merged the files yet, I lost contact with the machine and will take a look next time I visit my parents.

magnetophon commented 3 years ago

That gives me:

[root@pronix:/home/bart]# rmlint /mnt/bu -g -p
▕░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▏                                                                                                                                                       Traversing (2259625 usable files / 44678 + 4923 ignored files / folders)
▕░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▏                                                                                                                                                            Preprocessing (reduces files to 1205684 / found 1012768 other lint)
▕▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▏                                                                                                                                            Matching (502362 dupes of 260508 originals; 5.41 GB to scan in 171 files, ETA:  9s)

And then it stops moving.

Anything else I should test before I merge the files?

SeeSpotRun commented 3 years ago

Anything else I should test before I merge the files?

If you haven't already, please test with the latest develop version. If it stops moving, can you check what free shows? Edit: also check top to see if it's stuck in an infinite loop or just a threadlock.
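On Linux the same information that free and top display is available under /proc, which makes it easy to log while rmlint runs. The Python sketch below uses hypothetical helper names and parses the text of /proc/meminfo and /proc/<pid>/status. As a rough heuristic only: a process that stays in state R with high CPU is more likely spinning in a loop, while one sitting in S or D with near-zero CPU is more likely waiting on a lock or on I/O.

```python
def parse_kib_fields(meminfo_text):
    """Parse 'Key:   123 kB' lines (as in /proc/meminfo) into {key: kib}."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def swap_used_kib(meminfo_text):
    """Swap in use, computed as SwapTotal - SwapFree (both in KiB)."""
    fields = parse_kib_fields(meminfo_text)
    return fields["SwapTotal"] - fields["SwapFree"]


def process_state(status_text):
    """Return the one-letter state from /proc/<pid>/status text, e.g. 'R'."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split()[1]
    return None
```

Typical use would be `swap_used_kib(open("/proc/meminfo").read())` and `process_state(open(f"/proc/{pid}/status").read())`, where `pid` is the rmlint process ID.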

Thanks.