Open didierga opened 6 years ago
Agreed, the double checking could probably be done more cleverly
I have the same problem.
Right now I'm in the process of sorting about 40 TB of data on spinning rust. fslint is a great help, but for my needs md5+sha1 verification is overkill.
I've created PR https://github.com/pixelb/fslint/pull/145 with a change that allows the user to tune the accuracy/safety of duplicate verification to a suitable level.
As I understand it, fslint currently does a double check for duplicates, running md5sum and then sha1sum to guard against md5 collisions.
This double check is time consuming, and in some cases, depending on the number and the "value" of the files, I would prefer a faster single-check mode with no sha1sum pass.
So I suggest implementing two modes: "Safe", the default, with the double check, and "Fast" with a single check.
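The two proposed modes could be sketched roughly as follows. This is an illustrative Python sketch, not fslint's actual code; the function names and the `mode` parameter are hypothetical:

```python
# Hypothetical sketch of the proposed "Safe" vs "Fast" duplicate check.
# Not fslint's real implementation; names and signatures are illustrative.
import hashlib
from collections import defaultdict

def file_digest(path, algo):
    """Hash a file in chunks to avoid loading it all into memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(paths, mode="safe"):
    """Group candidate duplicate files by content hash.

    mode="fast": single md5 pass (accepts a tiny collision risk).
    mode="safe": md5 pass, then a sha1 re-check within each md5 group.
    """
    groups = defaultdict(list)
    for p in paths:
        groups[file_digest(p, "md5")].append(p)
    if mode == "fast":
        return [g for g in groups.values() if len(g) > 1]
    # Safe mode: only files that already collided on md5 pay for the
    # second sha1 pass, so a collision would need to hit both hashes.
    verified = defaultdict(list)
    for g in groups.values():
        if len(g) < 2:
            continue
        for p in g:
            verified[file_digest(p, "sha1")].append(p)
    return [g for g in verified.values() if len(g) > 1]
```

Note that even in safe mode, only files whose md5 digests already matched are re-hashed with sha1, so on a mostly-unique dataset the second pass costs little; the fast mode mainly helps when there are many true duplicates.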