pauldreik / rdfind

find duplicate files utility

Option to specify specific offset when finding dupes #137

Open negativeExponent opened 1 year ago

negativeExponent commented 1 year ago

For example, skip or ignore the first 100 bytes of each file. This would be useful for files with a fixed-size header or similar.
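To make the request concrete, here is a minimal Python sketch of the idea, not rdfind code and not an existing rdfind option. The function name `digest_after_offset` and the `skip` parameter are hypothetical, chosen only for illustration. It hashes a file while ignoring a fixed number of leading bytes, so two files that differ only in their header produce the same digest.

```python
import hashlib


def digest_after_offset(path, skip=100, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, ignoring its first `skip` bytes.

    Hypothetical helper for illustration; `skip` plays the role of the
    requested offset option.
    """
    h = hashlib.sha1()
    with open(path, "rb") as f:
        # Seeking past the end of a short file is allowed; the reads below
        # then return b"" and the digest is that of empty input.
        f.seek(skip)
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()
```

Note that files shorter than `skip` all hash as empty input in this sketch, so a real implementation would probably want to compare file sizes first.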

chrisulbrich commented 1 year ago

I've got a similar challenge. I have many thousands of raw images (*.ARW) from my Sony camera. In most cases the headers of these files are identical, so the first-bytes scan usually doesn't remove any files from the list. Unfortunately, in most cases the last bytes are identical too. As a result, the checksums of all files usually have to be calculated, which is very expensive.

It would be very helpful to have options that define how many bytes are read for the first-bytes and last-bytes comparisons. I think reading more bytes would be more versatile than using an offset, in case files of other types (sidecar and project files in my case) are mixed in among the large number of image files.
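A rough Python sketch of what configurable first/last byte counts could look like follows. It is an illustration only, not rdfind's implementation. The names `edge_signature`, `candidate_groups`, `first_n`, and `last_n` are assumptions made up for this example. Files are grouped by size plus their leading and trailing bytes, and only groups with more than one member would still need a full checksum.

```python
import os
from collections import defaultdict


def edge_signature(path, first_n=64, last_n=64):
    """Return (size, first bytes, last bytes) for a file.

    `first_n` and `last_n` are hypothetical tunables standing in for the
    requested options.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(first_n)
        # Clamp so files shorter than last_n are read from the start.
        f.seek(max(size - last_n, 0))
        tail = f.read(last_n)
    return size, head, tail


def candidate_groups(paths, first_n=64, last_n=64):
    """Group files whose size, head, and tail all match.

    Only the returned groups (two or more files) would need checksumming.
    """
    groups = defaultdict(list)
    for p in paths:
        groups[edge_signature(p, first_n, last_n)].append(p)
    return [g for g in groups.values() if len(g) > 1]
```

With a window larger than the shared header, files that differ right after the header drop out before any checksum is computed, while unrelated file types in the same directory are unaffected because no bytes are skipped.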