Open kabu0001 opened 4 months ago
Today I found this program and compiled it from source. I was very surprised to see MD5 in use, although a real collision between two files seems quite hard to hit in practice. What is stranger is that the file size is not checked; that seems more critical than the choice of hashing algorithm.
The hash is a fast exclusion feature. It is not used to determine if files are the same; it is used to determine if files are different. A hash collision does not cause a false positive duplicate result.
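To illustrate the "hash as exclusion" idea, here is a minimal Python sketch of the general pattern. It is not this program's actual code: the function names `file_md5` and `find_duplicates` are hypothetical, and the size pre-check is an assumption about how such a tool could be structured. Files are first bucketed by size, then by hash, and only a full byte-by-byte comparison is allowed to declare two files duplicates.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Return groups of paths whose contents are byte-for-byte identical."""
    # Step 1 (assumed): files of different sizes can never be duplicates.
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue

        # Step 2: a differing hash proves the files differ (fast exclusion).
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_md5(p)].append(p)

        # Step 3: a matching hash proves nothing; confirm byte by byte.
        for candidates in by_hash.values():
            remaining = candidates
            while len(remaining) > 1:
                first, rest = remaining[0], remaining[1:]
                same = [first] + [
                    p for p in rest if filecmp.cmp(first, p, shallow=False)
                ]
                if len(same) > 1:
                    groups.append(same)
                remaining = [p for p in rest if p not in same]
    return groups
```

With this structure, an MD5 collision only costs one extra byte-by-byte comparison; it cannot produce a false duplicate, because the final comparison rejects the pair.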
As the title says, I get erratic duplicate results if for some reason (don't ask me how that could happen) two files have the same name and the same MD5 hash but completely different sizes.