elisemercury / Duplicate-Image-Finder

difPy - Python package for finding duplicate or similar images within folders
https://difpy.readthedocs.io
MIT License
449 stars 66 forks

Near duplicate Image detection #29

Closed: dhruvbhatnagar9548 closed this issue 2 years ago

dhruvbhatnagar9548 commented 2 years ago

Hello, first of all, thanks for creating this package. It is a really good package for detecting duplicate images. I have tried it and found that it detects images that are 100% similar, but it does not detect images when the similarity is below 100%, even at 99.99%. I tried adjusting the pixel values and the similarity setting, but it still did not detect them. Is there a way to detect images with a similarity score below 100% using the difPy package?

I have attached a few images that it was not able to detect. Note: the percentage values I have referred to come from the matchTemplate method; the attached images have a similarity of 99%.

[Attached images: TOI_Delhi_12-07-2022_4_1, TOI_Delhi_12-07-2022_4_2, TOI_Delhi_12-07-2022_4_3, TOI_Delhi_12-07-2022_4_7, TOI_Delhi_12-07-2022_4_8]

elisemercury commented 2 years ago

Hi @dhruvbhatnagar9548, thanks a lot for opening the issue and for the feedback! I just tested difPy on your images, and indeed it does not recognize them as duplicates. This is because of the difference in the images' tensors, caused by the frame on some images and by the crop difference. difPy computes the following MSE values for your images ("weareone" = a, "bus" = b):

difPy only considers images similar if they have an MSE (mean squared error) of less than 1000 (with the similarity parameter set to 'low'). Therefore, to solve your issue, you would need to adjust the MSE threshold. You can do this yourself by forking the difPy code and adjusting it, or you can wait for the next difPy release, which will include an option to set the MSE threshold value directly through the similarity parameter.
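To illustrate why a frame or a small crop offset can push two visually near-identical images over the threshold, here is a minimal sketch of an MSE-based comparison. It is not difPy's actual code: the function names `mse` and `is_similar`, the normalization by height times width, and the synthetic test image are all assumptions made for illustration. Only the threshold of 1000 is taken from the explanation above.

```python
import numpy as np


def mse(image_a: np.ndarray, image_b: np.ndarray) -> float:
    """Sum of squared pixel differences divided by height * width.

    This normalization is an assumption for this sketch, not a
    statement about how difPy computes its MSE internally.
    """
    if image_a.shape != image_b.shape:
        raise ValueError("images must have the same shape")
    diff = image_a.astype(np.float64) - image_b.astype(np.float64)
    return float(np.sum(diff ** 2) / (image_a.shape[0] * image_a.shape[1]))


def is_similar(image_a: np.ndarray, image_b: np.ndarray,
               threshold: float = 1000.0) -> bool:
    """Treat two images as similar when their MSE is below the threshold."""
    return mse(image_a, image_b) < threshold


# Synthetic 50x50 RGB image standing in for a real photo (hypothetical data).
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(50, 50, 3), dtype=np.uint8)

# An exact copy has an MSE of 0 and passes any positive threshold.
exact_copy = original.copy()

# Same image with a 5-pixel black frame painted over its border.
framed = original.copy()
framed[:5, :, :] = 0
framed[-5:, :, :] = 0
framed[:, :5, :] = 0
framed[:, -5:, :] = 0

# Same image shifted 3 pixels to the right, mimicking a crop offset.
shifted = np.roll(original, 3, axis=1)

for name, candidate in [("exact_copy", exact_copy),
                        ("framed", framed),
                        ("shifted", shifted)]:
    print(name, round(mse(original, candidate), 1),
          is_similar(original, candidate))
```

Because MSE compares pixels position by position, the frame and the shift both change many pixel values at once, even though a human would still call the images near duplicates. In this sketch, passing a larger `threshold` to `is_similar` plays the role of the threshold adjustment described above.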

I hope this helps!

Again, thanks a lot and all the best, Elise

dhruvbhatnagar9548 commented 2 years ago

Thanks for the reply, @elisemercury. I will try the solution that you have mentioned. Thanks.