elisemercury / Duplicate-Image-Finder

difPy - Python package for finding duplicate and similar images
https://difpy.readthedocs.io
MIT License
466 stars 67 forks

Wrong result on large dataset > 5K images #103

Open nupnik267 opened 2 weeks ago

nupnik267 commented 2 weeks ago

I tried difPy on 13K images and the results are incorrect; on fewer than 5K images it works great. Even though similarity is set to 5 (i.e. the default), search.result contains matches with an MSE above 5 (values such as 100, 101, etc. are appended to it). In addition, images reported with an MSE of 0 are not exact duplicates of each other. I think there is an issue in the mapping and result generation for datasets of more than 5K images.

Note: I am passing multiple directories as a list.
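For anyone trying to reproduce this, a minimal sketch of how the reported matches could be sanity-checked independently of the library. It assumes a match is valid when its MSE is at or below the similarity threshold, and it uses a hypothetical flattened list of (image_a, image_b, mse) tuples rather than the actual search.result layout, which differs between difPy versions. The file names and values are made up for illustration.

```python
def mse(pixels_a, pixels_b):
    """Mean squared error between two equally sized flat sequences of pixel values."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixel values")
    if not pixels_a:
        raise ValueError("images must not be empty")
    return sum((a - b) ** 2 for a, b in zip(pixels_a, pixels_b)) / len(pixels_a)


def out_of_range(matches, threshold=5):
    """Return the reported matches whose MSE exceeds the threshold."""
    return [(a, b, value) for a, b, value in matches if value > threshold]


# Hypothetical flattened matches: (image_a, image_b, reported_mse).
# This is NOT difPy's result format, only an illustration.
reported = [
    ("a.jpg", "b.jpg", 0.0),
    ("a.jpg", "c.jpg", 4.2),
    ("d.jpg", "e.jpg", 101.0),
]

print(out_of_range(reported))  # [('d.jpg', 'e.jpg', 101.0)]
```

Any pair returned by out_of_range should not have been reported at that threshold. Recomputing mse on the decoded pixels of a pair reported with an MSE of 0 would show whether the two files are really identical or whether the match was mapped to the wrong file.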

elisemercury commented 1 week ago

Hi @nupnik267,

Thanks for opening the issue!

There have been some improvements and bug fixes in the algorithm with version 4.1.2. May I ask you to test the new version on the same dataset and see whether the issues still persist?

Thanks, Elise
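Before re-running the test, it may help to confirm which version is actually installed. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes the package is installed under the distribution name difPy.

```python
from importlib import metadata
from itertools import takewhile


def parse_version(text):
    """Turn '4.1.2' into (4, 1, 2), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed, required="4.1.2"):
    """True if the installed version string is not older than the required one."""
    return parse_version(installed) >= parse_version(required)


def installed_difpy_version():
    """Return the installed difPy version string, or None if it is not installed."""
    try:
        return metadata.version("difPy")
    except metadata.PackageNotFoundError:
        return None


version = installed_difpy_version()
if version is None:
    print("difPy is not installed in this environment")
elif is_at_least(version):
    print(f"difPy {version} includes the 4.1.2 changes")
else:
    print(f"difPy {version} is older than 4.1.2; consider upgrading before re-testing")
```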