idealo / imagededup

😎 Finding duplicate images made easy!
https://idealo.github.io/imagededup/
Apache License 2.0
5.18k stars 459 forks

When I use PHash on macOS, it gets stuck. #195

Open code-killerr opened 1 year ago

code-killerr commented 1 year ago

I've been waiting here for a long time, about 5 to 10 minutes, and it is still stuck. [image]

System: macOS, Python: 3.8.16

tanujjain commented 1 year ago

I can reproduce it too. Will have a look in a bit.

Meanwhile, you could make progress by using 'bktree' as the search method like below:

hasher.find_duplicates('/path/to/image/directory', search_method='bktree')

The search will be slower than the Cython brute-force method, but it will work.
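For readers wondering what a BK-tree search does, here is a minimal pure-Python sketch of the idea for integer hashes compared by Hamming distance. It is an illustration only, not imagededup's implementation: the `BKTree` class, the `hamming` helper, and the 16-bit hash values are all hypothetical and made up for this example.

```python
from typing import Dict, List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


class BKTree:
    """Minimal BK-tree over integer hashes (illustrative, not imagededup's code)."""

    def __init__(self) -> None:
        # Each node is (name, hash, children keyed by distance to this node).
        self.root: Optional[Tuple[str, int, Dict[int, tuple]]] = None

    def add(self, name: str, value: int) -> None:
        if self.root is None:
            self.root = (name, value, {})
            return
        node = self.root
        while True:
            dist = hamming(value, node[1])
            child = node[2].get(dist)
            if child is None:
                node[2][dist] = (name, value, {})
                return
            node = child  # Descend along the edge labelled with this distance.

    def search(self, value: int, max_distance: int) -> List[Tuple[str, int]]:
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            name, node_value, children = stack.pop()
            dist = hamming(value, node_value)
            if dist <= max_distance:
                results.append((name, dist))
            # Triangle inequality: matches can only sit under edges whose
            # label lies within max_distance of the current distance.
            for edge, child in children.items():
                if dist - max_distance <= edge <= dist + max_distance:
                    stack.append(child)
        return sorted(results, key=lambda item: (item[1], item[0]))


# Hypothetical 16-bit hashes, invented for illustration.
hashes = {
    "a.png": 0b1111000011110000,
    "a_copy.png": 0b1111000011110001,
    "b.png": 0b0000111100001111,
}
tree = BKTree()
for name, value in hashes.items():
    tree.add(name, value)

matches = tree.search(hashes["a.png"], max_distance=2)
print(matches)
```

The pruning step is what lets the tree skip most candidates. Each lookup still walks Python objects one node at a time, which is a plausible reason for it being slower than a compiled brute-force loop.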

code-killerr commented 1 year ago

OK, it works, thanks. I have one more question, though: when I compare black icons, the results are awful. Different black icons with transparent backgrounds get a score of 0, while icons in other colors are fine. How can I fix this? I used both CNN and PHash, and they give me the same result, for example with these two icons: CaretUp, cart-empty (2).

tanujjain commented 1 year ago

The culprit here is PIL's convert function, which is used to convert your RGBA image to RGB. Both the CNN and the hashing methods rely on it. At this point, I'm afraid it's not possible to get rid of this issue without removing the dependency on PIL (which we wouldn't like to do). Hopefully, it works better on your other images.

code-killerr commented 1 year ago

What a pity. In my work there are a lot of black icons, so I can't skip them. Maybe I can find a way to change the background color before comparing.
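As a rough illustration of why changing the background could help, the sketch below works on single pixels in pure Python. It assumes that transparent pixels in the icons are stored as black with zero alpha, `(0, 0, 0, 0)`, which is common but not guaranteed. The helper names `drop_alpha` and `composite_over` and the two tiny 2x2 "icons" are hypothetical and not part of imagededup or PIL.

```python
from typing import List, Tuple

RGBA = Tuple[int, int, int, int]
RGB = Tuple[int, int, int]


def drop_alpha(pixel: RGBA) -> RGB:
    """Discard the alpha channel, as a plain RGBA-to-RGB conversion does."""
    return pixel[:3]


def composite_over(pixel: RGBA, background: RGB = (255, 255, 255)) -> RGB:
    """Blend one RGBA pixel onto an opaque background colour."""
    alpha = pixel[3] / 255
    return tuple(
        round(c * alpha + b * (1 - alpha)) for c, b in zip(pixel[:3], background)
    )


OPAQUE_BLACK: RGBA = (0, 0, 0, 255)
TRANSPARENT: RGBA = (0, 0, 0, 0)  # Assumption: transparent areas are stored as black.

# Two different 2x2 "icons", flattened row by row (made up for illustration).
icon_a: List[RGBA] = [OPAQUE_BLACK, TRANSPARENT, TRANSPARENT, OPAQUE_BLACK]
icon_b: List[RGBA] = [TRANSPARENT, OPAQUE_BLACK, OPAQUE_BLACK, TRANSPARENT]

# Dropping alpha turns every pixel black, so the two icons become identical.
dropped_a = [drop_alpha(p) for p in icon_a]
dropped_b = [drop_alpha(p) for p in icon_b]
print(dropped_a == dropped_b)

# Compositing over white keeps the shapes distinguishable.
flat_a = [composite_over(p) for p in icon_a]
flat_b = [composite_over(p) for p in icon_b]
print(flat_a == flat_b)
```

On real files, the same idea could be applied with Pillow by pasting each RGBA icon onto a solid canvas created with `Image.new("RGB", size, color)`, using the alpha channel as the mask, and then running imagededup on the flattened copies. This is an untested suggestion, not something confirmed in this thread.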