arsenetar / dupeguru

Find duplicate files
https://dupeguru.voltaicideas.net
GNU General Public License v3.0
5.31k stars, 413 forks

No Duplicates Found #1229

Open crogonint opened 4 months ago

crogonint commented 4 months ago

Describe the bug: My mouth is still hanging open. I don't know what to make of this. I've been using DupeGuru for YEARS, and I've never seen anything like this. One of my favorite content creators updated some of his tokens. The old and new sets of tokens refuse to match, and I can't figure out why. The images have the same dimensions, but the actual content is perhaps 1-2 pixels bigger in the new PNG: https://imgur.com/a/ULJ9477 Even if I set DupeGuru to match images of different sizes, it won't match these. In FACT, if I lower the filter down to 1%, it will match incorrect images in the new folder, but not the old folder. It ought to match the tokens one for one, since they are so similar, but it refuses to match them. WHAT is going on here??

To Reproduce: 1) Check the imgur link. 2) Try to match the images.

Expected behavior: They ought to match, easily.

arsenetar commented 4 months ago

If you overlay the two images in an image editor, you can see the "difference" between them. (Image: Difference) The closer to black an area is, the more alike the two images are there. Several areas show considerable difference, and many areas around the lines are not "the same". Combined, this makes a match unlikely, given how the visual difference is determined.
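To make the "closer to black means more alike" idea concrete, here is a minimal sketch of a per-pixel difference overlay. It is only an illustration, not dupeGuru's actual matching code: the helper names `difference_map` and `mean_difference` are hypothetical, and images are modeled as plain lists of grayscale rows rather than real PNG data. It also shows why thin line art that moves by a single pixel can produce a large difference even though the images have the same dimensions.

```python
# Illustrative sketch only; NOT dupeGuru's implementation.
# Images are modeled as lists of rows of 0-255 grayscale values (an assumption).

def difference_map(a, b):
    """Per-pixel absolute difference; 0 (black) means the pixels are identical."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    return [[abs(pa - pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

def mean_difference(diff):
    """Average brightness of the difference map: 0 (identical) up to 255."""
    total = sum(sum(row) for row in diff)
    count = sum(len(row) for row in diff)
    return total / count

# A 1-pixel-wide black vertical line on white, and the same line moved right by one pixel.
WHITE, BLACK = 255, 0
original = [[BLACK if x == 2 else WHITE for x in range(6)] for _ in range(4)]
shifted = [[BLACK if x == 3 else WHITE for x in range(6)] for _ in range(4)]

diff = difference_map(original, shifted)
for row in diff:
    print(row)
print("mean difference:", mean_difference(diff))
```

In this toy case the line lights up in the difference map at both its old and its new position, so a one-pixel shift counts twice. A comparison built on raw pixel differences would therefore score heavily outlined artwork as quite different after a small resize.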

crogonint commented 4 months ago

Yeah I know, but these two images are at least 70% similar. It blows my mind that DupeGuru will start matching incorrect images in the two sets, but it refuses to match the correct ones. There are 16 separate tokens in the two packs. None of them will match their mate. They will start to match similar images, if I turn the settings down far enough, but I can't get them to match each other.

The artist in question has published around 15,000 images over the last 30 years or so. Some of his packs contain duplicates on purpose. People commonly snag his images and "kitbash" them by making small changes and publishing them online, or even just publish his original works when they should not. Not to mention the dozen or so other token creators who have been around for somewhat fewer years. I really need something I can depend on to find duplicates, as well as close matches. As I said, DupeGuru has been my go-to solution for quite a few years. I don't know where to turn to resolve this new development.

It feels like I'm missing something or doing something wrong. Surely DupeGuru OUGHT to be able to match these, without breaking a sweat, yet it won't. Am I doing something wrong? Have I discovered some tiny little logic hole, where nearly exact images that are 1 or 2 pixels off in content size won't match if they're the same image size? I just don't know what's going on here, precisely.

glubsy commented 4 months ago

The images are clearly duplicates, but one got resized slightly. Dupeguru should definitely match them. There were similar reports in the past, so perhaps there is a bug in the matching algorithm.

@crogonint you might want to try other software like czkawka and see if that helps your use case in the meantime.

crogonint commented 4 months ago

@glubsy I've been going through other titles. As I said, I've depended on DupeGuru for YEARS. I do like bits and pieces of AntiDupl and AllDup. However, I haven't found anything else yet with search parameters as extensive as DupeGuru's. I believe there is one on Linux that's similar to czkawka, but I don't even have Linux installed on my new machine yet.

I'm still shocked that DupeGuru won't detect these as matches. I would have thought that the "different sized images" setting would have grabbed these in an instant. I've found one other utility with a "different sized images" filter, but it has no settings whatsoever with it.

So far, none of the other utilities I have tried will match these, but then again, I haven't found any others that I would expect to. Plus, I'm a bit jaded, I love DupeGuru, and I don't really want to switch to anything else, anyway. :)

PJDude commented 4 months ago

You may try the latest pre-release of Dude. It should identify these files with the default settings.