jhnc / findimagedupes

Finds visually similar or duplicate images
GNU General Public License v3.0

honour EXIF Orientation #3

Open jhnc opened 4 years ago

jhnc commented 4 years ago

Fingerprint generation should take account of EXIF orientation metadata.

graphicsmagick supports auto-orient but not via its perl interface. I have requested an enhancement: https://sourceforge.net/p/graphicsmagick/feature-requests/57/

If the enhancement is not made, a user-contributed patch is available that could be used instead.

Alternatively, imagemagick version 7 provides auto-orient. Given #2 and that imagemagick development seems more active, it may be cleaner to switch back to using imagemagick and drop use of graphicsmagick once distributions start offering im7 (currently most offer im6).
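To make the goal concrete, here is a minimal Python sketch of what "taking account of EXIF orientation" would mean before fingerprinting. It is not findimagedupes code and does not use the graphicsmagick or imagemagick APIs: it works on a plain list-of-rows pixel grid, and the function names (`normalise_orientation` and the helpers) are hypothetical. The mapping follows the standard EXIF Orientation values 1 to 8; reading the tag from a file is assumed to happen elsewhere.

```python
# Sketch only: undo the EXIF Orientation tag on a grid of pixel values,
# so that the fingerprint is computed on the image as it is displayed.

def flip_lr(g):      # mirror left-right
    return [row[::-1] for row in g]

def flip_tb(g):      # mirror top-bottom
    return [list(row) for row in g[::-1]]

def transpose(g):    # mirror across the main diagonal
    return [list(col) for col in zip(*g)]

def rot90_cw(g):     # rotate 90 degrees clockwise
    return [list(col) for col in zip(*g[::-1])]

def rot90_ccw(g):    # rotate 90 degrees counter-clockwise
    return [list(col) for col in zip(*g)][::-1]

def rot180(g):       # rotate 180 degrees
    return [row[::-1] for row in g[::-1]]

def transverse(g):   # mirror across the anti-diagonal
    return rot180(transpose(g))

# EXIF Orientation value -> transform that brings the stored pixels upright.
CORRECTION = {
    2: flip_lr,
    3: rot180,
    4: flip_tb,
    5: transpose,
    6: rot90_cw,
    7: transverse,
    8: rot90_ccw,
}

def normalise_orientation(grid, orientation):
    """Return the grid as it should be displayed.

    Orientation 1 (or any unknown value) leaves the pixels unchanged.
    """
    fix = CORRECTION.get(orientation)
    return fix(grid) if fix else [list(row) for row in grid]
```

With this step in front of fingerprint generation, a photo stored sideways with Orientation 6 and the same photo stored upright with Orientation 1 would produce the same pixel grid, and so the same fingerprint.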

jhnc commented 4 years ago

So far, no response from graphicsmagick maintainers.

dscheffy commented 3 years ago

This is an interesting request, but it actually brings into question what constitutes a duplicate. In the end, that will depend on what the user is trying to do, so it should be user-defined. It sounds like you're suggesting that an image that has been rotated clockwise 90 degrees, but is otherwise exactly the same source image, should be treated as a completely different image. In my case though, I'd actually like to find those duplicates so that I can get rid of the image that has the "wrong" orientation. In some situations though, I'd like to go even further and identify duplicates that were "hard" rotated -- as in the image data was actually rotated and not just the metadata (because depending on how you perform the rotation, the tool might actually re-encode the data in a new transposed matrix).

I'm going through my own last 20 years of backed-up and re-backed-up photos at the moment, so deduplicating them has been on my mind. I was thinking that a basic rotation-invariant option might be nice -- rather than using a 16x16 grid, you could use a 32x32 grid and XOR all of the top-left quadrants of the 4 rotations (as well as the four rotations of a mirror image, even). That would give you the same 256-bit fingerprint -- it would represent something slightly different, but if you're interested in flip/90-degree rotationally invariant matches, it could make sense.
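A rough Python sketch of the XOR idea above, in case it helps the discussion. It assumes the image has already been reduced to a 32x32 grid of 0/1 bits (that step is not shown), and the function names are made up for illustration; none of this is existing findimagedupes code.

```python
# Sketch only: flip/90-degree invariant fingerprint from a square bit grid.

def rot90(grid):
    """Rotate a square grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]

def mirror(grid):
    """Mirror a grid left-right."""
    return [row[::-1] for row in grid]

def invariant_fingerprint(grid):
    """XOR the top-left quadrant of all 8 rotations/reflections of grid.

    For a 32x32 input this returns a 16x16 grid, i.e. 256 bits.
    """
    half = len(grid) // 2
    acc = [[0] * half for _ in range(half)]
    for start in (grid, mirror(grid)):
        g = start
        for _ in range(4):              # the 4 rotations of this variant
            for r in range(half):
                for c in range(half):
                    acc[r][c] ^= g[r][c]
            g = rot90(g)
    return acc
```

The result is the same for any rotated or mirrored copy of the input, because rotating or flipping the input only reorders the same set of 8 variants, and XOR does not care about order. Matching would then compare these 256 bits the same way as the ordinary fingerprint.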