jesjimher / imgdupes

Checks for duplicated images in a directory tree, ignoring metadata
GNU General Public License v3.0

Dupes found are actually the same file #11

Open DrupaListo-com opened 6 years ago

DrupaListo-com commented 6 years ago

I got a result like this:

(all 4 entries are the SAME file)

.... dupes that are ok ...

./2018-09-17/IMG_20180819_193752.jpg 
 ./2018-09-17/IMG_20180819_193752.jpg 
 ./2018-09-17/IMG_20180819_193752.jpg 
 ./2018-09-17/IMG_20180819_193752.jpg 

... some more dupes that are ok ...

I don't know why, but it worked perfectly (it detected all the dupes it should have detected), except that it reported this one file as a dupe of itself... weird.

DrupaListo-com commented 6 years ago

The worst part is what happens if one parses the dupe output by grepping, say: $ cat dupes | grep -P "^ ." # note the space after ^, we want to get the dupes from folder B

With the same file listed as both A and B (and, weirdly, as C and D too), we would end up deleting a file that is not a dupe, which is bad.
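To avoid that, the output could be filtered before deleting anything. Below is a minimal sketch, not part of imgdupes: the function name `safe_deletion_candidates` is hypothetical, and it assumes the output format seen in this thread, where the first file of a group is unindented and every further file in the group starts with a space. It skips any indented path that is identical to the kept file or was already listed in the same group.

```python
import os


def safe_deletion_candidates(lines):
    """Return indented paths that differ from the kept file of their group.

    Assumption (taken from the output shown in this thread): the first
    file of a group is unindented, the other files start with a space.
    """
    candidates = []
    seen = None  # normalized paths already seen in the current group
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        path = os.path.normpath(line.strip())
        if not line.startswith(" "):
            # Unindented line: start of a new group, this file is kept.
            seen = {path}
            continue
        if seen is None:
            # Indented line with no group header before it: be cautious.
            continue
        if path in seen:
            # Same path as the kept file, or repeated: not a real dupe.
            continue
        seen.add(path)
        candidates.append(path)
    return candidates
```

It could be used as `safe_deletion_candidates(open("dupes"))`, with `dupes` being the saved output file from the grep example. Note that `os.path.normpath` drops the leading `./`, so the returned paths are relative to the directory where the scan was run.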

DrupaListo-com commented 6 years ago

More info: I installed version 2.0 of the software via pip3 today.

DrupaListo-com commented 6 years ago

Quote from https://github.com/jesjimher/imgdupes/issues/4 :

"Also, imgdupes seems to show the same file multiple times for HDR files re-developed by shotwell."

That seems to be the same issue as this one.

DrupaListo-com commented 6 years ago

Finally: the image that caused this bug is an all-white image that seems to be corrupted or otherwise broken in some way, which nicely coincides with the description above: "HDR files re-developed by shotwell". In my case, the program that modified the file was "digikam", a direct competitor of shotwell.

I've just tried the original JPEG file, from before digikam changed its metadata, and jpegdupes again output:

./IMG_20180819_193752.jpg 
 ./IMG_20180819_193752.jpg 
 ./IMG_20180819_193752.jpg 
 ./IMG_20180819_193752.jpg 

DrupaListo-com commented 6 years ago

... or, if it is not corrupted, it is at least visually an all-white image, which might be what is causing the bug.