lagerspetz closed this issue 4 years ago
Hi!
Please feel free to send any pull requests you like. Your fixes for wrong JPGs are great, and I have been thinking about an automatic mode for a long time; your criteria (the one with the most tags, or the shorter path) sound good to me.
I don't see the point in doing whole-file MD5 comparisons, though, since other utilities (like fdupes) already do this job just fine, and I think it would be reinventing the wheel a little bit. My way of using imgdupes is to run fdupes first and, once I'm sure there are no whole-file duplicates, use imgdupes to detect re-tagged duplicates and such. fdupes is much more mature than imgdupes, and hence probably smarter and faster than anything I could code to achieve the same result, so I'm happy to keep them as separate tools without overlapping functionality.
Thanks for your work!
Hi, my reason for using only imgdupes is to avoid extra work: if the signature cache exists, re-running imgdupes is quick when some new files are added. I don't know whether fdupes allows this; I have never tried it. If there is a library for it, I could use that first, then compare JPEG data blocks only for the JPEGs that method does not flag as duplicates. All of this would then be in the signature cache, so when I sync new photos from various computers with Shotwell, I can very quickly compute the hashes of just the new files and eliminate dupes.
Eemil Lagerspetz
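To make the incremental idea above concrete, here is a minimal sketch of a whole-file MD5 cache that only hashes files that are new or changed since the last run. It is not imgdupes' actual signature cache: the function names (`file_md5`, `update_cache`, `duplicate_groups`) and the cache layout (path mapped to size, mtime and digest) are assumptions made for illustration, and it uses only the Python standard library.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a whole file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def update_cache(root, cache):
    """Hash only files under root that are new or changed; return how many were hashed.

    Hypothetical cache layout: {path: {"size": int, "mtime": float, "md5": str}}.
    Assumes the cache covers a single root, so entries for vanished files are dropped.
    """
    hashed = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            seen.add(path)
            entry = cache.get(path)
            unchanged = (
                entry is not None
                and entry["size"] == info.st_size
                and entry["mtime"] == info.st_mtime
            )
            if unchanged:
                continue  # cached digest is still valid, skip the expensive read
            cache[path] = {
                "size": info.st_size,
                "mtime": info.st_mtime,
                "md5": file_md5(path),
            }
            hashed += 1
    for stale in set(cache) - seen:
        del cache[stale]
    return hashed


def duplicate_groups(cache):
    """Group cached paths by digest and keep only groups with more than one file."""
    groups = defaultdict(list)
    for path, entry in cache.items():
        groups[entry["md5"]].append(path)
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)
```

The cache dictionary could be persisted between runs (for example with `json`), so that syncing a handful of new photos only costs hashing those files. Files that are not whole-file duplicates would still need the JPEG data-block comparison to catch re-tagged copies.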
(Sent by email in reply to Jesus Jimenez's comment of 07/01/2016 16:32 (GMT+02:00) on issue #4.)
Have you thought about doing whole-file MD5 for other image types such as png and nef?
I have forked your code and made some adjustments. I have some files that crash imgdupes because they have "truncated jpg block" data. Also, imgdupes seems to show the same file multiple times for HDR files re-developed by Shotwell. Choosing one to keep then fails with an error saying that it cannot delete the extras, e.g.
If you are still interested in this project, I'm planning to send you some PRs for: