Closed by audiomuze 1 month ago
I've just run it against another group of files, and on this occasion difPy reported no lower-quality images, whereas in reality there were many instances of a smaller image alongside a larger, higher-resolution one.
Perhaps the easiest way to illustrate would be for me to send you the image files to run against and compare results locally?
@elisemercury, just flagging in case you missed this?
Hi @audiomuze
Thanks so much for flagging these issues! They will be investigated and considered with the next difPy release.
Thanks again! Best, Elise
Hi @audiomuze,
difPy v4.1.0 has been released, and I would recommend testing it on your dataset to see whether it brings any improvements. The new version comes with an improved comparison algorithm.
Feel free to reach out if the issue should still persist.
Thanks, Elise
@elisemercury, I've just pulled and tested your latest commit and have encountered what I assume are bugs.
Running:
python /home/x/git/Duplicate-Image-Finder/difPy/dif.py --directory /mnt/sdc/2tag/ --output_directory /tmp --recursive True --limit_extensions True --show_progress True
Edited extract from /tmp/difPy_20230927222221_lower_quality.json:
If there are x identical lower-quality files (i.e. their md5sum is identical) in the same folder, plus one of superior quality, difPy flags only one of the lower-quality files rather than all of them.
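To make the expected behaviour concrete, here is a minimal sketch of what I would expect to happen for one group of matched images. It is not difPy's actual code: the function name `split_by_quality`, the `(path, width, height)` tuple layout, and the file names are all hypothetical, and it assumes that pixel count is the measure of quality.

```python
def split_by_quality(group):
    """Split one group of matching images into (best, lower_quality).

    `group` is a list of (path, width, height) tuples (hypothetical layout).
    The image with the largest pixel count is kept as `best`; every other
    image in the group is reported as lower quality, including byte-identical
    copies of each other.
    """
    if not group:
        raise ValueError("group must contain at least one image")
    # Python's sort is stable, so equally sized images keep their input order.
    ranked = sorted(group, key=lambda item: item[1] * item[2], reverse=True)
    best = ranked[0][0]
    lower_quality = [path for path, _, _ in ranked[1:]]
    return best, lower_quality


# Hypothetical example: three identical small covers and one larger cover.
example_group = [
    ("cover_small_1.jpg", 300, 300),
    ("cover_large.jpg", 1200, 1200),
    ("cover_small_2.jpg", 300, 300),
    ("cover_small_3.jpg", 300, 300),
]
best, lower_quality = split_by_quality(example_group)
```

In this example I would expect all three small files to end up in the lower-quality list, not just one of them.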
As an observation, stats.json contains many instances of
"ImageFilterWarning: invalid image extension."
This suggests to me that non-image files are still being assessed, rather than being skipped as the --limit_extensions True
switch shown above should ensure. It looks like there is a further opportunity to improve performance by ignoring non-image extensions up front.
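As a rough illustration of the kind of pre-filtering I mean, the sketch below drops files by extension before any image is opened. It is only an assumption about how this could be done, not how difPy implements `--limit_extensions`: the helper name `iter_image_paths` and the extension set are made up for the example.

```python
from pathlib import Path

# Assumed set of image extensions; difPy's real list may differ.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tif", ".tiff", ".webp"}


def iter_image_paths(directory, recursive=True):
    """Yield only files whose extension looks like an image.

    Files with other extensions are skipped without being opened, so they
    never reach the (more expensive) image decoding step.
    """
    root = Path(directory)
    candidates = root.rglob("*") if recursive else root.glob("*")
    for path in candidates:
        # Compare case-insensitively so "COVER.JPG" is accepted too.
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            yield path
```

With something like this in place, a folder full of .flac or .txt files alongside the covers would contribute no warnings and no processing time.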