elisemercury / Duplicate-Image-Finder

difPy - Python package for finding duplicate or similar images within folders
https://difpy.readthedocs.io
MIT License

gracefully skip over deleted files #17

Status: Closed (Pomax closed 2 years ago)

Pomax commented 2 years ago

Sort of related to #16: running this on 10,000 images while another process automatically writes or moves data into and out of the same directory once an hour can cause difPy to try to load images that existed when it built its file list but are gone by the time it actually gets to them. Right now, that causes a hard crash:

Traceback (most recent call last):
  File "d:\temp\diftest.py", line 4, in <module>
    search = dif("./inbox/_reviewed")
  File "d:\temp\venv\lib\site-packages\difPy\dif.py", line 50, in __init__
    result, lower_quality = dif._search_one_dir(directory_A,
  File "d:\temp\venv\lib\site-packages\difPy\dif.py", line 113, in _search_one_dir
    high, low = dif._check_img_quality(directory_A, directory_A, filenames_A[count_A], filenames_A[count_B])
  File "d:\temp\venv\lib\site-packages\difPy\dif.py", line 261, in _check_img_quality
    size_imgA = os.stat(dirA + imageA).st_size
FileNotFoundError: [WinError 2] The system cannot find the file specified: './inbox/_reviewed\\rz3b9hl0yio81.jpg'

Instead, it should probably just print something like "skipping rz3b9hl0yio81.jpg: could not find file (did it get moved/deleted?)" and keep running.
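To illustrate the requested behaviour, here is a minimal sketch of a size comparison that tolerates files disappearing between listing and access. It is not difPy's actual code: `safe_file_size` and `pick_higher_quality` are hypothetical names, and the assumption that the quality check simply compares file sizes is inferred only from the `os.stat(...).st_size` call in the traceback.

```python
import os


def safe_file_size(path):
    """Return the file size in bytes, or None if the file no longer exists."""
    try:
        return os.stat(path).st_size
    except FileNotFoundError:
        # The file was listed earlier but has since been moved or deleted.
        name = os.path.basename(path)
        print(f"skipping {name}: could not find file (did it get moved/deleted?)")
        return None


def pick_higher_quality(path_a, path_b):
    """Hypothetical stand-in for the quality check.

    Returns (higher, lower) ordered by file size, or None if either
    file has vanished, so the caller can skip the pair and continue.
    """
    size_a = safe_file_size(path_a)
    size_b = safe_file_size(path_b)
    if size_a is None or size_b is None:
        return None
    return (path_a, path_b) if size_a >= size_b else (path_b, path_a)
```

A caller looping over image pairs would check for `None` and `continue` to the next pair instead of letting `FileNotFoundError` propagate and abort the whole run.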

elisemercury commented 2 years ago

Hi @Pomax! Thank you for your input and for reporting this issue. difPy only gets better thanks to users like you reporting bugs and sharing their ideas! I will work on resolving this in the next difPy update. Thank you and all the best, Elise