jhc13 / taggui

Tag manager and captioner for image datasets
GNU General Public License v3.0

IndexError: index out of range by exifread.process_file #175

Closed by geroldmeisinger 1 month ago

geroldmeisinger commented 1 month ago
Traceback (most recent call last):
  File "~/taggui/taggui/run_gui.py", line 47, in <module>
    raise exception
  File "~/taggui/taggui/run_gui.py", line 37, in <module>
    run_gui()
  File "~/taggui/taggui/run_gui.py", line 23, in run_gui
    main_window = MainWindow(app)
                  ^^^^^^^^^^^^^^^
  File "~/taggui/taggui/widgets/main_window.py", line 177, in __init__
    self.restore()
  File "~/taggui/taggui/widgets/main_window.py", line 571, in restore
    self.load_directory(
  File "~/taggui/taggui/widgets/main_window.py", line 209, in load_directory
    self.image_list_model.load_directory(path)
  File "~/taggui/taggui/models/image_list_model.py", line 130, in load_directory
    exif_tags = exifread.process_file(
                ^^^^^^^^^^^^^^^^^^^^^^
  File "~/taggui/venv/lib/python3.11/site-packages/exifread/__init__.py", line 137, in process_file
    offset, endian, fake_exif = _determine_type(fh)
                                ^^^^^^^^^^^^^^^^^^^
  File "~/taggui/venv/lib/python3.11/site-packages/exifread/__init__.py", line 114, in _determine_type
    offset, endian, fake_exif = find_jpeg_exif(fh, data, fake_exif)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/taggui/venv/lib/python3.11/site-packages/exifread/jpeg.py", line 124, in find_jpeg_exif
    base = _get_base(base, data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "~/taggui/venv/lib/python3.11/site-packages/exifread/jpeg.py", line 70, in _get_base
    logger.debug("  Length: 0x%X 0x%X", ord_(data[base + 2]), ord_(data[base + 3]))
                                             ~~~~^^^^^^^^^^
IndexError: index out of range


with this file: 000029685

a simple catch-all should fix it:

except Exception as exception:
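
As a rough illustration of that catch-all, here is a minimal sketch that turns a failed EXIF read into an empty result instead of a crash. The helper name read_exif_tags_safely and the process_file parameter are hypothetical and not part of taggui; the parser is passed in (for example exifread.process_file) so the sketch runs without exifread installed. How taggui actually handles the error may differ.

```python
import logging
from pathlib import Path
from typing import Any, BinaryIO, Callable

logger = logging.getLogger(__name__)


def read_exif_tags_safely(
    image_path: Path,
    process_file: Callable[[BinaryIO], dict[str, Any]],
) -> dict[str, Any]:
    """Return the EXIF tags of image_path, or an empty dict if parsing fails.

    process_file is a hypothetical parameter: any callable that takes an
    open binary file and returns a dict of tags, such as
    exifread.process_file.
    """
    try:
        with open(image_path, 'rb') as image_file:
            return process_file(image_file)
    except Exception as exception:  # catch-all, as suggested in the issue
        # Log and continue so one malformed file cannot abort the whole load.
        logger.warning('Could not read EXIF data from %s: %s',
                       image_path, exception)
        return {}
```

With exifread installed, a call would look like read_exif_tags_safely(path, exifread.process_file). Catching Exception rather than only IndexError is deliberately broad, on the assumption that a malformed file can fail inside the parser in more than one way.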

The app crashes. When I restart it, it tries to load the same directory again and crashes a second time. After that it starts again, but all settings have been deleted.

(By the way, I think you should sort the file list before processing; it makes the behavior a little more predictable.)

geroldmeisinger commented 1 month ago

sort the file list before processing

LLM:

You can use the pathlib module in Python to recursively get all file paths in a directory and sort them. Here's a complete example:

from pathlib import Path

def get_all_file_paths_sorted(directory):
    # Get all file paths recursively
    file_paths = [p for p in Path(directory).rglob('*') if p.is_file()]

    # Sort the file paths
    file_paths_sorted = sorted(file_paths, key=str)

    return file_paths_sorted

Python's set type does not maintain order: you can sort the items before storing them in a set, but once they are in the set they are no longer ordered. To get sorted, set-like behavior, you can use SortedSet from the third-party sortedcontainers library, which keeps its items in sorted order:

from sortedcontainers import SortedSet

file_paths = SortedSet(str(p) for p in Path(directory).rglob('*') if p.is_file())
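
If adding a dependency is not wanted, a standard-library-only alternative is to keep the set for deduplication and sort it only at the point where the files are processed. The sketch below assumes the paths are collected into a set first; the function name sorted_unique_file_paths is hypothetical and not taken from taggui.

```python
from pathlib import Path


def sorted_unique_file_paths(directory: Path) -> list[Path]:
    """Collect all files under directory recursively, deduplicated and sorted."""
    # A set removes duplicates but has no defined iteration order.
    unique_paths = {path for path in Path(directory).rglob('*')
                    if path.is_file()}
    # Path objects of the same flavour are comparable, so sorted() needs no
    # key function and returns a list in a deterministic order.
    return sorted(unique_paths)
```

The result is a plain list, so the processing order is the same on every run. Unlike a SortedSet, it is not kept sorted if more paths are added later.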

jhc13 commented 1 month ago

sort the file list before processing

Maybe I'll consider this when I deal with #181.