rom1504 / clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them
https://rom1504.github.io/clip-retrieval/
MIT License

OSError: Image file is truncated #257

Closed · juliawilkins closed this issue 10 months ago

juliawilkins commented 1 year ago

Hi, I am running clip inference with the following command:

clip-retrieval inference --input_dataset datasetname --output_folder outname --enable_text False --write_batch_size 100000

and am hitting the error below. It causes the embedding calculation to hang indefinitely, i.e. I'll see something like

sample_per_sec 469 ; sample_count 69632 
 sample_per_sec 453 ; sample_count 69632 
 sample_per_sec 298 ; sample_count 69632 
 sample_per_sec 259 ; sample_count 69632 
 ....

after the error below is first thrown. Is there anything I can do so that the process continues when this error occurs? I've tried loading all of my images with PIL.Image.open(fname) and can do this successfully, but the error seems to come from a resize step deep inside clip-retrieval. Please let me know if there is a way to bypass this. Thanks!

The error:

Traceback (most recent call last):
  File "/opt/conda/envs/avmod/bin/clip-retrieval", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/clip_retrieval/cli.py", line 18, in main
    fire.Fire(
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/clip_retrieval/clip_inference/main.py", line 154, in main
    distributor()
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/clip_retrieval/clip_inference/distributor.py", line 17, in __call__
    worker(
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/clip_retrieval/clip_inference/worker.py", line 122, in worker
    runner(task)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/clip_retrieval/clip_inference/runner.py", line 39, in __call__
    batch = iterator.__next__()
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/clip_retrieval/clip_inference/reader.py", line 207, in __iter__
    for batch in self.dataloader:
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1313, in _next_data
    return self._process_data(data)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
OSError: Caught OSError in DataLoader worker process 4.
Original Traceback (most recent call last):
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/PIL/ImageFile.py", line 249, in load
    s = read(self.decodermaxblock)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/PIL/PngImagePlugin.py", line 957, in load_read
    cid, pos, length = self.png.read()
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/PIL/PngImagePlugin.py", line 179, in read
    length = i32(s)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/PIL/_binary.py", line 85, in i32be
    return unpack_from(">I", c, o)[0]
struct.error: unpack_from requires a buffer of at least 4 bytes for unpacking 4 bytes at offset 0 (actual buffer size is 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/clip_retrieval/clip_inference/reader.py", line 91, in __getitem__
    image_tensor = self.image_transform(Image.open(image_file))
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 95, in __call__
    img = t(img)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 346, in forward
    return F.resize(img, self.size, self.interpolation, self.max_size, self.antialias)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torchvision/transforms/functional.py", line 474, in resize
    return F_pil.resize(img, size=output_size, interpolation=pil_interpolation)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/torchvision/transforms/functional_pil.py", line 252, in resize
    return img.resize(tuple(size[::-1]), interpolation)
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/PIL/Image.py", line 2156, in resize
    self.load()
  File "/opt/conda/envs/avmod/lib/python3.8/site-packages/PIL/ImageFile.py", line 256, in load
    raise OSError(msg) from e
OSError: image file is truncated
BIGBALLON commented 11 months ago

@juliawilkins Hello, I ran into the same problem. Did you ever find the cause or a solution? Thank you.

[UPDATE]

from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

Setting this flag resolved the issue for me.
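A minimal standalone demo of what the flag changes (the truncated PNG here is synthesized in memory, not taken from a real dataset). Note the flag is global Pillow state, and the missing image data is zero-filled, so embeddings computed from such images may be unreliable:

```python
import io
import os

from PIL import Image, ImageFile

# Build an in-memory PNG and cut off its tail to simulate a truncated file.
buf = io.BytesIO()
Image.frombytes("RGB", (32, 32), os.urandom(32 * 32 * 3)).save(buf, "PNG")
data = buf.getvalue()
truncated = data[: len(data) // 2]

# With the flag set, Pillow pads the missing data instead of raising
# "OSError: image file is truncated".
ImageFile.LOAD_TRUNCATED_IMAGES = True
img = Image.open(io.BytesIO(truncated))
img.load()  # would raise OSError without the flag
print(img.size)  # the partially decoded image is still usable
```

If you use this with a DataLoader started via `spawn`, set the flag inside the worker (e.g. at module import time in the dataset code), since spawned workers do not inherit it from the parent process.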

rom1504 commented 10 months ago

this is fixed now