rom1504 / clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them
https://rom1504.github.io/clip-retrieval/
MIT License

A possible solution to solve KeyError: 'xxx.txt' when using “clip-retrieval inference” command #352

Open ShuxunoO opened 5 months ago

ShuxunoO commented 5 months ago

I hit the same error as https://github.com/rom1504/clip-retrieval/issues/345 when using the clip-retrieval inference command to extract features for images and their corresponding texts. My command was:

clip-retrieval inference \
--input_dataset /path/to/local/img-txt-dataset \
--output_folder /path/to/local/embeddings \
--input_format files \
--enable_text True \
--enable_image True \
--clip_model open_clip:ViT-L-14//path/to/local/model.pt

My local directory structure is as follows:

/xxx/BAYC
        BoredApeYachtClub_0.png    BoredApeYachtClub_0.txt
        BoredApeYachtClub_11.png   BoredApeYachtClub_11.txt
        BoredApeYachtClub_12.png   BoredApeYachtClub_12.txt
        BoredApeYachtClub_13.png   BoredApeYachtClub_13.txt
        BoredApeYachtClub_16.png   BoredApeYachtClub_16.txt
        BoredApeYachtClub_17.png   BoredApeYachtClub_17.txt
        ……

and the output traceback is:

Traceback (most recent call last):
  File "/xxx/anaconda3/envs/it-retrieval/bin/clip-retrieval", line 8, in <module>
    sys.exit(main())
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/clip_retrieval/cli.py", line 18, in main
    fire.Fire(
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/clip_retrieval/clip_inference/main.py", line 155, in main
    distributor()
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/clip_retrieval/clip_inference/distributor.py", line 17, in __call__
    worker(
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/clip_retrieval/clip_inference/worker.py", line 127, in worker
    runner(task)
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/clip_retrieval/clip_inference/runner.py", line 39, in __call__
    batch = iterator.__next__()
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/clip_retrieval/clip_inference/reader.py", line 225, in __iter__
    for batch in self.dataloader:
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/torch/_utils.py", line 644, in reraise
    raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/site-packages/clip_retrieval/clip_inference/reader.py", line 99, in __getitem__
    image_file = self.image_files[key]
KeyError: 'BoredApeYachtClub_0.txt'

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/xxx/anaconda3/envs/it-retrieval/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory

——————————————————————————————————————————————————————————

def __getitem__(self, ind):
  key = self.keys[ind]
  output = {}

  if self.enable_image:
    image_file = self.image_files[key]
    try:
      image_tensor = self.image_transform(Image.open(image_file))
                      ……

After analyzing this, I believe the problem is that "key" still carries the ".txt" suffix at the location shown above, so the lookup in the image dictionary fails. That dictionary can only hold entries with image extensions, because the source code collects images with these extensions: ".png", ".jpg", ".jpeg", ".bmp", ".webp", ".PNG", ".JPG", ".JPEG", ".BMP", ".WEBP".
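The mismatch described above can be reproduced in isolation. This is a minimal sketch, not the library's code: it assumes keys are built as the relative POSIX path including the extension, and the file names simply mirror the folder layout from this issue.

```python
from pathlib import PurePosixPath

# Hypothetical paths mirroring the folder layout shown in this issue.
root = PurePosixPath("/xxx/BAYC")
image_path = root / "BoredApeYachtClub_0.png"
text_path = root / "BoredApeYachtClub_0.txt"

# Assumption: each key is the relative path WITH its extension.
image_files = {image_path.relative_to(root).as_posix(): image_path}
text_key = text_path.relative_to(root).as_posix()

# A key derived from the text file never matches an image key.
print(text_key in image_files)  # False

lookup_error = None
try:
    image_files[text_key]
except KeyError as err:
    lookup_error = err
    print("KeyError:", err)  # KeyError: 'BoredApeYachtClub_0.txt'
```

As long as the extension is part of the key, a text-derived key can never match an image entry, which is consistent with the KeyError in the traceback.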

To elaborate, the function folder_to_keys(folder, enable_text=True, enable_image=True, enable_metadata=False) uses file names that still include their suffixes as keys when it builds the dictionaries "text_files", "image_files", and "metadata_files". The keys should keep only the file name with the suffix removed. Here is my modified version of the code:

from pathlib import Path


def folder_to_keys(folder, enable_text=True, enable_image=True, enable_metadata=False):
    """returns a list of keys from a folder of images and text"""
    path = Path(folder)
    text_files = None
    metadata_files = None
    image_files = None
    if enable_text:
        text_files = [*path.glob("**/*.txt")]
        # Key on the path relative to the root with the suffix removed,
        # so that a text file and its image share the same key.
        text_files = {text_file.relative_to(path).with_suffix('').as_posix(): text_file for text_file in text_files}
    if enable_image:
        image_files = [
            *path.glob("**/*.png"),
            *path.glob("**/*.jpg"),
            *path.glob("**/*.jpeg"),
            *path.glob("**/*.bmp"),
            *path.glob("**/*.webp"),
            *path.glob("**/*.PNG"),
            *path.glob("**/*.JPG"),
            *path.glob("**/*.JPEG"),
            *path.glob("**/*.BMP"),
            *path.glob("**/*.WEBP"),
        ]
        image_files = {image_file.relative_to(path).with_suffix('').as_posix(): image_file for image_file in image_files}
    if enable_metadata:
        metadata_files = [*path.glob("**/*.json")]
        metadata_files = {metadata_file.relative_to(path).with_suffix('').as_posix(): metadata_file for metadata_file in metadata_files}

    keys = None

    def join(new_set):
        return new_set & keys if keys is not None else new_set

    if enable_text:
        keys = join(text_files.keys())
    if enable_image:
        keys = join(image_files.keys())
    if enable_metadata:
        keys = join(metadata_files.keys())

    keys = list(sorted(keys))

    return keys, text_files, image_files, metadata_files
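The key-building idea in the function above can be checked on a throwaway folder. This is a sketch under stated assumptions: `suffix_free_keys` is a hypothetical helper written for this example (it is not part of clip-retrieval), and the file names are made up.

```python
import tempfile
from pathlib import Path


def suffix_free_keys(folder, patterns):
    """Map 'relative/path/without_suffix' -> file path for files matching the glob patterns."""
    root = Path(folder)
    return {
        p.relative_to(root).with_suffix("").as_posix(): p
        for pattern in patterns
        for p in root.glob(pattern)
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    # Two image/text pairs (one in a subfolder) plus an image without a caption.
    for name in ["ape_0.png", "ape_0.txt", "sub/ape_0.png", "sub/ape_0.txt", "orphan.png"]:
        (root / name).touch()

    image_files = suffix_free_keys(root, ["**/*.png"])
    text_files = suffix_free_keys(root, ["**/*.txt"])
    # Only keys present in both dictionaries survive the intersection.
    keys = sorted(image_files.keys() & text_files.keys())

print(keys)  # ['ape_0', 'sub/ape_0']
```

One caveat with this approach: two images that differ only in extension, such as x.png and x.jpg in the same folder, map to the same key, so one silently replaces the other in the dictionary.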

After modifying the code, the inference process went smoothly and I successfully obtained the corresponding feature vectors for both images and texts.

I hope this helps other users who run into the same error!

rom1504 commented 5 months ago

Can you read https://github.com/rom1504/clip-retrieval/pull/329 and propose a fix that makes things work without breaking what that PR fixed?

ShuxunoO commented 5 months ago

Can you read #329 and propose a fix that makes things work without breaking what that PR fixed?

Sure. My local folder layout and the command-line output are attached as screenshots.

This output is expected: the code builds each key from the file's path relative to the root directory, so every dictionary key includes the file's subdirectories, however deeply it is nested.

text_files = {text_file.relative_to(path).as_posix(): text_file for text_file in text_files}
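To make the effect of that line concrete, the sketch below compares three ways of deriving a key from the same files. The root and file names are hypothetical, and the bare-stem variant is shown only for contrast; it is not a claim about what the library did before #329.

```python
from pathlib import PurePosixPath

root = PurePosixPath("/data/root")  # hypothetical dataset root
files = [
    root / "a.txt",
    root / "level1" / "a.txt",
    root / "level1" / "level2" / "a.txt",
]

# Relative path including the extension, as in the line quoted above.
keys_with_suffix = [f.relative_to(root).as_posix() for f in files]
# Relative path with the extension stripped, as in the proposed change.
keys_without_suffix = [f.relative_to(root).with_suffix("").as_posix() for f in files]
# Bare file stem, for contrast: subdirectory information is lost.
stems = [f.stem for f in files]

print(keys_with_suffix)     # ['a.txt', 'level1/a.txt', 'level1/level2/a.txt']
print(keys_without_suffix)  # ['a', 'level1/a', 'level1/level2/a']
print(stems)                # ['a', 'a', 'a']
```

Stripping only the suffix keeps same-named files in different subdirectories distinct, whereas a bare stem would collapse them into one key.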

ShuxunoO commented 5 months ago

Can you read #329 and propose a fix that makes things work without breaking what that PR fixed?

Should I open another PR?