rom1504 / clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them
https://rom1504.github.io/clip-retrieval/
MIT License

Is there a way to extend the clip-back server #287

Closed Sadeeed closed 10 months ago

Sadeeed commented 1 year ago

I am implementing this for an app and I'd like to get more information from the images. Is there a way for me to extend the functionality without modifying clip-retrieval itself?

rom1504 commented 1 year ago

All metadata columns are provided to the front end, so if you modify the front end to display what you want, it'll work.

Sadeeed commented 1 year ago

I am only using the backend APIs, with a custom dataset created with img2dataset and with embeddings and indices created with clip-retrieval. The issue is that when I create the dataset from my DB, it loses some information, such as the seed, image_url, etc. I'd like to keep that info in the generated dataset, assuming it can be fetched later on with either the knn-service endpoint or the metadata endpoint.

Is there a way to do this?

rom1504 commented 1 year ago

Yes: --save_additional_columns in img2dataset and --save_metadata in clip back inference.

Sadeeed commented 1 year ago

Okay, thank you.

Sadeeed commented 1 year ago

I am using this command to generate a dataset with additional columns:

img2dataset "csv/anything-v3.csv" --input_format="csv" --url_col "image_url" --caption_col "prompt" --save_additional_columns=["seed", "model_version"] --output_folder="test" --resize_mode="no" --output_format="files" 

but it throws this error:

Traceback (most recent call last):
  File "/Users/icesoup/Documents/Source/Python/Samaritan/prompt-search/venv/bin/img2dataset", line 8, in <module>
    sys.exit(main())
  File "/Users/icesoup/Documents/Source/Python/Samaritan/prompt-search/venv/lib/python3.10/site-packages/img2dataset/main.py", line 262, in main
    fire.Fire(download)
  File "/Users/icesoup/Documents/Source/Python/Samaritan/prompt-search/venv/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/icesoup/Documents/Source/Python/Samaritan/prompt-search/venv/lib/python3.10/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/icesoup/Documents/Source/Python/Samaritan/prompt-search/venv/lib/python3.10/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/icesoup/Documents/Source/Python/Samaritan/prompt-search/venv/lib/python3.10/site-packages/img2dataset/main.py", line 175, in download
    reader = Reader(
  File "/Users/icesoup/Documents/Source/Python/Samaritan/prompt-search/venv/lib/python3.10/site-packages/img2dataset/reader.py", line 67, in __init__
    self.column_list = self.column_list + ["caption"]
TypeError: can only concatenate str (not "list") to str
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/icesoup/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/icesoup/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/Users/icesoup/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Segmentation fault: 11

Also, if you could provide an example, I'd be very grateful.

rom1504 commented 1 year ago

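--save_additional_columns=["seed", "model_version"] should be --save_additional_columns='["seed", "model_version"]' due to how bash works: without the single quotes, bash strips the double quotes and splits the argument at the space, so img2dataset receives a string fragment instead of a list.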
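
The quoting difference can be reproduced without running img2dataset. The sketch below uses Python's shlex module to tokenize both variants the way a POSIX shell would, then uses ast.literal_eval as a stand-in for a CLI parser that tries to read each value as a Python literal. That stand-in, and the helper name parse_flag_value, are assumptions made for illustration; they are not img2dataset or Fire code.

```python
import ast
import shlex


def parse_flag_value(command_line_fragment):
    """Tokenize like a POSIX shell, then read the first flag's value.

    Hypothetical helper: the value is parsed as a Python literal when
    possible and kept as a plain string otherwise.
    """
    tokens = shlex.split(command_line_fragment)
    value = tokens[0].split("=", 1)[1]
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        parsed = value  # not a valid literal, so it stays a string
    return tokens, parsed


# Unquoted: the shell removes the double quotes and splits at the space.
unquoted_tokens, unquoted_value = parse_flag_value(
    '--save_additional_columns=["seed", "model_version"]'
)

# Single-quoted: the whole list reaches the program as one argument.
quoted_tokens, quoted_value = parse_flag_value(
    """--save_additional_columns='["seed", "model_version"]'"""
)

print(unquoted_tokens)
print(quoted_tokens)

# Appending a list to the unquoted value fails, as in the traceback.
try:
    unquoted_value + ["caption"]
    concat_failed = False
except TypeError:
    concat_failed = True

print(concat_failed)
print(quoted_value + ["caption"])
```

In the unquoted case the value is a broken string fragment, which is why appending ["caption"] to it raises a TypeError; in the quoted case the value is a real list and the concatenation works.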

Sadeeed commented 1 year ago

Sorry to keep annoying you, but I can't get save_metadata working and I couldn't find anything about it in the docs.

I'm using this command:

clip-retrieval inference --input_dataset test --save_metadata='true' --output_folder test_embeddings

I have also tried --save_metadata='yes' and --save_metadata true, but it throws the same error:

ERROR: Could not consume arg: --save_metadata=yes

rom1504 commented 1 year ago

https://github.com/rom1504/clip-retrieval#api

enable_metadata
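
As a rough sketch of what that reply implies for the failing command above, the snippet below rebuilds it with --enable_metadata in place of --save_metadata. The --input_dataset and --output_folder values come from the earlier comment. Passing the flag as "--enable_metadata True" is an assumption about how the boolean is supplied, and build_inference_command is a hypothetical helper, not part of clip-retrieval.

```python
import shlex


def build_inference_command(input_dataset, output_folder, enable_metadata=True):
    """Build the argument list for a clip-retrieval inference run.

    Hypothetical helper for illustration; it only assembles arguments
    and does not call clip-retrieval itself.
    """
    args = [
        "clip-retrieval",
        "inference",
        "--input_dataset",
        input_dataset,
        "--output_folder",
        output_folder,
    ]
    if enable_metadata:
        # Assumption: the boolean option is passed as "--enable_metadata True".
        args += ["--enable_metadata", "True"]
    return args


command = build_inference_command("test", "test_embeddings")
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run as-is, which avoids shell quoting issues because no shell parses the arguments.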

Sadeeed commented 1 year ago

thank you once again for the help 🙇