rom1504 / clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them
https://rom1504.github.io/clip-retrieval/
MIT License

Support OpenCLIP models in `clip_back` #191

Closed: krasserm closed this 2 years ago

krasserm commented 2 years ago

This pull request extends clip_back to support OpenCLIP models, including the latest from the OpenCLIP 2.x releases (ViT-H/14, ...). For example, when running

clip-retrieval back --port 1234 --indices-paths indices_paths.json

with the following indices_paths.json file:

{
    "my-index": {
        "indice_folder": "my-index-folder",
        "clip_model": "open_clip:ViT-H-14",
        "use_jit": false,
        "columns_to_return": ["image_path", "caption"],
        "enable_mclip_option": false,
        "enable_faiss_memory_mapping": false,
        "enable_hdf5": true,
        "use_arrow": false,
        "reorder_metadata_by_ivf_index": false,
        "provide_safety_model": false,
        "provide_violence_detector": false,
        "provide_aesthetic_embeddings": false
    }
}

then queries are encoded with the recently released OpenCLIP ViT-H-14 model. The current limitation is that using OpenCLIP models requires provide_safety_model, provide_violence_detector and provide_aesthetic_embeddings to be set to false.
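To make the limitation concrete, here is a minimal Python sketch that builds the same indices_paths.json and refuses to write it when an `open_clip:` model is combined with one of the three unsupported options. The helper names (`make_index_config`, `check_open_clip_limitation`, `write_indices_paths`) are hypothetical and not part of clip-retrieval. The check is based only on the description above, and it assumes that an option missing from the file counts as disabled.

```python
import json
import tempfile
from pathlib import Path

# Options that, per the PR description, must be false for OpenCLIP models.
UNSUPPORTED_WITH_OPEN_CLIP = (
    "provide_safety_model",
    "provide_violence_detector",
    "provide_aesthetic_embeddings",
)


def make_index_config(indice_folder, clip_model):
    """Return one index entry with the same keys as the example file."""
    return {
        "indice_folder": indice_folder,
        "clip_model": clip_model,
        "use_jit": False,
        "columns_to_return": ["image_path", "caption"],
        "enable_mclip_option": False,
        "enable_faiss_memory_mapping": False,
        "enable_hdf5": True,
        "use_arrow": False,
        "reorder_metadata_by_ivf_index": False,
        "provide_safety_model": False,
        "provide_violence_detector": False,
        "provide_aesthetic_embeddings": False,
    }


def check_open_clip_limitation(indices):
    """Hypothetical check: list (index name, option) pairs that break the limitation."""
    problems = []
    for name, options in indices.items():
        if not str(options.get("clip_model", "")).startswith("open_clip:"):
            continue  # the limitation only concerns OpenCLIP models
        for flag in UNSUPPORTED_WITH_OPEN_CLIP:
            # Assumption: an option absent from the file is treated as disabled.
            if options.get(flag, False):
                problems.append((name, flag))
    return problems


def write_indices_paths(path, indices):
    """Validate the configuration, then write it as JSON."""
    problems = check_open_clip_limitation(indices)
    if problems:
        raise ValueError(f"options must be false for OpenCLIP models: {problems}")
    Path(path).write_text(json.dumps(indices, indent=4))


if __name__ == "__main__":
    config = {"my-index": make_index_config("my-index-folder", "open_clip:ViT-H-14")}
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "indices_paths.json"
        write_indices_paths(target, config)
        print(target.read_text())
```

The resulting file can then be passed to `clip-retrieval back --port 1234 --indices-paths indices_paths.json` as shown above.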

krasserm commented 2 years ago

I've just seen that this PR isn't compatible with the recent commits on main from today. Will fix it and push an update soon ...

krasserm commented 2 years ago

This should now work with the recent autocast additions. Appreciate any guidance to improve things further.

rom1504 commented 2 years ago

Hi, thanks for tackling this! I agree it's an important feature.

I think I would like to avoid putting a direct dependency from clip back to clip inference.

A simple way to avoid this could be to move the load_clip.py file outside of the inference folder.

krasserm commented 2 years ago

Yep, that's cleaner. Just pushed the changes.

rom1504 commented 2 years ago

Can you rebase this on main?

krasserm commented 2 years ago

Just rebased on main.

rom1504 commented 2 years ago

Thanks

krasserm commented 2 years ago

You're welcome, thanks for merging and, of course, for your great work on clip-retrieval!