AmenRa / retriv

A Python Search Engine for Humans 🥸
MIT License
174 stars, 20 forks

Image search #13

Closed by karelin 1 year ago

karelin commented 1 year ago

Hi, thank you for a nice Elastic/Pinecone replacement 🙂 A small question (or perhaps a feature request): is it possible to use different neural networks for indexing and retrieval? For example, with a CLIP model, one first computes the image vectors with the image encoder and then uses the second part of the same model, the text encoder, to encode text queries.

AmenRa commented 1 year ago

Hi and thank you!

I don't have much time now, but the use case you brought up is very interesting!

I think you should be able to do that if you pre-compute the image embeddings, create a dummy collection, and set the text encoder correctly.

Here you can find instructions on how to load pre-computed embeddings.

For the dummy collection, I would load something like:

collection = [
  {"id": "img_1", "text": "path/to/img_1"},
  {"id": "img_2", "text": "path/to/img_2"},
  {"id": "img_3", "text": "path/to/img_3"},
  {"id": "img_4", "text": "path/to/img_4"},
]
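
To make the idea concrete, here is a minimal, library-free sketch of the approach described above. It does not use retriv or CLIP: the embedding values are toy numbers standing in for vectors pre-computed offline by an image encoder, and `encode_text`, `cosine`, and `search` are hypothetical helpers introduced only for illustration. The only real requirement it demonstrates is that the text encoder must map queries into the same vector space as the pre-computed image embeddings, and that the embeddings are kept in the same order as the dummy collection.

```python
import math

# Dummy collection: the "text" field only carries the image path.
collection = [
    {"id": "img_1", "text": "path/to/img_1"},
    {"id": "img_2", "text": "path/to/img_2"},
    {"id": "img_3", "text": "path/to/img_3"},
    {"id": "img_4", "text": "path/to/img_4"},
]

# Toy stand-in for image embeddings pre-computed offline (assumption:
# one vector per document, in the same order as the collection).
image_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.7, 0.7, 0.0],
]

# Hypothetical text encoder: a real setup would call the text tower of the
# model that produced the image embeddings. This toy version sums fixed
# per-word vectors so the example stays deterministic.
TOY_VOCAB = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}


def encode_text(query):
    vec = [0.0, 0.0, 0.0]
    for token in query.lower().split():
        for i, value in enumerate(TOY_VOCAB.get(token, [0.0, 0.0, 0.0])):
            vec[i] += value
    return vec


def cosine(a, b):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def search(query, cutoff=2):
    # Encode the query with the text encoder, then rank the pre-computed
    # image embeddings by similarity to it.
    q = encode_text(query)
    scored = [
        (doc["id"], doc["text"], cosine(q, emb))
        for doc, emb in zip(collection, image_embeddings)
    ]
    scored.sort(key=lambda row: row[2], reverse=True)
    return scored[:cutoff]


results = search("cat")
```

With these toy values, `search("cat")` ranks `img_1` first and `img_4` second. In a real setup, the ranking step would be handled by the retriever once the pre-computed embeddings are loaded and the matching text encoder is configured.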

Let me know if you manage to make it work! :)

karelin commented 1 year ago

Thank you, I had missed that it is possible to load pre-computed embeddings. Nice to know! For the document texts, though, I'd prefer to use the descriptions and tags.

AmenRa commented 1 year ago

Hi, I have some ideas for image search. Unfortunately, I do not have time to dig into it right now, so I'm closing this for now. I'll let you know if there is any update on this. Thank you.