khoj-ai / khoj

Your AI second brain. Get answers to your questions, whether they be online or in your own notes. Use online AI models (e.g. GPT-4) or private, local LLMs (e.g. Llama 3). Self-host locally or use our cloud instance. Access from Obsidian, Emacs, the Desktop app, the Web, or WhatsApp.
https://khoj.dev
GNU Affero General Public License v3.0

[Request] Allow for using OpenAI or other 3P to generate embeddings #593

Open · rvjosh opened this issue 9 months ago

rvjosh commented 9 months ago

I know this is likely the opposite of the requests you all usually get on here, but I have an Obsidian vault of ~2000 documents, and embedding it in a local Docker container is taking a long time (40 minutes and counting for the first 900 documents). I see that you use the "thenlper/gte-small" model by default for embeddings. Would you consider allowing third-party models called via API, such as OpenAI's text-embedding-ada-002, to generate the embeddings? Thank you for the great app!
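
For context, a minimal sketch of the kind of third-party call being requested here, using OpenAI's embeddings API. This is not Khoj's code; the model name and helper function are illustrative assumptions.

```python
# Sketch: generating document embeddings via OpenAI's embeddings API.
# Illustrative only; not how Khoj generates embeddings today.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def embed_documents(texts: list[str], model: str = "text-embedding-ada-002") -> list[list[float]]:
    """Return one embedding vector per input text."""
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]


vectors = embed_documents(["First note from my vault.", "Second note from my vault."])
print(len(vectors), len(vectors[0]))  # 2 vectors, e.g. 1536 dimensions for ada-002
```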

debanjum commented 8 months ago

Are you on a machine with a GPU? Embedding is pretty fast when running on GPU-enabled machines.

We haven't seen a strong enough need to support generating embeddings via an API yet, but we're open to considering it if a compelling reason comes up in the future.
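
For reference, local embedding with the default "thenlper/gte-small" model goes through the sentence-transformers library; the sketch below shows roughly how batched encoding on a GPU looks. The batch size and device string are assumptions for illustration, not Khoj's actual settings.

```python
# Sketch: embedding documents locally with the default model on a GPU.
# Assumes sentence-transformers is installed and CUDA is available;
# batch_size is illustrative, not Khoj's actual configuration.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("thenlper/gte-small", device="cuda")
documents = ["note one ...", "note two ...", "note three ..."]
embeddings = model.encode(documents, batch_size=64, show_progress_bar=True)
print(embeddings.shape)  # (num_documents, 384) for gte-small
```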

sabaimran commented 8 months ago

@rvjosh I've just added support for using a third-party embeddings API hosted on Hugging Face (if you configure the model to run as a sentence-embedding task). See PRs #609 and #616. This should be fairly extensible to using the OpenAI service directly (eventually), but it already works for any model served via Hugging Face. It will be available in the next release.
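
As a rough illustration of the hosted sentence-embedding setup described above, the sketch below queries the Hugging Face Inference API's feature-extraction pipeline. The model id, endpoint URL, and token handling are assumptions for illustration and may differ from how Khoj wires this up in the PRs.

```python
# Sketch: requesting sentence embeddings from a Hugging Face-hosted model.
# Model id and endpoint are illustrative; Khoj's own configuration may differ.
import os
import requests

API_URL = "https://api-inference.huggingface.co/pipeline/feature-extraction/thenlper/gte-small"
HEADERS = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}


def embed_remote(texts: list[str]) -> list[list[float]]:
    """Send texts to the hosted model and return one embedding per text."""
    response = requests.post(
        API_URL,
        headers=HEADERS,
        json={"inputs": texts, "options": {"wait_for_model": True}},
    )
    response.raise_for_status()
    return response.json()


print(len(embed_remote(["A note from my Obsidian vault."])[0]))  # e.g. 384 dimensions
```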