Mintplex-Labs / vector-admin

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.
https://vectoradmin.com/
MIT License

What embeddings model is used for the uploaded documents? #61

Open algsupport opened 12 months ago

algsupport commented 12 months ago

I saw the project on YouTube. It seems excellent.

I was wondering: when uploading a new document, which embedding model is used to convert it?

Is it possible to choose a custom one? If so, how can it be done?

Thank you.

timothycarambat commented 12 months ago

Custom embedding models aren't supported at this time, simply because we haven't expanded the scope yet. Right now it's just the standard 1536-dimension text-embedding-ada-002 from OpenAI. Obviously not everyone uses that, and if you tried to edit or add a doc whose dimensions don't match, it would stop you from doing so.
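
To illustrate the dimension constraint described above, here is a minimal sketch of such a guard. It is not taken from the vector-admin codebase: `EXPECTED_DIMENSIONS`, `assertDimensionsMatch`, and `canInsert` are hypothetical names, and the only fact assumed from the thread is the 1536-dimension vector size.

```typescript
// Hypothetical guard, not part of vector-admin: rejects vectors whose length
// differs from the dimension the collection was created with.
const EXPECTED_DIMENSIONS = 1536; // size of the OpenAI vectors mentioned in the thread

function assertDimensionsMatch(
  vector: number[],
  expected: number = EXPECTED_DIMENSIONS
): void {
  if (vector.length !== expected) {
    throw new Error(
      `Embedding has ${vector.length} dimensions, but the collection expects ${expected}.`
    );
  }
}

// Convenience wrapper that reports the outcome as a boolean instead of throwing.
function canInsert(vector: number[]): boolean {
  try {
    assertDimensionsMatch(vector);
    return true;
  } catch {
    return false;
  }
}

// A 1536-length vector passes; a shorter one, such as the 384-length vectors
// produced by some small sentence-transformer models, is rejected.
const accepted = canInsert(new Array<number>(1536).fill(0));
const rejected = canInsert(new Array<number>(384).fill(0));
```

This is why mixing embedding models inside one collection fails: the stored vectors and the new vector must share the same length before any similarity search makes sense.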

Are you using a Hugging Face model for embeddings?

algsupport commented 12 months ago

Yes, I would like to use a Hugging Face model if possible. text-embedding-ada-002 works too, but it would be more convenient to be able to select the embedding model.

Would you mind if I try to add it myself? (Of course, I will make a pull request if I succeed.) Could you point me to the part of the code responsible for the embeddings CRUD?

Thank you

timothycarambat commented 12 months ago

It is used in several areas (currently one job for each vector DB).

If you search for anywhere openAi.embedTextChunk or openAi.embedTextChunks is called, those are the only places embeddings are currently generated!
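
One way to make those call sites swappable is to put them behind a small interface. The sketch below is only an assumption about how that could look: the method names mirror the openAi.embedTextChunk and openAi.embedTextChunks calls mentioned above, but the signatures, the `Embedder` interface, `FakeEmbedder`, and `getEmbedder` are all hypothetical and do not come from the repository.

```typescript
// Assumed shape: the method names mirror the calls named in the thread,
// but these signatures are guesses, not the real ones.
interface Embedder {
  readonly dimensions: number;
  embedTextChunk(chunk: string): Promise<number[]>;
  embedTextChunks(chunks: string[]): Promise<number[][]>;
}

// Toy stand-in for a real provider so the sketch runs offline.
// It folds character codes into a fixed-length vector; it is NOT a real embedding.
class FakeEmbedder implements Embedder {
  readonly dimensions = 4;

  embedSync(chunk: string): number[] {
    const vector = new Array<number>(this.dimensions).fill(0);
    for (let i = 0; i < chunk.length; i++) {
      const slot = i % this.dimensions;
      vector[slot] = (vector[slot] ?? 0) + chunk.charCodeAt(i);
    }
    return vector;
  }

  async embedTextChunk(chunk: string): Promise<number[]> {
    return this.embedSync(chunk);
  }

  async embedTextChunks(chunks: string[]): Promise<number[][]> {
    return chunks.map((chunk) => this.embedSync(chunk));
  }
}

// Hypothetical registry: each vector DB job would ask for an embedder by name
// instead of calling the OpenAI helper directly.
const embedders: Record<string, () => Embedder> = {
  fake: () => new FakeEmbedder(),
};

function getEmbedder(name: string): Embedder {
  const factory = embedders[name];
  if (!factory) throw new Error(`Unknown embedder: ${name}`);
  return factory();
}
```

With something like this in place, adding a Hugging Face or other provider would mean registering one more factory, and each job could compare `dimensions` against the target collection before writing.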

andsty commented 10 months ago

Can we have support for Hugging Face embeddings as well, or is that not possible?

timothycarambat commented 10 months ago

You can, but it is supported via LocalAI and not via the HuggingFace API directly. Is that what you are using for embedding currently?
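
For readers unfamiliar with that route: LocalAI exposes an OpenAI-compatible REST API, so an embedding call is shaped like an OpenAI one but pointed at a local server. The sketch below only builds such a request. The base URL, the `buildEmbeddingRequest` helper, and the model name in the usage comment are assumptions for illustration, and this is not how vector-admin itself wires it up.

```typescript
// Describes an OpenAI-style embeddings request aimed at a LocalAI server.
// Assumptions: the server listens on http://localhost:8080 and serves the
// OpenAI-compatible POST /v1/embeddings route.
interface EmbeddingRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildEmbeddingRequest(
  input: string[],
  model: string,
  baseUrl: string = "http://localhost:8080"
): EmbeddingRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/embeddings`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, input }),
    },
  };
}

// Usage (needs a running server, so it is left commented out; the model
// name is a placeholder for whatever model the server has loaded):
// const { url, init } = buildEmbeddingRequest(["hello world"], "my-hf-model");
// const res = await fetch(url, init);
// const json = await res.json();
// const vectors = json.data.map((d: { embedding: number[] }) => d.embedding);
```

The vector length returned depends on the model loaded into the server, so it still has to match the dimension of the target collection.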

hasani114 commented 2 months ago

Any plans to add this feature? Since this post, OpenAI has also released another embedding model. There are also more specialized embedding models being developed by companies like Voyage AI, which we'd like to be able to use.

Thanks!