Hi @jamescalam! Your suggestion works.
There is actually minimal support for changing the embedding model, but still only to another SentenceTransformer model: https://github.com/NVIDIA/NeMo-Guardrails/blob/main/nemoguardrails/actions/llm/generation.py#L73
See https://github.com/NVIDIA/NeMo-Guardrails/discussions/97 as well.
```yaml
models:
  ...
  - type: embedding
    engine: SentenceTransformer
    model: all-MiniLM-L6-v2
```
We have work in progress on a private branch to enable the integration of any embedding and search provider. A few things are changing, including the interface for `EmbeddingsIndex`, which needs to be async. I reckon we should be able to push this to the repo next week. If you can, you could help by adding support for OpenAI as an `EmbeddingSearchProvider`.
Thanks a lot!
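For illustration, here is a rough sketch of what an async, OpenAI-backed provider could look like once that interface lands. The class name, the `IndexItem` shape, and the method signatures (`add_items`, `search`) are assumptions based on this discussion, not the final API; it also assumes the `openai>=1.0` Python client.

```python
# Sketch only: interface names are assumed from this discussion, not the final API.
from dataclasses import dataclass, field
from typing import Dict, List

import numpy as np
from openai import AsyncOpenAI  # assumes the openai>=1.0 client


@dataclass
class IndexItem:  # assumed shape of the items returned by search()
    text: str
    meta: Dict = field(default_factory=dict)


class OpenAIEmbeddingsIndex:
    """Hypothetical async embedding search provider backed by OpenAI embeddings."""

    def __init__(self, model: str = "text-embedding-ada-002"):
        self.model = model
        self.client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
        self.items: List[IndexItem] = []
        self.embeddings: List[List[float]] = []

    async def _embed(self, texts: List[str]) -> List[List[float]]:
        response = await self.client.embeddings.create(model=self.model, input=texts)
        return [d.embedding for d in response.data]

    async def add_items(self, items: List[IndexItem]):
        self.items.extend(items)
        self.embeddings.extend(await self._embed([item.text for item in items]))

    async def search(self, text: str, max_results: int = 5) -> List[IndexItem]:
        query = (await self._embed([text]))[0]
        # ada-002 embeddings are unit-normalized, so the dot product equals cosine similarity.
        scores = np.array(self.embeddings) @ np.array(query)
        top = np.argsort(scores)[::-1][:max_results]
        return [self.items[i] for i in top]
```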
@drazvan, created a PR here: #101
Hi there
I think this approach will work in a lot of cases, but if you have a more complicated enterprise environment it may not be the best solution.
We can already pass a LangChain LLM as the main model to LLMRails. Why not add a second parameter for passing a LangChain embedding model? This would open up a lot of new options without having to implement each provider individually. OpenAI support is already good, but in an enterprise setup it will most likely be Azure OpenAI anyway... (a rough sketch of what this could look like follows below)
Kind regards, Dominik
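To make the suggestion concrete, a small sketch under stated assumptions: passing a LangChain LLM as the main model already works with `LLMRails`, while the embedding-model argument shown in the comments is purely the hypothetical extension proposed here, not an existing parameter. The Azure deployment names are placeholders.

```python
# Only the `llm=` argument exists today; the embedding-model argument is the
# hypothetical extension proposed in this comment.
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")

# Works today: use a LangChain (Azure) chat model as the main LLM.
# Assumes AZURE_OPENAI_ENDPOINT / AZURE_OPENAI_API_KEY are set in the environment.
llm = AzureChatOpenAI(azure_deployment="my-gpt-4-deployment", api_version="2024-02-01")
rails = LLMRails(config=config, llm=llm)

# Proposed (hypothetical) extension: also accept a LangChain embedding model, e.g.
# embeddings = AzureOpenAIEmbeddings(azure_deployment="my-ada-002-deployment")
# rails = LLMRails(config=config, llm=llm, embeddings=embeddings)
```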
Issue was solved with #101
@jamescalam thanks for the contribution. Can you please elaborate, or provide the configuration or code snippet, showing how this fix supports Azure OpenAI for embeddings?
Cheers!
So has support for custom embedding models been added? I have been searching the repo for an example but couldn't find anything.
It will be added by #548 and be part of the 0.10.0 release.
Let me rephrase my question. Let's say I have my own Qdrant DB that already holds embeddings and their corresponding chunks, where the vectors were generated with e5-mistral-7b-instruct (4096 dimensions). How can I use this with Guardrails? Is it possible to configure a custom embedding search provider so that, when the async search function is called, it calls my embedding model and uses the returned embedding to search my Qdrant DB and return a list of IndexItem?
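Not an official answer, but conceptually something like the sketch below should be possible once custom embedding search providers are supported: wrap the existing Qdrant collection in a class that implements the async `EmbeddingsIndex`-style interface mentioned earlier in this thread. All names here (the class, the `embed_query` helper for e5-mistral-7b-instruct, the collection name, the payload keys) are assumptions for illustration; the exact interface and registration mechanism will depend on what #548 / 0.10.0 ships.

```python
# Illustrative sketch only; interface names follow this thread and may differ
# from the shipped API.
from dataclasses import dataclass, field
from typing import Dict, List

from qdrant_client import AsyncQdrantClient


@dataclass
class IndexItem:  # assumed shape of the items guardrails expects back from search()
    text: str
    meta: Dict = field(default_factory=dict)


async def embed_query(text: str) -> List[float]:
    """Hypothetical helper: call e5-mistral-7b-instruct (locally or via an API)
    and return its 4096-dimensional embedding for `text`."""
    raise NotImplementedError


class QdrantEmbeddingsIndex:
    """Hypothetical embedding search provider backed by an existing Qdrant collection."""

    def __init__(self, url: str = "http://localhost:6333", collection: str = "my_chunks"):
        self.client = AsyncQdrantClient(url=url)
        self.collection = collection

    async def add_items(self, items: List[IndexItem]):
        # The collection is already populated, so there is nothing to index here.
        pass

    async def search(self, text: str, max_results: int = 5) -> List[IndexItem]:
        query_vector = await embed_query(text)
        hits = await self.client.search(
            collection_name=self.collection,
            query_vector=query_vector,
            limit=max_results,
        )
        # Assumes each point's payload stores the chunk text under a "text" key.
        return [IndexItem(text=hit.payload["text"], meta=hit.payload) for hit in hits]
```

How such a class would get plugged in (via the config or a registration call on LLMRails) would depend on the final interface from #548.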
Hi, I'd like to propose a fix to allow us to set the embedding model used to create the vector representations of rails. My main reason is that I'd like to be able to use a service like OpenAI's ada-002, primarily as an easy fix for large container sizes when deploying anything containing guardrails code, and additionally to allow us to select different Hugging Face models if preferred.
In an initial version I'd propose supporting OpenAI and Hugging Face models, but I'm very open to suggestions, and I'd also be happy to work on this. Would there be any preference on how to set an embedding model? I figure we could do something in the config.yaml like:
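(A hypothetical example; the engine and model keys below are just a suggestion, not an existing option:)

```yaml
models:
  ...
  - type: embedding
    engine: openai            # hypothetical engine name
    model: text-embedding-ada-002
```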
Or with a Hugging Face model:
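(Again hypothetical, just to illustrate the shape of the config:)

```yaml
models:
  ...
  - type: embedding
    engine: huggingface       # hypothetical engine name
    model: sentence-transformers/all-MiniLM-L6-v2
```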