zilto opened this issue 3 weeks ago
@Pipboyguy .... Just adding some colour here pertaining to this Slack thread. I had been in touch with @zilto ... since it originally started with some questions around his blog post.
I wanted to swap out his OpenAI usage for a self-hosted Ollama embedding model and have the embeddings calculated locally.

I tried the following settings:
```python
# using Ollama
os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL_PROVIDER"] = "ollama"
os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL"] = "nomic-embed-text"
```
This errored with:

```text
PipelineStepFailed: Pipeline execution failed at stage sync with exception:

<class 'dlt.common.configuration.exceptions.ConfigValueCannotBeCoercedException'>
Configured value for field embedding_model_provider cannot be coerced into type typing.Literal['gemini-text', 'bedrock-text', 'cohere', 'gte-text', 'imagebind', 'instructor', 'open-clip', 'openai', 'sentence-transformers', 'huggingface', 'colbert']
```
Is that somehow due to `ollama` missing from this list: https://github.com/dlt-hub/dlt/blob/devel/dlt/destinations/impl/lancedb/configuration.py#L50-L62 ?
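For context, that error is exactly what you'd expect when a value falls outside a `typing.Literal` of allowed providers. A minimal sketch of the mechanism (simplified and hypothetical; dlt's real validation lives in its configuration resolution code, not this function):

```python
from typing import Literal, get_args

# The provider Literal as it appears in the error message above.
TEmbeddingProvider = Literal[
    "gemini-text", "bedrock-text", "cohere", "gte-text", "imagebind",
    "instructor", "open-clip", "openai", "sentence-transformers",
    "huggingface", "colbert",
]

def coerce_provider(value: str) -> str:
    """Reject any provider name not listed in the Literal (sketch only)."""
    if value not in get_args(TEmbeddingProvider):
        raise ValueError(
            f"Configured value {value!r} cannot be coerced into {TEmbeddingProvider}"
        )
    return value

coerce_provider("openai")    # accepted: "openai" is in the Literal
# coerce_provider("ollama")  # raises ValueError: "ollama" is not in the Literal
```

So yes: as long as `"ollama"` is absent from that Literal, the config value can never pass coercion, regardless of whether the underlying LanceDB registry supports it.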
Also, since I'm running my Ollama instance at `192.168.192.3:11434`, it would be great to be able to pass through an Ollama host rather than always defaulting to `localhost:11434`.

Perhaps that needs handling via something like:

```python
os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL_PROVIDER_HOST"] = "<some-url>:11434"
```
@Pipboyguy I saw you were having some more discussions on Slack about this - did you already start working on it? We had assigned Rahul to look into it, but if you're already on it I'll unassign him.
Hi @akelad . I haven't started on this. Rahul can have a go!
@zilto @Analect Turns out the `ollama` provider has been added on the devel branch; I tested it and it seems to be working just fine. Please try again with:
```shell
pip install "dlt[lancedb] @ git+https://github.com/dlt-hub/dlt.git@devel"
```
Once this PR is merged, you should also be able to specify your Ollama host with:

```python
os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL_PROVIDER_HOST"] = "http://192.168.192.3:11434"
```

Don't forget the protocol (`http://`).
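Putting the thread's answer together, a minimal sketch of the full Ollama configuration (assuming the devel build with the `lancedb` extra is installed and an Ollama server is reachable; the host value below is this user's LAN address and purely illustrative):

```python
import os

# Provider and model discussed in this thread.
os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL_PROVIDER"] = "ollama"
os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL"] = "nomic-embed-text"

# Host override (requires the linked PR); include the protocol, not just host:port.
os.environ["DESTINATION__LANCEDB__EMBEDDING_MODEL_PROVIDER_HOST"] = "http://192.168.192.3:11434"

# From here, build and run the dlt pipeline against the "lancedb"
# destination exactly as in the original blog post, e.g.
# dlt.pipeline(..., destination="lancedb").
```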
Documentation description
User contacted me with the following question:
As far as I know, the LanceDB destination leverages the LanceDB registry and should support Ollama. The docs could more explicitly mention Ollama support (or not) and show how to set it up. In particular, what's the config key to set the Ollama server URL.
Are you a dlt user?
Yes, I'm already a dlt user.