dillfrescott opened 7 months ago
There are proxy servers which will turn any API into an OpenAI-compatible one, and I get an error that nomic-embed-text is not found (https://github.com/nilsherzig/LLocalSearch/issues/64) when subbing in such an OpenAI-compatible API.
That was exactly my question: what is the suggested way of using a custom model / agent?
The current WebUI gets its model list (the model switcher in the top left) from the /models endpoint of the API. If you want to run a different model than the default one, you just have to load the model onto ollama @d0rc.
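For illustration, a minimal sketch of querying that endpoint from Go; the /models path is taken from the comment above, but the port and the response shape are assumptions on my part:

```go
// Hypothetical sketch: fetch the model list the WebUI uses.
// The backend address and the JSON shape are assumptions, not
// the project's confirmed API.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

type modelList struct {
	Models []string `json:"models"`
}

func main() {
	resp, err := http.Get("http://localhost:8080/models") // assumed backend address
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var list modelList
	if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
		log.Fatal(err)
	}
	fmt.Println(list.Models)
}
```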
@BarfingLemurs the embeddings model (used to create embeddings from the website texts) is currently hard-coded (I'm open to making this a configuration option). Could you try loading it onto one of the tools you've mentioned and report back whether that just works?
You can now configure the embeddings model name using env vars.
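As a rough sketch of how that kind of override usually works (the variable name EMBEDDINGS_MODEL_NAME here is an assumption; check the repo's README for the real one):

```go
// Sketch of env-var based model selection. EMBEDDINGS_MODEL_NAME is a
// hypothetical name for illustration, not the project's confirmed var.
package main

import (
	"fmt"
	"os"
)

func embeddingsModel() string {
	if name := os.Getenv("EMBEDDINGS_MODEL_NAME"); name != "" {
		return name
	}
	return "nomic-embed-text" // the previously hard-coded default
}

func main() {
	fmt.Println("using embeddings model:", embeddingsModel())
}
```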
I don't know if other backends come with any such additions, honestly. Does ollama come with the vector database engine this project is looking for? Could I install it separately? I have been using the exllama backend, which has fast prompt processing speeds.
Log: https://pastebin.com/uJgJRXz7 (TabbyAPI https://github.com/theroyallab/tabbyAPI)
(Exllamav2 https://github.com/turboderp/exllamav2)
127.0.0.1, from the view of the backend, is the backend's loopback address, not your host's localhost. Your logs don't indicate that exllama wouldn't work, but that the backend has no connection to the API.
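For anyone debugging this, a small sketch you could run inside the backend container to see which address actually reaches your API (port 5000 is an assumption; host.docker.internal works on Docker Desktop, on Linux you may need your host's LAN IP instead):

```go
// Connectivity check: 127.0.0.1 inside a container is the container
// itself, not the Docker host. Addresses and port are assumptions.
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	for _, addr := range []string{
		"127.0.0.1:5000",            // the container itself, NOT your host
		"host.docker.internal:5000", // the Docker host (Docker Desktop)
	} {
		conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
		if err != nil {
			fmt.Println(addr, "unreachable:", err)
			continue
		}
		conn.Close()
		fmt.Println(addr, "reachable")
	}
}
```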
Ollama does not provide the vector DB. It's a wrapper around llama.cpp with something like a package manager for preconfigured LLMs.
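In other words, ollama only returns raw embeddings; storing and searching them is the app's job. A rough sketch of that split, assuming ollama's /api/embeddings endpoint on its default port (verify the endpoint and payload against the ollama docs for your version):

```go
// Sketch: ollama hands back an embedding vector; the "vector DB" part
// (storage and similarity search) has to live in the application.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"math"
	"net/http"
)

func embed(text string) []float64 {
	body, _ := json.Marshal(map[string]string{
		"model":  "nomic-embed-text",
		"prompt": text,
	})
	resp, err := http.Post("http://localhost:11434/api/embeddings",
		"application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var out struct {
		Embedding []float64 `json:"embedding"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	return out.Embedding
}

// cosine similarity: the search step the app must implement itself
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	q := embed("search query")
	doc := embed("some website text")
	fmt.Printf("similarity: %.3f\n", cosine(q, doc))
}
```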
Looks like exllamav2 is Nvidia only? I only have an AMD card :/.
It seems ollama supports nomic-embed-text in some way, so I should deploy that, but I can use the main model from text-gen-webui with exllama.
text-gen-webui has many backends, including llama.cpp, so I think your card and CPU will be fine for testing. I made a mistake earlier using TabbyAPI, which requires you to enter API keys; text-gen-webui does not. However, I still haven't figured out how to run LLocalSearch yet.
This would make it possible to use apps other than ollama, as there are tons of backend apps and servers that are OpenAI-API compatible.