Their API seems to be similar to the OpenAI one. I think they use vLLM as the inference/serving engine, which has an OpenAI-compatible endpoint (or at least the option to enable one, in addition to its own non-compatible API?). So I tried with the following mods settings:
```yaml
apis:
  openai:
    # ...
  localai:
    # ...
  mistral:
    base-url: https://api.mistral.ai/v1
    api-key-env: MISTRAL_API_KEY
    models:
      mistral-medium:
        aliases: ["medium"]
        fallback: mistral-small
      mistral-small:
        aliases: ["small"]
        fallback: mistral-tiny
      mistral-tiny:
        aliases: ["tiny"]
  # ...
```
And then I ran `mods -a mistral -m tiny '...'`, but it printed an OpenAI response to an empty chat message and ran into an error when saving the conversation ("There was a problem writing 123abc... to the cache. Use --no-cache / NO_CACHE to disable it.").
It's unclear though what exactly the error is. Did it not find the configured API/model in the settings? Was there an error response from the Mistral API, after which it fell back to OpenAI? It doesn't print any details, and mods has no `--verbose` or similar flag to show them.
I did this, and Ollama (which can run Mistral) worked fine for me for some local scripting:
```yaml
ollama:
  # LocalAI setup instructions: https://github.com/go-skynet/LocalAI#example-use-gpt4all-j-model
  base-url: http://localhost:11434
  models:
    llama2-13b:
      aliases: ['llama2-13b', 'ollama']
      max-input-chars: 12250
```
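With a section like that in place, an invocation along the lines of `mods -a ollama -m llama2-13b '...'` (mirroring the `-a mistral -m tiny` call above) should go against the local Ollama endpoint.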
Are there any plans to support the Mistral API?
You can access Mistral through Groq:
```yaml
groq:
  base-url: https://api.groq.com/openai/v1
  api-key-env: GROQ_API_KEY
  models:
    mixtral-8x7b-32768:
      aliases: ["mixtral", "8x7b"]
      max-input-chars: 98000
    llama2-70b-4096:
      aliases: ["llama2"]
      max-input-chars: 12250
```
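And then e.g. `mods -a groq -m mixtral '...'`. As I understand the config format, the key under `models:` (here `mixtral-8x7b-32768`) has to match the model ID the provider expects, while the aliases are just local shorthand for `-m`.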
Using it through Groq works, yes; Ollama should work as well.
Pretty much any model with an OpenAI-compatible API should work ✌🏻
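To illustrate, the general shape of such an API section would be something like the sketch below; the provider name, URL, env var, and character limit are placeholders, not an official example:

```yaml
some-provider:                        # any name; selected with `mods -a some-provider`
  base-url: https://example.com/v1    # the provider's OpenAI-compatible endpoint
  api-key-env: SOME_PROVIDER_API_KEY  # env var that holds the API key, if one is needed
  models:
    provider-model-id:                # the model name the provider expects
      aliases: ["shorthand"]          # local shorthand usable with -m
      max-input-chars: 12250          # per-model input limit
```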
> Using it through Groq works, yes; Ollama should work as well.
Mistral has some open models, but also some proprietary ones that are only accessible via their API, not through Groq or Ollama. I created this issue to access those models via mods.
> Pretty much any model with an OpenAI-compatible API should work ✌🏻
Their API is OpenAI-compatible; that's why I tried with just this config, as mentioned in a previous comment:
```yaml
mistral:
  base-url: https://api.mistral.ai/v1
  api-key-env: MISTRAL_API_KEY
  models:
    mistral-medium:
      aliases: ["medium"]
      fallback: mistral-small
    # ...
```
That doesn't work. But I just found out why: even with a `max-input-chars` default value at the root level of the config, mods seems to require each model to also have its own `max-input-chars`, and that was missing in this Mistral config section.
It's hard to tell because there's no error message indicating this. Instead, mods seems to send an empty query to the provider (judging by the garbage response, which is unrelated to the query), and after printing the response it reports an error that it can't store the conversation in the cache.
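For reference, this is roughly what the section looks like with per-model `max-input-chars` added. The character limits below are guesses on my part, derived from the ~3-characters-per-token ratio used in the Groq entries above; adjust them to the models' actual context sizes:

```yaml
mistral:
  base-url: https://api.mistral.ai/v1
  api-key-env: MISTRAL_API_KEY
  models:
    mistral-medium:
      aliases: ["medium"]
      fallback: mistral-small
      max-input-chars: 98000  # assuming a ~32k-token context
    mistral-small:
      aliases: ["small"]
      fallback: mistral-tiny
      max-input-chars: 98000  # assuming a ~32k-token context
    mistral-tiny:
      aliases: ["tiny"]
      max-input-chars: 24500  # assuming an ~8k-token context
```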
=> I'll create one PR to add Mistral to the config template, as the resolution of this issue. And I'll create another issue for the unexpected `max-input-chars` behavior.
fixed in #305
Mistral (known for their 7B model and more recently their Mixture of Experts model) have recently started offering an API: https://docs.mistral.ai/api/
It would be great if it could be used with mods.
This is similar to the feature request for supporting Ollama (https://github.com/charmbracelet/mods/issues/162) and supporting llamafile (https://github.com/charmbracelet/mods/issues/168).
Maybe one solution could be to use a library that already offers an abstraction over the different LLM APIs? There's the Go port of LangChain, for example, which already supports various LLMs as backends, but it doesn't support the Mistral API yet.