charmbracelet / mods

AI on the command line

Support Mistral API #174

Closed · philippgille closed this issue 2 months ago

philippgille commented 9 months ago

Mistral (known for their 7B model and, more recently, their Mixture of Experts model) has started offering an API: https://docs.mistral.ai/api/

It would be great if it could be used with mods.

This is similar to the feature request for supporting Ollama (https://github.com/charmbracelet/mods/issues/162) and supporting llamafile (https://github.com/charmbracelet/mods/issues/168).

Maybe one solution could be to use a library that already offers an abstraction over all the different LLM APIs? There's the Go port of LangChain, for example, which already supports various LLMs as backends, but it doesn't support the Mistral API yet.

philippgille commented 9 months ago

Their API seems to be similar to the OpenAI one. I think they use vLLM as the inference/serving engine, which has an OpenAI-compatible endpoint (or at least an option to enable one, in addition to a non-compatible one?). So I tried with the following mods settings:

apis:
  openai:
    # ...
  localai:
    # ...
  mistral:
    base-url: https://api.mistral.ai/v1
    api-key-env: MISTRAL_API_KEY
    models:
      mistral-medium:
        aliases: ["medium"]
        fallback: mistral-small
      mistral-small:
        aliases: ["small"]
        fallback: mistral-tiny
      mistral-tiny:
        aliases: ["tiny"]
# ...

Then I ran mods -a mistral -m tiny '...', but it printed an OpenAI response for an empty chat message and ran into an error when saving the conversation ("There was a problem writing 123abc... to the cache. Use --no-cache / NO_CACHE to disable it.").

It's unclear, though, what exactly the error is. Did it not find the configured API/model in the settings? Was there an error response from the Mistral API, and did it then fall back to OpenAI? It doesn't print any details, and mods has no --verbose or similar flag to make it do so.
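To narrow down whether the problem is in mods or in the API itself, the endpoint can also be queried directly. This is a minimal sketch, assuming the standard OpenAI-compatible /chat/completions path and the MISTRAL_API_KEY variable from the config above:

  curl -s https://api.mistral.ai/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $MISTRAL_API_KEY" \
    -d '{
      "model": "mistral-tiny",
      "messages": [{"role": "user", "content": "Say hello"}]
    }'

If that returns a normal chat completion, the API side is fine and the issue is in how mods builds or sends the request.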

sheldonhull commented 8 months ago

I did this, and Ollama (which can run Mistral) worked fine for me in some local scripting.

  ollama:
    # LocalAI setup instructions: https://github.com/go-skynet/LocalAI#example-use-gpt4all-j-model
    base-url: http://localhost:11434
    models:
      llama2-13b:
        aliases: ['llama2-13b', 'ollama']
        max-input-chars: 12250
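
As a rough sketch of what pointing this at a Mistral model could look like: pull the model in Ollama, then add an entry under the ollama API section. The model key, alias, and max-input-chars value here are illustrative guesses, not tested settings:

  ollama pull mistral

      mistral:
        aliases: ['mistral-local']
        max-input-chars: 12250

After that, something like mods -a ollama -m mistral-local 'Say hello' should go to the local Mistral model.
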
chiefMarlin commented 6 months ago

Are there any plans to support the Mistral API?

cloudbridgeuy commented 5 months ago

You can access Mistral through Groq:

  groq:
    base-url: https://api.groq.com/openai/v1
    api-key-env: GROQ_API_KEY
    models:
      mixtral-8x7b-32768:
        aliases: ["mixtral", 8x7b]
        max-input-chars: 98000
      llama2-70b-4096:
        aliases: ["llama2"]
        max-input-chars: 12250
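
With an entry like that, a call through Groq should look roughly like the Mistral attempt above, e.g. (assuming GROQ_API_KEY is exported in the shell):

  export GROQ_API_KEY=...
  mods -a groq -m mixtral 'Explain what a Mixture of Experts model is'
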
caarlos0 commented 3 months ago

Using it through Groq works, yes; Ollama should work as well.

Pretty much any model with an OpenAI-compatible API should work ✌🏻

philippgille commented 2 months ago

> Using it through Groq works, yes; Ollama should work as well.

Mistral has some open models, but also some proprietary ones which are only accessible via their API, not through Groq or Ollama. I created this issue to access those models via mods.

> Pretty much any model with an OpenAI-compatible API should work ✌🏻

Their API is OpenAI-compatible; that's why I tried with just this config, as mentioned in a previous comment:

  mistral:
    base-url: https://api.mistral.ai/v1
    api-key-env: MISTRAL_API_KEY
    models:
      mistral-medium:
        aliases: ["medium"]
        fallback: mistral-small
      ...

That doesn't work.

But I just found out why it doesn't work: even with a max-input-chars default value at the root level of the config, mods seems to require each model to have its own max-input-chars, and that was missing in this Mistral config section.

It's hard to tell because there's no error message indicating this. Instead, mods seems to send an empty query to the provider (based on the garbage response unrelated to the query), and after printing the response it reports an error that it can't store the conversation in the cache.
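
For completeness, here's a sketch of the earlier Mistral config with the missing per-model max-input-chars added; the values are illustrative placeholders, not the models' actual context limits:

  mistral:
    base-url: https://api.mistral.ai/v1
    api-key-env: MISTRAL_API_KEY
    models:
      mistral-medium:
        aliases: ["medium"]
        fallback: mistral-small
        max-input-chars: 98000
      mistral-small:
        aliases: ["small"]
        fallback: mistral-tiny
        max-input-chars: 98000
      mistral-tiny:
        aliases: ["tiny"]
        max-input-chars: 98000

With per-model max-input-chars in place, the empty-query behavior described above should go away.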

=> I'll create one PR to add Mistral to the config template, as resolution to this issue. And I'll create another issue for the unexpected max-input-chars behavior.

caarlos0 commented 2 months ago

fixed in #305