charmbracelet / mods

AI on the command line

LocalAI not working - OPENAI_API_KEY required error #316

Open · templehasfallen opened this issue 3 weeks ago

templehasfallen commented 3 weeks ago

Describe the bug: Using the latest version, I cannot use my LocalAI endpoints at all. I first carried over a configuration from an older version, then completely reset the settings and added only my localai endpoint (both with and without the configuration for other APIs); whatever I do, I keep getting:

ERROR   OPENAI_API_KEY  required; set the environment variable  OPENAI_API_KEY  or update  mods.yaml  through  mods --settings .

I have tried several things; the behavior is the same regardless of command. If I go with mods -M, I can select my model and type a prompt, but I am then presented with the same error (see attached GIF).
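
For reference, a minimal shell sketch of the invocations that hit this (flag names as documented in mods --help; 'solar' is the alias from the config below, and 'say hello' is just a placeholder prompt):

    mods -m solar 'say hello'   # non-interactive: alias resolves to the LocalAI model
    mods -M                     # interactive: pick the model, type a prompt, same error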


Config file:

# Default model (gpt-3.5-turbo, gpt-4, ggml-gpt4all-j...).
default-model: solar
# Text to append when using the -f flag.
format-text:
  markdown: 'Format the response as markdown without enclosing backticks.'
  json: 'Format the response as json without enclosing backticks.'
# List of predefined system messages that can be used as roles.
roles:
  "default": []
  # Example, a role called `shell`:
  # shell:
  #   - you are a shell expert
  #   - you do not explain anything
  #   - you simply output one liners to solve the problems you're asked
  #   - you do not provide any explanation whatsoever, ONLY the command
# Ask for the response to be formatted as markdown unless otherwise set.
format: false
# System role to use.
role: "default"
# Render output as raw text when connected to a TTY.
raw: false
# Quiet mode (hide the spinner while loading and stderr messages for success).
quiet: false
# Temperature (randomness) of results, from 0.0 to 2.0.
temp: 1.0
# TopP, an alternative to temperature that narrows response, from 0.0 to 1.0.
topp: 1.0
# Turn off the client-side limit on the size of the input into the model.
no-limit: false
# Wrap formatted output at specific width (default is 80)
word-wrap: 80
# Include the prompt from the arguments in the response.
include-prompt-args: false
# Include the prompt from the arguments and stdin, truncate stdin to specified number of lines.
include-prompt: 0
# Maximum number of times to retry API calls.
max-retries: 5
# Your desired level of fanciness.
fanciness: 10
# Text to show while generating.
status-text: Generating
# Theme to use in the forms. Valid units are: 'charm', 'catppuccin', 'dracula', and 'base16'
theme: charm
# Default character limit on input to model.
max-input-chars: 12250
# Maximum number of tokens in response.
# max-tokens: 100
# Aliases and endpoints for OpenAI compatible REST API.
apis:
  localai:
    # LocalAI setup instructions: https://github.com/go-skynet/LocalAI#example-use-gpt4all-j-model
    base-url: http://10.13.37.25:8080
    models:
      solar-10.7b-instruct-v1.0.Q5_K_M.gguf:
        aliases: ["solar"]
        max-input-chars: 12250
        fallback:
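
For context, a hedged example of how this config would normally be invoked; -a/--api and -m/--model are the documented mods flags, and 'test prompt' is just a placeholder:

    # select the localai entry explicitly; 'solar' expands to
    # solar-10.7b-instruct-v1.0.Q5_K_M.gguf via the aliases list above
    mods --api localai --model solar 'test prompt'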

Alternatively, with everything above unchanged except for the apis section:

apis:
  openai:
    base-url: https://api.openai.com/v1
    api-key:
    api-key-env: API_KEY_IS_HERE_REDACTED
    models: # https://platform.openai.com/docs/models
      gpt-4o-mini:
        aliases: ["4o-mini"]
        max-input-chars: 392000
        fallback: gpt-4o
      gpt-4o:
        aliases: ["4o"]
        max-input-chars: 392000
        fallback: gpt-4
      gpt-4:
        aliases: ["4"]
        max-input-chars: 24500
        fallback: gpt-3.5-turbo
      gpt-4-1106-preview:
        aliases: ["128k"]
        max-input-chars: 392000
        fallback: gpt-4
      gpt-4-32k:
        aliases: ["32k"]
        max-input-chars: 98000
        fallback: gpt-4
      gpt-3.5-turbo:
        aliases: ["35t"]
        max-input-chars: 12250
        fallback: gpt-3.5
      gpt-3.5-turbo-1106:
        aliases: ["35t-1106"]
        max-input-chars: 12250
        fallback: gpt-3.5-turbo
      gpt-3.5-turbo-16k:
        aliases: ["35t16k"]
        max-input-chars: 44500
        fallback: gpt-3.5
      gpt-3.5:
        aliases: ["35"]
        max-input-chars: 12250
        fallback:
  localai:
    # LocalAI setup instructions: https://github.com/go-skynet/LocalAI#example-use-gpt4all-j-model
    base-url: http://10.13.37.25:8080
    models:
      solar-10.7b-instruct-v1.0.Q5_K_M.gguf:
        aliases: ["solar"]
        max-input-chars: 12250
        fallback:

Expected behavior: OPENAI_API_KEY should be ignored, whether set or not, when using LocalAI as the API.

Screenshots: behavior as per the second config file (attached GIF: alacritty_0GLWIEQn4F).

Additional context: I'm not sure if I am missing something, but even after generating a fresh config and looking at the code, I see two issues:

  1. There is no case for localai, so it defaults to openai (see the linked code block).
  2. Even when the OpenAI key is set, it is not detected.
sedlund commented 3 weeks ago

maybe this was a typo:

api-key-env: API_KEY_IS_HERE_REDACTED

but api-key-env should name the environment variable used to look up the OpenAI API key when api-key is not set. The default is:

    api-key:
    api-key-env: OPENAI_API_KEY
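
In other words, api-key-env names a variable to look up; it does not hold the key itself. A minimal sketch, assuming the default above:

    # mods reads the variable *named* by api-key-env:
    export OPENAI_API_KEY='sk-...'   # placeholder; the real key goes here
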
templehasfallen commented 3 weeks ago

maybe this was a typo:

api-key-env: API_KEY_IS_HERE_REDACTED

but api-key-env should name the environment variable used to look up the OpenAI API key when api-key is not set. The default is:

    api-key:
    api-key-env: OPENAI_API_KEY

You are right. I've now set it to OPENAI_API_KEY and set the env variable OPENAI_API_KEY to my OpenAI API key (which I do not wish to use), but the result is exactly the same:

(attached GIF: alacritty_FYS7jSWrCG)

apis:
  openai:
    base-url: https://api.openai.com/v1
    api-key:
    api-key-env: OPENAI_API_KEY
    models: # https://platform.openai.com/docs/models
mrex commented 1 day ago

I can somewhat confirm this behaviour. Even after removing all API configurations other than the one for LocalAI, I get asked for an OpenAI API key (which I do not have). I guess that's because OpenAI is the default API configuration (in mods.go, in the switch mod.API block, it calls ensureKey to make sure the environment variable is set).

I think the environment variable does not need to hold your actual OpenAI API key; any value should do to pass this check. I got past this point by setting OPENAI_API_KEY to a random value, and it looks like mods is making a request to the configured API endpoint now. There's no answer displayed after the "Generating" message finishes, but I am not sure if that is a problem with mods or with my setup of LocalAI. The link they give for setup (https://github.com/mudler/LocalAI#example-use-gpt4all-j-model) is dead, so I'm not sure if my LocalAI is set up correctly.
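
For anyone else landing here, the workaround above as a shell sketch (any non-empty placeholder satisfies the check; the API and model names are from the original report's config):

    # a throwaway value just to get past the OPENAI_API_KEY check
    export OPENAI_API_KEY=placeholder
    mods --api localai --model solar 'test prompt'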