codelion / optillm

Optimizing inference proxy for LLMs
Apache License 2.0

Issue using llama-server with 'no_key' API key #61

Closed: s-hironobu closed this issue 1 month ago

s-hironobu commented 1 month ago

Symptoms

I used llama-server with OPENAI_API_KEY='no_key', but it doesn't work: optillm.py sends requests to the OpenAI API server instead of the local llama-server.

2024-10-13 21:03:14,128 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 401 Unauthorized"
2024-10-13 21:03:14,133 - ERROR - Error processing request: Error code: 401 - {'error': {'message': 'Incorrect API key provided: no_key. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
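
For reference, the failing setup looked roughly like this (a hedged sketch: the proxy port, model name, and exact client usage are assumptions, not taken verbatim from the report):

import os
from openai import OpenAI

# Assumed setup: llama-server running locally, with optillm proxying it.
os.environ["OPENAI_API_KEY"] = "no_key"

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url="http://localhost:8000/v1",  # hypothetical optillm proxy address
)
response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # hypothetical model name
    messages=[{"role": "user", "content": "Hello"}],
)

Instead of being forwarded to the local llama-server, the request above ends up at https://api.openai.com/ with the 401 shown in the log.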

Root Cause

The root cause is line 300 in optillm.py: https://github.com/codelion/optillm/blob/193ab3c4d54f5f2e2c47525293bd7827b609675f/optillm.py#L300

Since OPENAI_API_KEY is set to "no_key", the token fails the "sk-" prefix check and the code falls into the else block, which uses default_client. However, default_client doesn't have the base_url set in get_config(), so it accesses https://api.openai.com/ with an invalid key (i.e. "no_key") instead of the local llama-server.
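
For context, the client-selection logic around that line presumably looks something like the following (a paraphrased sketch reconstructed from the behavior above; the OpenAI client construction and variable names other than bearer_token and default_client are assumptions):

from openai import OpenAI

if bearer_token != "" and bearer_token.startswith("sk-"):
    # A real-looking OpenAI key: build a client with the base_url
    # resolved in get_config(), which can point at a local llama-server.
    client = OpenAI(api_key=bearer_token, base_url=base_url)
else:
    # OPENAI_API_KEY='no_key' lacks the "sk-" prefix, so execution lands
    # here. default_client has no base_url set, so requests go to
    # https://api.openai.com/ with the invalid key.
    client = default_client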

Solutions

I think there are three solutions:

(1) Set a dummy API key with the "sk-" prefix: use a key that starts with "sk-" but has no actual functionality, like "sk-no-key". This would require only minimal documentation changes.

(2) Remove the prefix check: Modify optillm.py to remove the check for "sk-":

if bearer_token != "":

(3) Add a specific check for "no_key": Modify optillm.py to allow the "no_key" value explicitly (a quick sanity check of this condition follows the list):

if bearer_token != "" and (bearer_token.startswith("sk-") or bearer_token == "no_key"):
codelion commented 1 month ago

Thanks for checking it out; it should be fixed in #62