BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: Update Key DB Call failed to execute - can only concatenate str (not "int") to str #6641

Open xingyaoww opened 3 weeks ago

xingyaoww commented 3 weeks ago

What happened?

I was running the litellm proxy (ghcr.io/berriai/litellm:main-v1.52.0.dev20) and got the following errors:

Relevant log output

litellm-7c566b99f6-lnxmz litellm Traceback (most recent call last):
litellm-7c566b99f6-lnxmz litellm   File "/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 1015, in _update_team_db
litellm-7c566b99f6-lnxmz litellm     raise e
litellm-7c566b99f6-lnxmz litellm   File "/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 995, in _update_team_db
litellm-7c566b99f6-lnxmz litellm     response_cost
litellm-7c566b99f6-lnxmz litellm TypeError: can only concatenate str (not "float") to str
litellm-7c566b99f6-lnxmz litellm Task exception was never retrieved
litellm-7c566b99f6-lnxmz litellm future: <Task finished name='Task-562891' coro=<update_cache() done, defined at /usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py:1060> exception=TypeError("unsupported operand type(s) for +: 'float' and 'str'")>
litellm-7c566b99f6-lnxmz litellm Traceback (most recent call last):
litellm-7c566b99f6-lnxmz litellm   File "/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 1290, in update_cache
litellm-7c566b99f6-lnxmz litellm     await _update_key_cache(token=token, response_cost=response_cost)
litellm-7c566b99f6-lnxmz litellm   File "/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 1096, in _update_key_cache
litellm-7c566b99f6-lnxmz litellm     new_spend = existing_spend + response_cost
litellm-7c566b99f6-lnxmz litellm                 ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
litellm-7c566b99f6-lnxmz litellm TypeError: unsupported operand type(s) for +: 'float' and 'str'
litellm-7c566b99f6-lnxmz litellm INFO:     10.5.0.3:35606 - "POST /chat/completions HTTP/1.1" 200 OK
litellm-7c566b99f6-klhwt litellm INFO:     10.5.0.3:48558 - "POST /chat/completions HTTP/1.1" 200 OK
litellm-7c566b99f6-lnxmz litellm 16:47:51 - LiteLLM Proxy:ERROR: proxy_server.py:951 - Update Key DB Call failed to execute - can only concatenate str (not "int") to str
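The tracebacks all reduce to the same type mismatch: the spend value read back from the cache/DB is a string while `response_cost` is a float. A minimal standalone reproduction (hypothetical values, not litellm's actual cache code):

```python
# Minimal repro of the TypeError in the logs: the cached spend
# round-trips as a string, while the cost for the current request
# is computed as a float.
existing_spend = "0.00037"   # value as it comes back from the cache
response_cost = 0.00012      # float computed for this request

try:
    new_spend = existing_spend + response_cost  # str + float
except TypeError as e:
    print(e)  # can only concatenate str (not "float") to str
```

Swapping the operand order (`response_cost + existing_spend`) produces the other message seen above, `unsupported operand type(s) for +: 'float' and 'str'`, which is why both variants appear in the same log.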

Twitter / LinkedIn details

No response

ishaan-jaff commented 3 weeks ago

Can you share the curl request you sent?

ishaan-jaff commented 2 weeks ago

following up on this @xingyaoww ^

stevencrake-nscale commented 2 weeks ago

Hi @ishaan-jaff I'm getting this too, so hopefully this helps. I've used the master key as auth.

Version: v1.52.9 using helm chart deployment.

Model list yaml

proxy_config:
  model_list:
    - model_name: nks-dev-llama-3-1-8b-instruct
      model_info:
        input_cost_per_token: 0.00001
        output_cost_per_token: 0.00005
      litellm_params:
        model: openai/meta-llama/Meta-Llama-3.1-8B-Instruct
        api_base: <redacted>
        api_key: <redacted>
curl -X POST http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer 0bXkq8SYHYYJn2s42" \
-d '{
    "model": "nks-dev-llama-3-1-8b-instruct",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        }
    ]
}'
{"id":"chat-b01bec7b382d4a22920eeb8d27d07df2","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! It's nice to meet you. Is there something I can help you with or would you like to chat?","role":"assistant","tool_calls":null,"function_call":null}}],"created":1731941699,"model":"meta-llama/Meta-Llama-3.1-8B-Instruct","object":"chat.completion","system_fingerprint":null,"usage":{"completion_tokens":25,"prompt_tokens":37,"total_tokens":62,"completion_tokens_details":null,"prompt_tokens_details":null},"service_tier":null,"prompt_logprobs":null}
curl -X GET http://localhost:4000/v1/model/info \
-H "Content-Type: application/json" \
-H "Authorization: Bearer 0bXkq8SYHYYJn2s42"
{"data":[{"model_name":"nks-dev-llama-3-1-8b-instruct","litellm_params":{"api_base":"<redacted>","model":"openai/meta-llama/Meta-Llama-3.1-8B-Instruct"},"model_info":{"id":"0103c3829dbed1a82a512b648116289ee2b77d6bd88b356061257fbfb541d954","db_model":false,"input_cost_per_token":"1e-05","output_cost_per_token":"5e-05","key":"openai/meta-llama/Meta-Llama-3.1-8B-Instruct","max_tokens":null,"max_input_tokens":null,"max_output_tokens":null,"cache_creation_input_token_cost":null,"cache_read_input_token_cost":null,"input_cost_per_character":null,"input_cost_per_token_above_128k_tokens":null,"input_cost_per_query":null,"input_cost_per_second":null,"input_cost_per_audio_token":null,"output_cost_per_audio_token":null,"output_cost_per_character":null,"output_cost_per_token_above_128k_tokens":null,"output_cost_per_character_above_128k_tokens":null,"output_cost_per_second":null,"output_vector_size":null,"litellm_provider":"openai","mode":null,"supported_openai_params":["frequency_penalty","logit_bias","logprobs","top_logprobs","max_tokens","max_completion_tokens","modalities","n","presence_penalty","seed","stop","stream","stream_options","temperature","top_p","tools","tool_choice","function_call","functions","max_retries","extra_headers","parallel_tool_calls","response_format"],"supports_system_messages":null,"supports_response_schema":null,"supports_vision":false,"supports_function_calling":false,"supports_assistant_prefill":false,"supports_prompt_caching":false,"supports_audio_input":false,"supports_audio_output":false}}]}%

It looks like the cost per token is being stored in scientific notation, i.e. as a string rather than a float:

"input_cost_per_token":"1e-05","output_cost_per_token":"5e-05"
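One plausible explanation for the stringified values (an assumption, not confirmed in this thread): the Helm templating renders small floats like `0.00001` in exponent notation, and PyYAML's YAML 1.1 resolver only treats exponent notation as a float when the mantissa contains a dot, so `1e-05` loads as a string:

```python
import yaml  # PyYAML

# PyYAML (YAML 1.1 resolver) requires a dot in the mantissa for
# exponent notation, so "1e-05" loads as a str, not a float.
no_dot   = yaml.safe_load("input_cost_per_token: 1e-05")["input_cost_per_token"]
with_dot = yaml.safe_load("input_cost_per_token: 1.0e-05")["input_cost_per_token"]
plain    = yaml.safe_load("input_cost_per_token: 0.00001")["input_cost_per_token"]

print(type(no_dot), type(with_dot), type(plain))
# <class 'str'> <class 'float'> <class 'float'>
```

If that is the cause, writing the costs as `1.0e-05` / `5.0e-05` (or keeping the plain decimal form and quoting nothing) in the rendered chart values might avoid the string.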

relevant-logs.txt
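Regardless of where the string originates, a defensive coercion on the proxy side would avoid the crash. A sketch of the idea (illustrative only, not litellm's actual fix; `add_spend` is a hypothetical helper):

```python
def add_spend(existing_spend, response_cost):
    """Coerce possibly-stringified costs (e.g. "1e-05") to float
    before adding, so a cached string value cannot crash the update."""
    return float(existing_spend) + float(response_cost)

# Works whether either operand comes back as a float or a string:
assert add_spend("1e-05", 5e-05) == add_spend(1e-05, "5e-05")
```

`float()` accepts scientific-notation strings like `"1e-05"` directly, so no extra parsing is needed.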