xingyaoww opened this issue 3 weeks ago
Can you share the curl request you sent?
Following up on this @xingyaoww ^
Hi @ishaan-jaff, I'm getting this too, so hopefully this helps. I've used the master key as auth.
Version: v1.52.9, deployed via the Helm chart.
Model list YAML:

proxy_config:
  model_list:
    - model_name: nks-dev-llama-3-1-8b-instruct
      model_info:
        input_cost_per_token: 0.00001
        output_cost_per_token: 0.00005
      litellm_params:
        model: openai/meta-llama/Meta-Llama-3.1-8B-Instruct
        api_base: <redacted>
        api_key: <redacted>
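As a quick sanity check (a hypothetical snippet, not part of the deployment; assumes PyYAML since the proxy config is YAML), the cost values above load as plain floats, so the config itself shouldn't be supplying strings:

import yaml

# Hypothetical check (not part of the deployment): the cost values from the config
# above load as plain Python floats, not strings.
config = yaml.safe_load("""
model_info:
  input_cost_per_token: 0.00001
  output_cost_per_token: 0.00005
""")["model_info"]

print(type(config["input_cost_per_token"]), config["input_cost_per_token"])    # <class 'float'> 1e-05
print(type(config["output_cost_per_token"]), config["output_cost_per_token"])  # <class 'float'> 5e-05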
curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 0bXkq8SYHYYJn2s42" \
  -d '{
    "model": "nks-dev-llama-3-1-8b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
{"id":"chat-b01bec7b382d4a22920eeb8d27d07df2","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! It's nice to meet you. Is there something I can help you with or would you like to chat?","role":"assistant","tool_calls":null,"function_call":null}}],"created":1731941699,"model":"meta-llama/Meta-Llama-3.1-8B-Instruct","object":"chat.completion","system_fingerprint":null,"usage":{"completion_tokens":25,"prompt_tokens":37,"total_tokens":62,"completion_tokens_details":null,"prompt_tokens_details":null},"service_tier":null,"prompt_logprobs":null}
curl -X GET http://localhost:4000/v1/model/info \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 0bXkq8SYHYYJn2s42"
{"data":[{"model_name":"nks-dev-llama-3-1-8b-instruct","litellm_params":{"api_base":"<redacted>","model":"openai/meta-llama/Meta-Llama-3.1-8B-Instruct"},"model_info":{"id":"0103c3829dbed1a82a512b648116289ee2b77d6bd88b356061257fbfb541d954","db_model":false,"input_cost_per_token":"1e-05","output_cost_per_token":"5e-05","key":"openai/meta-llama/Meta-Llama-3.1-8B-Instruct","max_tokens":null,"max_input_tokens":null,"max_output_tokens":null,"cache_creation_input_token_cost":null,"cache_read_input_token_cost":null,"input_cost_per_character":null,"input_cost_per_token_above_128k_tokens":null,"input_cost_per_query":null,"input_cost_per_second":null,"input_cost_per_audio_token":null,"output_cost_per_audio_token":null,"output_cost_per_character":null,"output_cost_per_token_above_128k_tokens":null,"output_cost_per_character_above_128k_tokens":null,"output_cost_per_second":null,"output_vector_size":null,"litellm_provider":"openai","mode":null,"supported_openai_params":["frequency_penalty","logit_bias","logprobs","top_logprobs","max_tokens","max_completion_tokens","modalities","n","presence_penalty","seed","stop","stream","stream_options","temperature","top_p","tools","tool_choice","function_call","functions","max_retries","extra_headers","parallel_tool_calls","response_format"],"supports_system_messages":null,"supports_response_schema":null,"supports_vision":false,"supports_function_calling":false,"supports_assistant_prefill":false,"supports_prompt_caching":false,"supports_audio_input":false,"supports_audio_output":false}}]}%
It looks like the cost per token is being stored in scientific notation, i.e. as a string rather than a number:
"input_cost_per_token":"1e-05","output_cost_per_token":"5e-05"
What happened?

I was running litellm-proxy ghcr.io/berriai/litellm:main-v1.52.0.dev20 and got the following errors.

Relevant log output
Twitter / LinkedIn details
No response