Open kunalchamoli opened 3 days ago
@kunalchamoli - what tags are you sending in your request? Can I see a sample request you're sending?
@ishaan-jaff My curl request works fine: I can see both the medium-tagged and the default-tagged models receiving requests. My question is why the default tag is getting more requests than medium (or any requests at all) when I am not hitting the TPM/RPM limits of the medium-tagged model. Sample curl:
curl --location 'http://localhost:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
"model": "mock-llm",
"messages": [
{
"role": "system",
"content": "You are a helpful, pattern-following assistant."
},
{
"role": "user",
"content": "Help me translate the following corporate jargon into plain English."
},
{
"role": "assistant",
"content": "Sure, I'\''d be happy to!"
},
{
"role": "user",
"content": "New synergies will help drive top-line growth."
},
{
"role": "assistant",
"content": "Things working well together will increase revenue."
},
{
"role": "user",
"content": "Let'\''s circle back when we have more bandwidth to touch base on opportunities for increased leverage"
},
{
"role": "assistant",
"content": "Let'\''s talk later when we'\''re less busy about how to do better."
},
{
"role": "user",
"content": "This late pivot means we don'\''t have time to boil the ocean for the client deliverable."
}
],
"tags": ["medium"]
}'
Are you expecting default to get no requests when tags = medium, @kunalchamoli?
The original behavior was that a deployment tagged as default can be used for all tags, and also when no tags are sent.
I expect the default tag to get some requests, but probably fewer than the medium-tagged model. From my dashboard (model), however, the default tag is consistently getting more requests than the medium-tagged model. I can't post dashboard photos or screenshots because that is restricted.
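To make the behavior described above concrete, here is a minimal sketch of that filtering rule in plain Python. It is not LiteLLM's actual code: the function name `candidate_deployments`, the deployment names, and the dict layout are all hypothetical, and how the router then picks among the candidates (and whether TPM/RPM limits change the pool) is not modeled.

```python
def candidate_deployments(deployments, request_tags):
    """Keep deployments that share a tag with the request or are tagged "default".

    Simplified illustration of the rule stated in this thread,
    not the router's real implementation.
    """
    wanted = set(request_tags or [])
    selected = []
    for dep in deployments:
        tags = set(dep.get("tags", []))
        # "default" deployments are eligible for every request,
        # tagged or untagged.
        if "default" in tags or (tags & wanted):
            selected.append(dep)
    return selected


# Hypothetical deployments, for illustration only.
deployments = [
    {"name": "medium-a", "tags": ["medium"]},
    {"name": "default-a", "tags": ["default"]},
    {"name": "default-b", "tags": ["default"]},
    {"name": "large-a", "tags": ["large"]},
]

pool = candidate_deployments(deployments, ["medium"])
pool_names = [d["name"] for d in pool]

# Fraction of the candidate pool that is tagged "default".
default_share = sum("default" in d["tags"] for d in pool) / len(pool)

# With no tags on the request, only "default" deployments remain.
untagged_names = [d["name"] for d in candidate_deployments(deployments, None)]

print(pool_names, default_share, untagged_names)
```

Under this rule a request tagged medium can land on any default deployment as well. If selection within the pool is roughly even (an assumption), a config with more default deployments than medium ones would send most medium-tagged traffic to default.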
What happened?
I was trying to do tag-based routing in LiteLLM, but it routes requests to the "default" tag more often than to the tag that I am passing in the request:
Example Config
Can you tell me what I am doing wrong? As far as I can see, I am not hitting the TPM/RPM limits of any "medium"-tagged model.
Relevant log output
No response