BerriAI / litellm


[Bug]: Prometheus request counts for Azure OpenAI don't match actual requests #4617

Open · Huyueeer opened this issue 1 month ago

Huyueeer commented 1 month ago

What happened?

My configuration is this:
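
For illustration, a litellm proxy config of roughly this shape matches the setup described below — the endpoint, key reference, and deployment names here are hypothetical, not taken from the report; the relevant detail is that the client-facing `model_name` differs from the Azure deployment name:

```yaml
# Hypothetical sketch: one logical model, but the Azure deployment name
# differs from the client-facing model name.
model_list:
  - model_name: gpt-4                      # name clients request
    litellm_params:
      model: azure/my-gpt4-deployment      # Azure deployment name (differs)
      api_base: https://example.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY

litellm_settings:
  success_callback: ["prometheus"]         # emit litellm_requests_metric
```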

This configuration actually defines one model, not two, yet the reported request counts do not match.

[Screenshots: Grafana panels showing request counts split across two model labels]

In fact, I made only 105 requests, but in the monitoring they show up under two separate models (really the same model), each with a large request count of its own and no apparent relationship between the two series.


krrishdholakia commented 1 month ago

@Huyueeer how do I repro this?

Huyueeer commented 1 month ago

> @Huyueeer how do I repro this?

@krrishdholakia Set up an OpenAI model on Azure where the deployment name and the model name differ. Deploy litellm with the litellm-litellm:v1.41.11.dev1 Docker image, enable Prometheus monitoring, import the Grafana dashboard from https://github.com/BerriAI/litellm/tree/main/cookbook/litellm_proxy_server/grafana_dashboard, then send concurrent requests to the Azure OpenAI model you just deployed. You will see the split request counts in Grafana.
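
A rough sketch of those steps — the image path, port, auth key, and request payload here are assumptions for illustration, not taken from the report:

```bash
# Hypothetical repro sketch: run the proxy with the config above,
# Prometheus enabled, then fire concurrent requests at the model.
docker run -d -p 4000:4000 \
  -v "$(pwd)/config.yaml:/app/config.yaml" \
  -e AZURE_API_KEY="$AZURE_API_KEY" \
  ghcr.io/berriai/litellm:v1.41.11.dev1 \
  --config /app/config.yaml

# 105 concurrent requests, matching the count mentioned above.
for i in $(seq 1 105); do
  curl -s http://localhost:4000/chat/completions \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer sk-1234' \
    -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "hi"}]}' &
done
wait
```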

Huyueeer commented 1 month ago

I wonder if it has something to do with the `sum by (model) (increase(litellm_requests_metric_total[5m]))` query in Grafana. I am not very familiar with PromQL.
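
For reference, that query groups request totals by the `model` label, so if the exporter records the deployment name for some requests and the model name for others, the same traffic shows up as two series. A quick way to check — the metric and label names come from the query above; the interpretation is an assumption:

```promql
# Total requests over the trailing 5 minutes, one series per distinct
# value of the `model` label; two label values => two dashboard lines.
sum by (model) (increase(litellm_requests_metric_total[5m]))

# Drop the grouping to inspect the raw series with all their labels
# and see which `model` values the same traffic is recorded under.
increase(litellm_requests_metric_total[5m])
```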