Open nacartwright opened 2 months ago
@kevinintel The metrics explorer in Prometheus does not show any vLLM-related metrics.
You should connect to the vLLM serving endpoint, not the LLM microservice endpoint. Please modify the endpoint in prometheus.yml, for example:

```yaml
static_configs:
```
@nacartwright I was able to test successfully on a local machine with the following Prometheus config:
```yaml
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "vllm"
    static_configs:
      - targets: ["external_ip:port"]
```
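A quick way to confirm the target is correct, independent of Prometheus, is to fetch the endpoint's exposition text and look for `vllm:`-prefixed metric names — the vLLM serving endpoint exposes them on `/metrics`, while an LLM microservice wrapper endpoint does not. A minimal sketch; the embedded sample payload and the commented-out address are illustrative only:

```python
def vllm_metric_names(exposition_text: str) -> set[str]:
    """Return the vLLM metric names found in a Prometheus exposition payload."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        # Metric name ends at the first "{" (labels) or " " (value).
        name = line.split("{")[0].split(" ")[0]
        if name.startswith("vllm:"):
            names.add(name)
    return names

# To check a live endpoint (hypothetical address), something like:
#   from urllib.request import urlopen
#   text = urlopen("http://external_ip:port/metrics").read().decode()

# Illustrative sample of what a vLLM /metrics scrape can contain.
sample = """\
# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="m"} 1.0
vllm:gpu_cache_usage_perc{model_name="m"} 0.25
"""
print(sorted(vllm_metric_names(sample)))
```

If this set comes back empty against your target, Prometheus is scraping the wrong endpoint and no prometheus.yml change will surface the metrics.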
The vLLM metrics are still not showing correctly. They should include time to first token, number of running requests, CPU/GPU cache usage, etc., as defined here:
https://github.com/vllm-project/vllm/blob/main/vllm/engine/metrics.py
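To narrow down which of those are missing from a scrape, one option is to diff the scraped names against the expected ones. A sketch, assuming the metric names below (a small subset of what metrics.py registers; check the linked file for your vLLM version):

```python
# Assumed subset of metric names from vllm/engine/metrics.py; not exhaustive,
# and exact names can vary between vLLM versions.
EXPECTED = {
    "vllm:num_requests_running",
    "vllm:num_requests_waiting",
    "vllm:gpu_cache_usage_perc",
    "vllm:time_to_first_token_seconds",
}

def missing_metrics(scraped_names: set[str]) -> set[str]:
    """Expected vLLM metrics absent from a scrape."""
    return EXPECTED - scraped_names

# Example: a scrape that only returned one vLLM gauge.
print(sorted(missing_metrics({"vllm:num_requests_running"})))
```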