Open countzero opened 9 months ago
Error is still reproducable with https://github.com/mudler/LocalAI/releases/tag/v2.14.0
curl -v https://[...]/metrics
HTTP/1.1 500 Internal Server Error Alt-Svc: h3=":443"; ma=2592000 Content-Length: 5961 Content-Type: text/plain; charset=utf-8 Date: Tue, 07 May 2024 11:46:26 GMT Server: Caddy X-Content-Type-Options: nosniff
An error has occurred while serving metrics:
6 error(s) occurred:
I was running into this while adding a service monitor to the helm chart...
first it seems ok but after some load i get something like:
An error has occurred while serving metrics:
collected metric "api_call" { label:{name:"method" value:"POST"} label:{name:"otel_scope_name" value:"github.com/go-skynet/LocalAI"} label:{name:"otel_scope_version" value:""} label:{name:"path" value:"/chat/completions"} histogram:{sample_count:3 sample_sum:4.015941837 bucket:{cumulative_count:0 upper_bound:0} bucket:{cumulative_count:3 upper_bound:5} bucket:{cumulative_count:3 upper_bound:10} bucket:{cumulative_count:3 upper_bound:25} bucket:{cumulative_count:3 upper_bound:50} bucket:{cumulative_count:3 upper_bound:75} bucket:{cumulative_count:3 upper_bound:100} bucket:{cumulative_count:3 upper_bound:250} bucket:{cumulative_count:3 upper_bound:500} bucket:{cumulative_count:3 upper_bound:750} bucket:{cumulative_count:3 upper_bound:1000} bucket:{cumulative_count:3 upper_bound:2500} bucket:{cumulative_count:3 upper_bound:5000} bucket:{cumulative_count:3 upper_bound:7500} bucket:{cumulative_count:3 upper_bound:10000}}} was collected before with the same name and label values
LocalAI version: 7641f92
Environment, CPU architecture, OS, and Version:
Describe the bug The API endpoint
/metrics
works for some time and then after some completion requests it fails with an error message.To Reproduce
/v1/chat/completions
API Endpoint/metrics
API EndpointExpected behavior The
/metrics
API Endpoint should be robust.Logs
Additional context
We are using llama.cpp as a backend and enabled parallel requests:
A benchmark script to produce some load on the system: