mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License

Metrics API Endpoint Error: * collected metric "api_call" { [...] } was collected before with the same name and label values #1445

Open countzero opened 9 months ago

countzero commented 9 months ago

LocalAI version: 7641f92

Environment, CPU architecture, OS, and Version:

Linux ... 5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Describe the bug

The /metrics API endpoint works for some time, but after a number of completion requests it starts failing with an error message.

To Reproduce

  1. Start LocalAI
  2. Make some requests to the /v1/chat/completions API endpoint
  3. Check the /metrics API endpoint

Expected behavior

The /metrics API endpoint should keep serving metrics reliably, even after many completion requests.

Logs

An error has occurred while serving metrics:

2 error(s) occurred:

  • collected metric "api_call" { label:{name:"method" value:"GETT"} label:{name:"otel_scope_name" value:"github.com/go-skynet/LocalAI"} label:{name:"otel_scope_version" value:""} label:{name:"path" value:"/v1/chat/completions"} histogram:{sample_count:2 sample_sum:7358.754251354 bucket:{cumulative_count:0 upper_bound:0} bucket:{cumulative_count:0 upper_bound:5} bucket:{cumulative_count:0 upper_bound:10} bucket:{cumulative_count:0 upper_bound:25} bucket:{cumulative_count:0 upper_bound:50} bucket:{cumulative_count:0 upper_bound:75} bucket:{cumulative_count:0 upper_bound:100} bucket:{cumulative_count:0 upper_bound:250} bucket:{cumulative_count:0 upper_bound:500} bucket:{cumulative_count:0 upper_bound:750} bucket:{cumulative_count:0 upper_bound:1000} bucket:{cumulative_count:0 upper_bound:2500} bucket:{cumulative_count:2 upper_bound:5000} bucket:{cumulative_count:2 upper_bound:7500} bucket:{cumulative_count:2 upper_bound:10000}}} was collected before with the same name and label values
  • collected metric "api_call" { label:{name:"method" value:"GETT"} label:{name:"otel_scope_name" value:"github.com/go-skynet/LocalAI"} label:{name:"otel_scope_version" value:""} label:{name:"path" value:"/v1/chat/completions"} histogram:{sample_count:1 sample_sum:18.822380547 bucket:{cumulative_count:0 upper_bound:0} bucket:{cumulative_count:0 upper_bound:5} bucket:{cumulative_count:0 upper_bound:10} bucket:{cumulative_count:1 upper_bound:25} bucket:{cumulative_count:1 upper_bound:50} bucket:{cumulative_count:1 upper_bound:75} bucket:{cumulative_count:1 upper_bound:100} bucket:{cumulative_count:1 upper_bound:250} bucket:{cumulative_count:1 upper_bound:500} bucket:{cumulative_count:1 upper_bound:750} bucket:{cumulative_count:1 upper_bound:1000} bucket:{cumulative_count:1 upper_bound:2500} bucket:{cumulative_count:1 upper_bound:5
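For background: the Prometheus exposition format requires every series in a scrape to have a unique combination of metric name and label values, and the Go client aborts the scrape with exactly this "was collected before with the same name and label values" message when it finds a duplicate. A minimal Python sketch of that uniqueness check (illustrative only; this is not LocalAI or client_golang code):

```python
# Illustrative sketch only -- not LocalAI or client_golang code.
# Prometheus rejects a scrape when two series share the same metric
# name and label values; this mimics that uniqueness check.

def gather(series):
    """Return duplicate-series errors, as a Prometheus registry would."""
    seen = set()
    errors = []
    for name, labels in series:
        key = (name, tuple(sorted(labels.items())))
        if key in seen:
            errors.append(
                f'collected metric "{name}" was collected before '
                "with the same name and label values"
            )
        seen.add(key)
    return errors

# Two "api_call" histograms with identical labels -> one error, as in the log.
series = [
    ("api_call", {"method": "GETT", "path": "/v1/chat/completions"}),
    ("api_call", {"method": "GETT", "path": "/v1/chat/completions"}),
]
print(gather(series))
```

Note that the scrape does not merge the duplicate histograms; it refuses to serve anything, which is why the whole /metrics endpoint returns an error once a single duplicate exists.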

Additional context

We are using llama.cpp as a backend and enabled parallel requests:

PARALLEL_REQUESTS=true
LLAMACPP_PARALLEL=10
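A plausible (unconfirmed, speculative) mechanism for the duplicates is a check-then-create race in the metrics layer: with parallel requests enabled, two handlers can both fail to find an existing "api_call" series for the same method/path and each register their own. A deterministic Python sketch of that interleaving (the registry and helper below are hypothetical, not LocalAI's actual code):

```python
# Hypothetical check-then-act race, simulated deterministically.
# All names here are illustrative; this is not LocalAI's metrics code.

registry = []  # a registry that does not deduplicate on insert
labels = (("method", "POST"), ("path", "/v1/chat/completions"))

def series_exists(name, labels):
    return (name, labels) in registry

# Two parallel requests both run the existence check first...
request_a_missing = not series_exists("api_call", labels)
request_b_missing = not series_exists("api_call", labels)

# ...then each one creates "its" series, producing a duplicate.
if request_a_missing:
    registry.append(("api_call", labels))
if request_b_missing:
    registry.append(("api_call", labels))

print(registry.count(("api_call", labels)))  # 2 -> the next /metrics scrape fails
```

This would also match the observation that the endpoint works at first and only breaks after concurrent load.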

A PowerShell benchmark script to put some load on the system:

Measure-Command { 
    1..10 | % { `
        Start-Job -ScriptBlock { `
            curl.exe [...]/v1/chat/completions `
                --header "Content-Type: application/json" `
                --header "@${HOME}\authorization_header.txt" `
                --data '{
                    \"model\": \"dolphin-2_2-yi-34b.Q4_K_M\",
                    \"messages\": [
                        {
                            \"role\": \"system\",
                            \"content\": \"You are a helpful assistant.\"
                        },
                        {
                            \"role\": \"user\",
                            \"content\": \"How are you?\"
                        }
                    ],
                    \"temperature\": 0.7
                }'
        }
    }
    Get-Job | Wait-Job | Receive-Job | Out-Host 
}
countzero commented 4 months ago

The error is still reproducible with https://github.com/mudler/LocalAI/releases/tag/v2.14.0

curl -v https://[...]/metrics

HTTP/1.1 500 Internal Server Error
Alt-Svc: h3=":443"; ma=2592000
Content-Length: 5961
Content-Type: text/plain; charset=utf-8
Date: Tue, 07 May 2024 11:46:26 GMT
Server: Caddy
X-Content-Type-Options: nosniff

An error has occurred while serving metrics:

6 error(s) occurred:

Nold360 commented 3 months ago

I was running into this while adding a ServiceMonitor to the Helm chart...

At first it seems OK, but after some load I get something like:

An error has occurred while serving metrics:

collected metric "api_call" { label:{name:"method"  value:"POST"}  label:{name:"otel_scope_name"  value:"github.com/go-skynet/LocalAI"}  label:{name:"otel_scope_version"  value:""}  label:{name:"path"  value:"/chat/completions"}  histogram:{sample_count:3  sample_sum:4.015941837  bucket:{cumulative_count:0  upper_bound:0}  bucket:{cumulative_count:3  upper_bound:5}  bucket:{cumulative_count:3  upper_bound:10}  bucket:{cumulative_count:3  upper_bound:25}  bucket:{cumulative_count:3  upper_bound:50}  bucket:{cumulative_count:3  upper_bound:75}  bucket:{cumulative_count:3  upper_bound:100}  bucket:{cumulative_count:3  upper_bound:250}  bucket:{cumulative_count:3  upper_bound:500}  bucket:{cumulative_count:3  upper_bound:750}  bucket:{cumulative_count:3  upper_bound:1000}  bucket:{cumulative_count:3  upper_bound:2500}  bucket:{cumulative_count:3  upper_bound:5000}  bucket:{cumulative_count:3  upper_bound:7500}  bucket:{cumulative_count:3  upper_bound:10000}}} was collected before with the same name and label values