fabiolb / fabio

Consul Load-Balancing made simple
https://fabiolb.net

Panic on created prometheus metric name #878

Closed (tino closed this issue 1 year ago)

tino commented 2 years ago

I'm running into panics on roughly 2 out of 10 startups. Fabio fails with:

2022/06/23 18:46:09 [INFO] Registering metrics provider "prometheus"
panic: descriptor Desc{fqName: "1ea85cb2f2e1_fabio_route", help: "", constLabels: {}, variableLabels: [service host path target]} is invalid: "1ea85cb2f2e1_fabio_route" is not a valid metric name

1ea85cb2f2e1 is the hostname of the container fabio runs in*.

Relevant config:

FABIO_proxy_addr                    = ":80,:81;proto=prometheus"
FABIO_metrics_target                = "prometheus"
FABIO_metrics_prometheus_buckets    = ".005,.01,.025,.05,.1,.25,.5,1,2.5,5,10,25,60"

Logs:

2022/06/23 18:45:52 [INFO] Setting log level to INFO
2022/06/23 18:45:52 [INFO] Runtime config
{
    "Proxy": {
        "Strategy": "rnd",
        "Matcher": "prefix",
        "NoRouteStatus": 503,
        "MaxConn": 10000,
        "ShutdownWait": 0,
        "DialTimeout": 30000000000,
        "ResponseHeaderTimeout": 0,
        "KeepAliveTimeout": 0,
        "IdleConnTimeout": 15000000000,
        "FlushInterval": 1000000000,
        "GlobalFlushInterval": 0,
        "LocalIP": "172.17.0.4",
        "ClientIPHeader": "",
        "TLSHeader": "",
        "TLSHeaderValue": "",
        "GZIPContentTypes": {},
        "RequestID": "",
        "STSHeader": {
            "MaxAge": 0,
            "Subdomains": false,
            "Preload": false
        },
        "AuthSchemes": {}
    },
2022/06/23 18:46:09 [INFO] Setting log level to INFO
2022/06/23 18:46:09 [INFO] Runtime config
{
    "Proxy": {
        "Strategy": "rnd",
        "Matcher": "prefix",
        "NoRouteStatus": 503,
        "MaxConn": 10000,
        "ShutdownWait": 0,
        "DialTimeout": 30000000000,
        "ResponseHeaderTimeout": 0,
        "KeepAliveTimeout": 0,
        "IdleConnTimeout": 15000000000,
        "FlushInterval": 1000000000,
        "GlobalFlushInterval": 0,
        "LocalIP": "172.17.0.4",
        "ClientIPHeader": "",
        "TLSHeader": "",
        "TLSHeaderValue": "",
        "GZIPContentTypes": {},
        "RequestID": "",
        "STSHeader": {
            "MaxAge": 0,
            "Subdomains": false,
            "Preload": false
        },
        "AuthSchemes": {}
    },
    "Registry": {
        "Backend": "consul",
        "Static": {
            "NoRouteHTML": "",
            "Routes": ""
        },
        "File": {
            "NoRouteHTMLPath": "",
            "RoutesPath": ""
        },
        "Consul": {
            "Addr": "10.1.1.70:8500",
            "Scheme": "http",
            "Token": "",
            "KVPath": "/fabio-public/config",
            "NoRouteHTMLPath": "/fabio/noroute.html",
            "TagPrefix": "public-urlprefix-",
            "Register": true,
            "ServiceAddr": "10.1.1.70:9998",
            "ServiceName": "fabio",
            "ServiceTags": null,
            "ServiceStatus": [
                "passing"
            ],
            "CheckInterval": 1000000000,
            "CheckTimeout": 3000000000,
            "CheckScheme": "http",
            "CheckTLSSkipVerify": false,
            "ChecksRequired": "one",
            "ServiceMonitors": 1,
            "TLS": {
                "KeyFile": "",
                "CertFile": "",
                "CAFile": "",
                "CAPath": "",
                "InsecureSkipVerify": false
            },
            "PollInterval": 0
        },
        "Custom": {
            "Host": "",
            "Path": "",
            "QueryParams": "",
            "Scheme": "https",
            "CheckTLSSkipVerify": false,
            "PollInterval": 5,
            "NoRouteHTML": "",
            "Timeout": 10
        },
        "Timeout": 10000000000,
        "Retry": 500000000
    },
    "Listen": [
        {
            "Addr": ":80",
            "Proto": "http",
            "ReadTimeout": 0,
            "WriteTimeout": 0,
            "IdleTimeout": 0,
            "CertSource": {
                "Name": "",
                "Type": "",
                "CertPath": "",
                "KeyPath": "",
                "ClientCAPath": "",
                "CAUpgradeCN": "",
                "Refresh": 0,
                "Header": null,
                "VaultFetchToken": ""
            },
            "StrictMatch": false,
            "TLSMinVersion": 0,
            "TLSMaxVersion": 0,
            "TLSCiphers": null,
            "ProxyProto": false,
            "ProxyHeaderTimeout": 0,
            "Refresh": 0
        },
        {
            "Addr": ":81",
            "Proto": "prometheus",
            "ReadTimeout": 0,
            "WriteTimeout": 0,
            "IdleTimeout": 0,
            "CertSource": {
                "Name": "",
                "Type": "",
                "CertPath": "",
                "KeyPath": "",
                "ClientCAPath": "",
                "CAUpgradeCN": "",
                "Refresh": 0,
                "Header": null,
                "VaultFetchToken": ""
            },
            "StrictMatch": false,
            "TLSMinVersion": 0,
            "TLSMaxVersion": 0,
            "TLSCiphers": null,
            "ProxyProto": false,
            "ProxyHeaderTimeout": 0,
            "Refresh": 0
        }
    ],
    "Log": {
        "AccessFormat": "{\"host\":\"$request_host\",\"remote\":\"$remote_host\",\"proto\":\"$request_proto\",\"timestamp\":\"$time_common\",\"referer\":\"$header.Referer\",\"service\":\"$upstream_service\",\"url\":\"$request_uri\",\"method\":\"$request_method\",\"body_size\":$response_body_size,\"status\":$response_status,\"upstream_addr\":\"$upstream_addr\",\"agent\":\"$header.User-Agent\",\"duration\":$response_time_ms}",
        "AccessTarget": "stdout",
        "RoutesFormat": "delta",
        "Level": "INFO"
    },
    "Metrics": {
        "Target": "prometheus",
        "Prefix": "{{clean .Hostname}}.{{clean .Exec}}",
        "Names": "{{clean .Service}}.{{clean .Host}}.{{clean .Path}}.{{clean .TargetURL.Host}}",
        "Interval": 30000000000,
        "Timeout": 10000000000,
        "Retry": 500000000,
        "GraphiteAddr": "",
        "StatsDAddr": "",
        "DogstatsdAddr": "",
        "Circonus": {
            "APIKey": "",
            "APIApp": "fabio",
            "APIURL": "",
            "CheckID": "",
            "BrokerID": "",
            "SubmissionURL": ""
        },
        "Prometheus": {
            "Subsystem": "",
            "Path": "/metrics",
            "Buckets": [
                0.005,
                0.01,
                0.025,
                0.05,
                0.1,
                0.25,
                0.5,
                1,
                2.5,
                5,
                10,
                25,
                60
            ]
        }
    },
    "UI": {
        "Listen": {
            "Addr": ":9998",
            "Proto": "http",
            "ReadTimeout": 0,
            "WriteTimeout": 0,
            "IdleTimeout": 0,
            "CertSource": {
                "Name": "",
                "Type": "",
                "CertPath": "",
                "KeyPath": "",
                "ClientCAPath": "",
                "CAUpgradeCN": "",
                "Refresh": 0,
                "Header": null,
                "VaultFetchToken": ""
            },
            "StrictMatch": false,
            "TLSMinVersion": 0,
            "TLSMaxVersion": 0,
            "TLSCiphers": null,
            "ProxyProto": false,
            "ProxyHeaderTimeout": 0,
            "Refresh": 0
        },
        "Color": "light-green",
        "Title": "",
        "Access": "rw"
    },
    "Runtime": {
        "GOGC": 100,
        "GOMAXPROCS": 2
    },
    "Tracing": {
        "TracingEnabled": false,
        "CollectorType": "http",
        "ConnectString": "http://localhost:9411/api/v1/spans",
        "ServiceName": "Fabiolb",
        "Topic": "Fabiolb-Kafka-Topic",
        "SamplerRate": -1,
        "SpanHost": "localhost:9998",
        "SpanName": "",
        "TraceID128Bit": true
    },
    "ProfileMode": "",
    "ProfilePath": "/tmp",
    "Insecure": false,
    "GlobMatchingDisabled": false,
    "GlobCacheSize": 1000
}
2022/06/23 18:46:09 [INFO] Version 1.6.0 starting
2022/06/23 18:46:09 [INFO] Go runtime is go1.18
2022/06/23 18:46:09 [INFO] Running fabio as UID=0 EUID=0 GID=0
2022/06/23 18:46:09 [WARN] 

    ************************************************************
    You are running fabio as root without the '-insecure' flag
    This will stop working with fabio 1.7!
    ************************************************************

2022/06/23 18:46:09 [INFO] Running fabio as UID=0 EUID=0 GID=0
2022/06/23 18:46:09 [WARN] 

    ************************************************************
    You are running fabio as root without the '-insecure' flag
    This will stop working with fabio 1.7!
    ************************************************************

2022/06/23 18:46:09 [INFO] Registering metrics provider "prometheus"
panic: descriptor Desc{fqName: "1ea85cb2f2e1_fabio_route", help: "", constLabels: {}, variableLabels: [service host path target]} is invalid: "1ea85cb2f2e1_fabio_route" is not a valid metric name

goroutine 1 [running]:
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0xc00022b920?, {0xc0002805b0?, 0x1, 0x0?})
    /Users/njohnson/jetbrains/fabio/pkg/mod/github.com/prometheus/client_golang@v1.4.0/prometheus/registry.go:400 +0x7f
github.com/prometheus/client_golang/prometheus.MustRegister(...)
    /Users/njohnson/jetbrains/fabio/pkg/mod/github.com/prometheus/client_golang@v1.4.0/prometheus/registry.go:177
github.com/go-kit/kit/metrics/prometheus.NewHistogramFrom({{0xc00022b920, 0x12}, {0x0, 0x0}, {0xcb84c0, 0x5}, {0x0, 0x0}, 0x0, {0xc0000b4580, ...}}, ...)
    /Users/njohnson/jetbrains/fabio/pkg/mod/github.com/go-kit/kit@v0.9.0/metrics/prometheus/prometheus.go:135 +0xb7
github.com/fabiolb/fabio/metrics.(*PromProvider).NewHistogram(0xc0000798c0, {0xcb84c0?, 0xc0000f1de0?}, {0xc00024a740, 0x4, 0x4})
    /Users/njohnson/jetbrains/fabio/src/github.com/fabiolb/fabio/metrics/provider_prometheus.go:49 +0x139
github.com/fabiolb/fabio/metrics.(*MultiProvider).NewHistogram(0xc0000021a0?, {0xcb84c0, 0x5}, {0xc00024a740, 0x4, 0x4})
    /Users/njohnson/jetbrains/fabio/src/github.com/fabiolb/fabio/metrics/provider_multi.go:37 +0x15c
github.com/fabiolb/fabio/route.SetMetricsProvider({0xe2fc78, 0xc00000dc08})
    /Users/njohnson/jetbrains/fabio/src/github.com/fabiolb/fabio/route/table.go:46 +0xa7
main.main()
    /Users/njohnson/jetbrains/fabio/src/github.com/fabiolb/fabio/main.go:128 +0x6c5

* NB: Why is the hostname prepended? It would be much more logical and useful (IMHO) if it were set as a label. As it stands, I have to apply metric relabelling in Prometheus to get aggregated stats (or am I doing something wrong?)

nathanejohnson commented 2 years ago

The hostname is part of the default metrics.prefix, as documented here:

https://fabiolb.net/ref/metrics.prefix/

This can of course be overridden. The panic occurs because your hostname starts with a digit, which makes the generated name an invalid Prometheus metric name, as documented here:

https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
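
To illustrate the rule, here is a minimal Go sketch that checks names against the pattern the Prometheus data model has traditionally documented for metric names, `[a-zA-Z_:][a-zA-Z0-9_:]*`. It is not fabio's or client_golang's actual validation code. The helper `isValidMetricName` and the second name, `fabio_route`, are hypothetical and only stand in for what a static prefix might produce.

```go
package main

import (
	"fmt"
	"regexp"
)

// metricNameRE is the classic metric name pattern from the Prometheus
// data model docs: the first character must be a letter, '_' or ':'.
var metricNameRE = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// isValidMetricName (hypothetical helper) reports whether name
// matches that pattern.
func isValidMetricName(name string) bool {
	return metricNameRE.MatchString(name)
}

func main() {
	names := []string{
		"1ea85cb2f2e1_fabio_route", // name from the panic: starts with a digit
		"fabio_route",              // assumed result of a static prefix
	}
	for _, name := range names {
		fmt.Printf("%s valid=%v\n", name, isValidMetricName(name))
	}
}
```

A hostname-derived prefix only fails this check when the hostname happens to begin with a digit, which would be consistent with the panic showing up on some startups and not others.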

Can you verify that defining a metrics.prefix fixes your issue?

Thanks!

tino commented 1 year ago

I can verify that setting the prefix works!

Thanks