m3db / m3

M3 monorepo - Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform
https://m3db.io/
Apache License 2.0

Broken regex OR matching for series without a label #4212

Closed stek29 closed 11 months ago

stek29 commented 1 year ago

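In M3, the PromQL matcher label=~"value|" doesn't match series where the label is missing; it only matches series with label="value". Prometheus and VictoriaMetrics handle this query differently: there, the empty alternative also matches series that lack the label.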

For example, given the following samples:

    metric{label1="foo", label2="bar"} 1
    metric{label1="foo"} 2

Prometheus and M3 return different results for the same query:

| query | prometheus | m3db |
| --- | --- | --- |
| `metric{}` | 1 and 2 | 1 and 2 |
| `metric{label2=""}` | 2 | 2 |
| `metric{label2=~""}` | 2 | 2 |
| `metric{label2=~"\|"}` | 2 | none |
| `metric{label2=~"bar\|"}` | 1 and 2 | only 1 |
| `metric{label1="foo", label2=~"bar\|"}` | 1 and 2 | only 1 |
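
The Prometheus column follows from two rules: regex matchers are anchored on both ends, and a missing label behaves like a label with an empty value. Here is a minimal bash sketch of that behaviour, assuming patterns that consist only of literal alternatives separated by `|` (which covers the queries above). The helper `label_matches` is hypothetical and not part of M3 or Prometheus; it splits the pattern by hand instead of calling a real regex engine.

```sh
#!/usr/bin/env bash
# Hypothetical helper (not part of M3 or Prometheus). It emulates a
# Prometheus-style `=~` matcher for patterns made only of literal
# alternatives separated by "|".
# $1 = matcher pattern, $2 = label value ("" when the label is missing)
label_matches() {
  local pattern="$1" value="$2" alt
  local -a alts
  # Append one "|" so that a trailing empty alternative survives the split.
  IFS='|' read -r -a alts <<< "${pattern}|"
  for alt in "${alts[@]}"; do
    # Anchored match: the whole value must equal one alternative.
    [[ "$value" == "$alt" ]] && return 0
  done
  return 1
}

# label2 values of the two sample series: "bar" for sample 1, "" for sample 2.
for pattern in '' '|' 'bar|'; do
  out=""
  label_matches "$pattern" 'bar' && out+="1 "
  label_matches "$pattern" ''    && out+="2 "
  echo "label2=~\"$pattern\" matches samples: ${out:-none}"
done
```

Under these assumptions, the empty alternative in `bar|` matches the empty value of the series without `label2`, so both samples are selected, which is what the Prometheus column shows.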

I'm able to reproduce this issue on v1.5.0 set up from the quick start guide; see the full steps and test case in the spoiler below.

commands to reproduce

```sh
docker run -d -p 7201:7201 -p 7203:7203 --name m3db -v $(pwd)/m3db_data:/var/lib/m3db quay.io/m3db/m3dbnode:v1.5.0

# wait for startup

curl -X POST http://localhost:7201/api/v1/database/create -d '{
  "type": "local",
  "namespaceName": "default",
  "retentionTime": "12h"
}' | jq .

# wait for initialization

curl -X POST http://localhost:7201/api/v1/services/m3db/namespace/ready -d '{
  "name": "default"
}' | jq .

# wait for it to be ready
# now write the test series

TS=$(date "+%s")

curl -X POST http://localhost:7201/api/v1/json/write -d '{
  "tags": {
    "__name__": "metric",
    "label1": "foo",
    "label2": "bar"
  },
  "timestamp": '\"$TS\"',
  "value": 1
}'

curl -X POST http://localhost:7201/api/v1/json/write -d '{
  "tags": {
    "__name__": "metric",
    "label1": "foo"
  },
  "timestamp": '\"$TS\"',
  "value": 2
}'

# helper function
do_query() {
  curl -qs -X "POST" -G "http://localhost:7201/api/v1/query_range" \
    -d "query=$1" \
    -d "start=$(date "+%s" -d "900 seconds ago")" \
    -d "end=$( date +%s )" \
    -d "step=5s" | jq -r '.data.result| map("{\(.metric|to_entries|map("\(.key)=\"\(.value)\"")|join(", "))} \(.values|last|last)")[]';
}

# now do the tests
test_queries=(
  'metric{}'
  'metric{label2=""}'
  'metric{label2=~""}'
  'metric{label2=~"|"}'
  'metric{label2=~"bar|"}'
  'metric{label1="foo",label2=~"bar|"}'
)

for q in "${test_queries[@]}"; do
  echo ">>> $q"
  do_query "$q"
done
```

test results:

```prometheus
>>> metric{}
{__name__="metric", label1="foo"} 2
{__name__="metric", label1="foo", label2="bar"} 1
>>> metric{label2=""}
{__name__="metric", label1="foo"} 2
>>> metric{label2=~""}
{__name__="metric", label1="foo"} 2
>>> metric{label2=~"|"}
>>> metric{label2=~"bar|"}
{__name__="metric", label1="foo", label2="bar"} 1
>>> metric{label1="foo",label2=~"bar|"}
{__name__="metric", label1="foo", label2="bar"} 1
```

stek29 commented 1 year ago

A real-life effect of this issue: when using M3DB with the Kubernetes mixin rules, the apiserver_request:burnrate recording rule is calculated incorrectly. M3 counts all healthcheck requests as errors, which causes the KubeAPIErrorBudgetBurn alert to flap depending on the overall request rate to the apiserver.

matthiasr commented 1 year ago

For real-world use cases, kube_pod_…{pod=~"|$pod",pod_name=~"|$pod"} was a common solution when Kubernetes transitioned between these label names.

robskillington commented 11 months ago

TY for this report, glad to see this gap closed and fixed!