cortexproject / cortex

A horizontally scalable, highly available, multi-tenant, long term Prometheus.
https://cortexmetrics.io/
Apache License 2.0

TestPrometheusCompatibilityQueryFuzz test failure with bottomk #6323

Open yeya24 opened 3 weeks ago

yeya24 commented 3 weeks ago

Describe the bug

TestPrometheusCompatibilityQueryFuzz test failed.

See https://github.com/cortexproject/cortex/actions/runs/11732101325/job/32683790091?pr=6311#step:10:409

range query: bottomk by (job) (pi(), days_in_month(-{__name__="test_series_a"} offset 1m17s))

Cortex Response:

res1: {job="test", series="0", status_code="200"} =>
        31 @[1731012111.347]
        31 @[1731012171.347]
        31 @[1731012231.347]
        31 @[1731012291.347]
        31 @[1731012351.347]
        31 @[1731012411.347]
        31 @[1731012471.347]
        31 @[1731012531.347]
        31 @[1731012591.347]
        31 @[1731012651.347]
        31 @[1731012711.347]
        31 @[1731012771.347]
        31 @[1731012831.347]
        31 @[1731012891.347]
        31 @[1731012951.347]
        31 @[1731013011.347]
        31 @[1731013071.347]
        31 @[1731013131.347]
        31 @[1731013191.347]
        31 @[1731013251.347]
        31 @[1731013311.347]
        31 @[1731013371.347]
        31 @[1731013431.347]
        31 @[1731013491.347]
        31 @[1731013551.347]
        31 @[1731013611.347]
        31 @[1731013671.347]
        31 @[1731013731.347]
        31 @[1731013791.347]
        31 @[1731013851.347]
        31 @[1731013911.347]
        31 @[1731013971.347]
        {job="test", series="0", status_code="400"} =>
...

Prometheus Response:

res2: {job="test", series="0", status_code="500"} =>
        31 @[1731012771.347]
        31 @[1731012831.347]
        31 @[1731012891.347]
        31 @[1731012951.347]
        31 @[1731013011.347]
        31 @[1731013071.347]
        31 @[1731013131.347]
        31 @[1731013191.347]
        31 @[1731013251.347]
        31 @[1731013311.347]
        31 @[1731013371.347]
        31 @[1731013431.347]
        31 @[1731013491.347]
        31 @[1731013551.347]
        31 @[1731013611.347]
        31 @[1731013671.347]
        31 @[1731013731.347]
        31 @[1731013791.347]
        31 @[1731013851.347]
        31 @[1731013911.347]
        31 @[1731013971.347]
        31 @[1731014031.347]
        {job="test", series="0", status_code="502"} =>
...

More Context

It might be related to https://github.com/prometheus/prometheus/pull/14083, since Prometheus now always sorts matrix responses by labels.

SungJin1212 commented 1 week ago

@yeya24 In my local test:

query: bottomk by (job) (pi(), days_in_month(-{__name__="test_series_a"} offset 1m17s)): fail
query: topk by (job) (pi(), days_in_month(-{__name__="test_series_a"} offset 1m17s)): fail (topk case)
query: bottomk by (job) (pi(), -{__name__="test_series_a"} offset 1m17s): ok (days_in_month not included)

The values of the series returned by days_in_month(-{__name__="test_series_a"} offset 1m17s) are all the same (31). From my understanding, bottomk and topk seem to be non-deterministic when the input sample values (the 2nd argument) are all equal.