Closed pedro-stanaka closed 6 months ago
Hm, do we want to track processed samples for each operator, or only for scanners?
I thought of adding it to operators because in some cases you lose the usage context, as in the Execution operator.
Maybe we can run benchmarks to see if counting each sample individually is going to have a significant perf penalty.
The change is pretty minimal; you can check the benchmarks in the PR description.
@fpetkovski do you have any extra comments/suggestions?
Nice work @fpetkovski @pedro-stanaka. Another question: can we also support the max samples limit on top of this feature?
Summary
It would be nice to get an idea of how many samples are being loaded to answer queries. In this PR I added information about loaded samples for each operator, using the existing Prometheus stats.QuerySamples model. This will allow the compatibilityQuery to play nicely with the upstream API and implement at least part of the Stats() method.
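To make the idea of per-operator sample tracking concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `sampleStats`, `vectorOperator`, `sliceScanner`, `sumOperator` and `runExample` are hypothetical names, and `sampleStats` is only a simplified stand-in loosely modelled on stats.QuerySamples (a running total plus a per-step breakdown). The sketch also assumes that an aggregation counts the samples it consumes rather than the ones it emits.

```go
package main

import "fmt"

// sampleStats is a simplified, hypothetical stand-in for the Prometheus
// stats.QuerySamples model: a running total plus a per-step breakdown.
type sampleStats struct {
	TotalSamples int64
	PerStep      []int64
}

func newSampleStats(steps int) *sampleStats {
	return &sampleStats{PerStep: make([]int64, steps)}
}

// incrementAtStep records n samples processed while evaluating one step.
func (s *sampleStats) incrementAtStep(step int, n int64) {
	s.TotalSamples += n
	s.PerStep[step] += n
}

// vectorOperator is a hypothetical, pared-down operator interface.
type vectorOperator interface {
	// Next returns the values for the next step, or false when exhausted.
	Next() ([]float64, bool)
	// Samples exposes the counters owned by this operator.
	Samples() *sampleStats
}

// sliceScanner plays the role of a scanner: it serves pre-loaded steps.
type sliceScanner struct {
	steps [][]float64
	pos   int
	stats *sampleStats
}

func (s *sliceScanner) Next() ([]float64, bool) {
	if s.pos >= len(s.steps) {
		return nil, false
	}
	out := s.steps[s.pos]
	s.stats.incrementAtStep(s.pos, int64(len(out)))
	s.pos++
	return out, true
}

func (s *sliceScanner) Samples() *sampleStats { return s.stats }

// sumOperator aggregates each step of its input into a single value and
// counts the samples it consumed (an assumption of this sketch),
// independently of the scanner below it.
type sumOperator struct {
	next  vectorOperator
	step  int
	stats *sampleStats
}

func (o *sumOperator) Next() ([]float64, bool) {
	in, ok := o.next.Next()
	if !ok {
		return nil, false
	}
	var total float64
	for _, v := range in {
		total += v
	}
	o.stats.incrementAtStep(o.step, int64(len(in)))
	o.step++
	return []float64{total}, true
}

func (o *sumOperator) Samples() *sampleStats { return o.stats }

// runExample evaluates a sum over two steps and returns both operators'
// counters together with the per-step results.
func runExample() (scan, sum *sampleStats, results []float64) {
	steps := [][]float64{{1, 2, 3}, {4, 5}}
	scanner := &sliceScanner{steps: steps, stats: newSampleStats(len(steps))}
	agg := &sumOperator{next: scanner, stats: newSampleStats(len(steps))}
	for {
		out, ok := agg.Next()
		if !ok {
			break
		}
		results = append(results, out[0])
	}
	return scanner.Samples(), agg.Samples(), results
}

func main() {
	scan, sum, results := runExample()
	fmt.Println("scanner samples:", scan.TotalSamples, scan.PerStep)
	fmt.Println("sum samples:", sum.TotalSamples, sum.PerStep)
	fmt.Println("results:", results)
}
```

Because every operator owns its counters, a caller can read them at any level of the plan, which is the context that would be lost if only scanners were instrumented. The counters are updated once per step with the batch size rather than once per sample, which is one plausible way to keep the overhead small.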
Bench results (against main)
`new.out` is `main`

```
goos: darwin
goarch: arm64
pkg: github.com/thanos-io/promql-engine/engine
                                                      │ benchmarks/new.out │      benchmarks/new_samples.out      │
                                                      │       sec/op       │    sec/op      vs base               │
RangeQuery/vector_selector-11                          12.05m ± ∞ ¹   11.54m ± ∞ ¹        ~ (p=0.095 n=5)
RangeQuery/sum-11                                      7.985m ± ∞ ¹   8.367m ± ∞ ¹        ~ (p=0.690 n=5)
RangeQuery/sum_by_pod-11                               13.27m ± ∞ ¹   12.91m ± ∞ ¹        ~ (p=0.222 n=5)
RangeQuery/topk-11                                     8.286m ± ∞ ¹   8.013m ± ∞ ¹   -3.30% (p=0.016 n=5)
RangeQuery/bottomk-11                                  7.959m ± ∞ ¹   8.168m ± ∞ ¹   +2.62% (p=0.008 n=5)
RangeQuery/rate-11                                     13.57m ± ∞ ¹   13.75m ± ∞ ¹        ~ (p=0.310 n=5)
RangeQuery/subquery-11                                 31.20m ± ∞ ¹   31.38m ± ∞ ¹        ~ (p=0.310 n=5)
RangeQuery/sum_rate-11                                 10.44m ± ∞ ¹   10.45m ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/sum_by_rate-11                              13.37m ± ∞ ¹   13.36m ± ∞ ¹        ~ (p=1.000 n=5)
RangeQuery/quantile_with_variable_parameter-11         27.61m ± ∞ ¹   27.98m ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/binary_operation_with_one_to_one-11         9.103m ± ∞ ¹   9.244m ± ∞ ¹   +1.55% (p=0.008 n=5)
RangeQuery/binary_operation_with_many_to_one-11        23.55m ± ∞ ¹   23.95m ± ∞ ¹   +1.69% (p=0.008 n=5)
RangeQuery/binary_operation_with_vector_and_scalar-11  16.17m ± ∞ ¹   16.17m ± ∞ ¹        ~ (p=0.841 n=5)
RangeQuery/unary_negation-11                           12.06m ± ∞ ¹   12.32m ± ∞ ¹   +2.15% (p=0.016 n=5)
RangeQuery/vector_and_scalar_comparison-11             16.43m ± ∞ ¹   16.64m ± ∞ ¹        ~ (p=0.222 n=5)
RangeQuery/positive_offset_vector-11                   11.06m ± ∞ ¹   11.31m ± ∞ ¹        ~ (p=0.421 n=5)
RangeQuery/at_modifier_-11                             8.102m ± ∞ ¹   8.197m ± ∞ ¹        ~ (p=0.841 n=5)
RangeQuery/at_modifier_with_positive_offset_vector-11  8.064m ± ∞ ¹   8.298m ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/clamp-11                                    15.22m ± ∞ ¹   15.58m ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/clamp_min-11                                13.61m ± ∞ ¹   15.73m ± ∞ ¹  +15.61% (p=0.016 n=5)
RangeQuery/complex_func_query-11                       19.33m ± ∞ ¹   20.76m ± ∞ ¹   +7.42% (p=0.008 n=5)
RangeQuery/func_within_func_query-11                   17.33m ± ∞ ¹   17.72m ± ∞ ¹        ~ (p=0.690 n=5)
RangeQuery/aggr_within_func_query-11                   17.75m ± ∞ ¹   17.68m ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/histogram_quantile-11                       79.68m ± ∞ ¹   73.22m ± ∞ ¹   -8.11% (p=0.016 n=5)
RangeQuery/sort-11                                     12.75m ± ∞ ¹   13.54m ± ∞ ¹        ~ (p=0.095 n=5)
RangeQuery/sort_desc-11                                12.90m ± ∞ ¹   13.61m ± ∞ ¹        ~ (p=0.690 n=5)
RangeQuery/absent_and_exists-11                        7.879m ± ∞ ¹   6.909m ± ∞ ¹        ~ (p=0.095 n=5)
RangeQuery/absent_and_doesnt_exist-11                  280.8µ ± ∞ ¹   273.9µ ± ∞ ¹        ~ (p=0.222 n=5)
NativeHistograms/selector-11                           91.13m ± ∞ ¹   86.82m ± ∞ ¹        ~ (p=0.222 n=5)
NativeHistograms/sum-11                                132.0m ± ∞ ¹   130.4m ± ∞ ¹        ~ (p=0.056 n=5)
NativeHistograms/rate-11                               119.1m ± ∞ ¹   113.0m ± ∞ ¹   -5.08% (p=0.008 n=5)
NativeHistograms/sum_rate-11                           158.0m ± ∞ ¹   151.8m ± ∞ ¹   -3.91% (p=0.016 n=5)
NativeHistograms/histogram_sum-11                      301.9m ± ∞ ¹   303.7m ± ∞ ¹        ~ (p=1.000 n=5)
NativeHistograms/histogram_count-11                    303.5m ± ∞ ¹   321.3m ± ∞ ¹   +5.87% (p=0.008 n=5)
NativeHistograms/histogram_quantile-11                 141.3m ± ∞ ¹   151.9m ± ∞ ¹        ~ (p=0.151 n=5)
NativeHistograms/histogram_scalar_binop-11             219.0m ± ∞ ¹   228.5m ± ∞ ¹   +4.37% (p=0.008 n=5)
geomean                                                21.81m         21.99m         +0.80%
¹ need >= 6 samples for confidence interval at level 0.95

                                                      │ benchmarks/new.out │      benchmarks/new_samples.out      │
                                                      │        B/op        │     B/op       vs base               │
RangeQuery/vector_selector-11                          25.56Mi ± ∞ ¹   25.58Mi ± ∞ ¹        ~ (p=0.151 n=5)
RangeQuery/sum-11                                      6.249Mi ± ∞ ¹   6.253Mi ± ∞ ¹        ~ (p=0.421 n=5)
RangeQuery/sum_by_pod-11                               13.23Mi ± ∞ ¹   13.24Mi ± ∞ ¹        ~ (p=0.841 n=5)
RangeQuery/topk-11                                     8.972Mi ± ∞ ¹   8.998Mi ± ∞ ¹   +0.29% (p=0.008 n=5)
RangeQuery/bottomk-11                                  8.961Mi ± ∞ ¹   8.983Mi ± ∞ ¹   +0.24% (p=0.008 n=5)
RangeQuery/rate-11                                     26.79Mi ± ∞ ¹   26.79Mi ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/subquery-11                                 29.52Mi ± ∞ ¹   29.51Mi ± ∞ ¹        ~ (p=0.310 n=5)
RangeQuery/sum_rate-11                                 9.358Mi ± ∞ ¹   9.262Mi ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/sum_by_rate-11                              17.53Mi ± ∞ ¹   17.56Mi ± ∞ ¹        ~ (p=0.056 n=5)
RangeQuery/quantile_with_variable_parameter-11         30.12Mi ± ∞ ¹   30.12Mi ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/binary_operation_with_one_to_one-11         14.20Mi ± ∞ ¹   14.21Mi ± ∞ ¹        ~ (p=1.000 n=5)
RangeQuery/binary_operation_with_many_to_one-11        34.66Mi ± ∞ ¹   34.72Mi ± ∞ ¹        ~ (p=1.000 n=5)
RangeQuery/binary_operation_with_vector_and_scalar-11  30.69Mi ± ∞ ¹   30.70Mi ± ∞ ¹        ~ (p=0.310 n=5)
RangeQuery/unary_negation-11                           28.17Mi ± ∞ ¹   28.16Mi ± ∞ ¹        ~ (p=1.000 n=5)
RangeQuery/vector_and_scalar_comparison-11             30.16Mi ± ∞ ¹   30.15Mi ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/positive_offset_vector-11                   26.27Mi ± ∞ ¹   26.28Mi ± ∞ ¹        ~ (p=0.151 n=5)
RangeQuery/at_modifier_-11                             22.67Mi ± ∞ ¹   22.67Mi ± ∞ ¹        ~ (p=0.151 n=5)
RangeQuery/at_modifier_with_positive_offset_vector-11  22.48Mi ± ∞ ¹   22.48Mi ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/clamp-11                                    27.76Mi ± ∞ ¹   27.78Mi ± ∞ ¹        ~ (p=0.095 n=5)
RangeQuery/clamp_min-11                                27.73Mi ± ∞ ¹   27.75Mi ± ∞ ¹        ~ (p=0.421 n=5)
RangeQuery/complex_func_query-11                       31.34Mi ± ∞ ¹   31.42Mi ± ∞ ¹        ~ (p=0.095 n=5)
RangeQuery/func_within_func_query-11                   29.13Mi ± ∞ ¹   29.14Mi ± ∞ ¹   +0.03% (p=0.008 n=5)
RangeQuery/aggr_within_func_query-11                   29.14Mi ± ∞ ¹   29.14Mi ± ∞ ¹        ~ (p=0.841 n=5)
RangeQuery/histogram_quantile-11                       48.97Mi ± ∞ ¹   48.95Mi ± ∞ ¹        ~ (p=1.000 n=5)
RangeQuery/sort-11                                     27.11Mi ± ∞ ¹   27.12Mi ± ∞ ¹        ~ (p=0.548 n=5)
RangeQuery/sort_desc-11                                27.10Mi ± ∞ ¹   27.11Mi ± ∞ ¹        ~ (p=0.095 n=5)
RangeQuery/absent_and_exists-11                        8.581Mi ± ∞ ¹   8.151Mi ± ∞ ¹   -5.02% (p=0.016 n=5)
RangeQuery/absent_and_doesnt_exist-11                  570.9Ki ± ∞ ¹   570.8Ki ± ∞ ¹        ~ (p=0.841 n=5)
NativeHistograms/selector-11                           415.0Mi ± ∞ ¹   415.0Mi ± ∞ ¹        ~ (p=0.690 n=5)
NativeHistograms/sum-11                                397.4Mi ± ∞ ¹   397.4Mi ± ∞ ¹        ~ (p=0.310 n=5)
NativeHistograms/rate-11                               370.7Mi ± ∞ ¹   370.7Mi ± ∞ ¹        ~ (p=0.421 n=5)
NativeHistograms/sum_rate-11                           353.2Mi ± ∞ ¹   353.2Mi ± ∞ ¹        ~ (p=0.548 n=5)
NativeHistograms/histogram_sum-11                      416.3Mi ± ∞ ¹   416.4Mi ± ∞ ¹        ~ (p=0.421 n=5)
NativeHistograms/histogram_count-11                    416.3Mi ± ∞ ¹   416.3Mi ± ∞ ¹        ~ (p=1.000 n=5)
NativeHistograms/histogram_quantile-11                 411.7Mi ± ∞ ¹   397.5Mi ± ∞ ¹        ~ (p=0.222 n=5)
NativeHistograms/histogram_scalar_binop-11             581.8Mi ± ∞ ¹   582.1Mi ± ∞ ¹        ~ (p=0.056 n=5)
geomean                                                37.24Mi         37.16Mi         -0.22%
¹ need >= 6 samples for confidence interval at level 0.95

                                                      │ benchmarks/new.out │      benchmarks/new_samples.out      │
                                                      │     allocs/op      │   allocs/op    vs base               │
RangeQuery/vector_selector-11                          49.09k ± ∞ ¹   49.10k ± ∞ ¹        ~ (p=0.056 n=5)
RangeQuery/sum-11                                      47.61k ± ∞ ¹   47.61k ± ∞ ¹        ~ (p=0.730 n=5)
RangeQuery/sum_by_pod-11                               66.48k ± ∞ ¹   66.48k ± ∞ ¹        ~ (p=0.460 n=5)
RangeQuery/topk-11                                     44.76k ± ∞ ¹   44.77k ± ∞ ¹   +0.03% (p=0.016 n=5)
RangeQuery/bottomk-11                                  44.75k ± ∞ ¹   44.76k ± ∞ ¹        ~ (p=0.508 n=5)
RangeQuery/rate-11                                     64.11k ± ∞ ¹   64.09k ± ∞ ¹   -0.03% (p=0.032 n=5)
RangeQuery/subquery-11                                 84.40k ± ∞ ¹   84.38k ± ∞ ¹   -0.03% (p=0.008 n=5)
RangeQuery/sum_rate-11                                 92.35k ± ∞ ¹   92.33k ± ∞ ¹        ~ (p=0.841 n=5)
RangeQuery/sum_by_rate-11                              112.1k ± ∞ ¹   112.2k ± ∞ ¹   +0.06% (p=0.008 n=5)
RangeQuery/quantile_with_variable_parameter-11         450.5k ± ∞ ¹   450.6k ± ∞ ¹   +0.02% (p=0.032 n=5)
RangeQuery/binary_operation_with_one_to_one-11         63.85k ± ∞ ¹   64.00k ± ∞ ¹   +0.23% (p=0.008 n=5)
RangeQuery/binary_operation_with_many_to_one-11        121.0k ± ∞ ¹   121.3k ± ∞ ¹   +0.21% (p=0.008 n=5)
RangeQuery/binary_operation_with_vector_and_scalar-11  93.83k ± ∞ ¹   93.90k ± ∞ ¹   +0.07% (p=0.008 n=5)
RangeQuery/unary_negation-11                           92.54k ± ∞ ¹   92.60k ± ∞ ¹   +0.06% (p=0.008 n=5)
RangeQuery/vector_and_scalar_comparison-11             84.81k ± ∞ ¹   84.86k ± ∞ ¹   +0.05% (p=0.008 n=5)
RangeQuery/positive_offset_vector-11                   69.03k ± ∞ ¹   69.08k ± ∞ ¹   +0.07% (p=0.008 n=5)
RangeQuery/at_modifier_-11                             54.39k ± ∞ ¹   54.41k ± ∞ ¹   +0.03% (p=0.008 n=5)
RangeQuery/at_modifier_with_positive_offset_vector-11  48.39k ± ∞ ¹   48.41k ± ∞ ¹   +0.04% (p=0.008 n=5)
RangeQuery/clamp-11                                    93.37k ± ∞ ¹   93.45k ± ∞ ¹   +0.09% (p=0.008 n=5)
RangeQuery/clamp_min-11                                92.95k ± ∞ ¹   93.02k ± ∞ ¹   +0.08% (p=0.008 n=5)
RangeQuery/complex_func_query-11                       103.8k ± ∞ ¹   103.9k ± ∞ ¹   +0.08% (p=0.016 n=5)
RangeQuery/func_within_func_query-11                   108.6k ± ∞ ¹   108.6k ± ∞ ¹   +0.04% (p=0.008 n=5)
RangeQuery/aggr_within_func_query-11                   108.6k ± ∞ ¹   108.6k ± ∞ ¹   +0.02% (p=0.008 n=5)
RangeQuery/histogram_quantile-11                       587.8k ± ∞ ¹   587.7k ± ∞ ¹   -0.02% (p=0.008 n=5)
RangeQuery/sort-11                                     83.51k ± ∞ ¹   83.58k ± ∞ ¹   +0.08% (p=0.008 n=5)
RangeQuery/sort_desc-11                                83.50k ± ∞ ¹   83.57k ± ∞ ¹   +0.09% (p=0.008 n=5)
RangeQuery/absent_and_exists-11                        77.45k ± ∞ ¹   77.25k ± ∞ ¹        ~ (p=0.056 n=5)
RangeQuery/absent_and_doesnt_exist-11                  2.872k ± ∞ ¹   2.868k ± ∞ ¹   -0.14% (p=0.008 n=5)
NativeHistograms/selector-11                           5.197M ± ∞ ¹   5.197M ± ∞ ¹        ~ (p=0.690 n=5)
NativeHistograms/sum-11                                5.193M ± ∞ ¹   5.193M ± ∞ ¹        ~ (p=1.000 n=5)
NativeHistograms/rate-11                               7.558M ± ∞ ¹   7.558M ± ∞ ¹        ~ (p=1.000 n=5)
NativeHistograms/sum_rate-11                           7.554M ± ∞ ¹   7.554M ± ∞ ¹   -0.00% (p=0.008 n=5)
NativeHistograms/histogram_sum-11                      5.207M ± ∞ ¹   5.207M ± ∞ ¹        ~ (p=0.841 n=5)
NativeHistograms/histogram_count-11                    5.207M ± ∞ ¹   5.207M ± ∞ ¹        ~ (p=0.690 n=5)
NativeHistograms/histogram_quantile-11                 5.263M ± ∞ ¹   5.194M ± ∞ ¹        ~ (p=0.151 n=5)
NativeHistograms/histogram_scalar_binop-11             8.882M ± ∞ ¹   8.882M ± ∞ ¹        ~ (p=0.421 n=5)
geomean                                                204.5k         204.5k         -0.01%
¹ need >= 6 samples for confidence interval at level 0.95
```