thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

Receiver: cache matchers for series calls #7353

Open pedro-stanaka opened 1 month ago

pedro-stanaka commented 1 month ago

Summary

We have tried caching matchers before with a time-based expiration cache; this time we are trying an LRU cache.

We saw some of our receivers busy compiling regexes, with high CPU usage, similar to the profile of the benchmark I added here:

[image: CPU profile of the benchmark]

Benchmark results

```
Result on store-proxy-cache-matchers
BenchmarkProxySeriesRegex-11    1545795    768.7 ns/op    1144 B/op    19 allocs/op
BenchmarkProxySeriesRegex-11    1548040    769.4 ns/op    1144 B/op    19 allocs/op
BenchmarkProxySeriesRegex-11    1545019    778.3 ns/op    1144 B/op    19 allocs/op
BenchmarkProxySeriesRegex-11    1539387    771.1 ns/op    1144 B/op    19 allocs/op

Result on main
BenchmarkProxySeriesRegex-11     130292     8803 ns/op   10288 B/op    78 allocs/op
BenchmarkProxySeriesRegex-11     124045     8533 ns/op   10288 B/op    78 allocs/op
BenchmarkProxySeriesRegex-11     125092     8712 ns/op   10288 B/op    78 allocs/op
BenchmarkProxySeriesRegex-11     120110     8676 ns/op   10288 B/op    78 allocs/op
```

The `store-proxy-cache-matchers` branch considerably outperforms `main` on every metric reported by `BenchmarkProxySeriesRegex`. It is roughly 11 times faster (about 770 ns/op versus about 8,700 ns/op), uses about 9 times less memory per operation (1144 B/op versus 10288 B/op), and makes about 4 times fewer allocations (19 versus 78 allocs/op). These improvements suggest that most of the cost on `main` comes from regex handling, which this branch avoids by caching matchers.
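For readers unfamiliar with the approach, here is a minimal sketch of the general idea of an LRU cache for compiled regexes. It is not the code in this PR: the names `regexLRU`, `newRegexLRU`, and `getOrCompile` are hypothetical, it uses only the Go standard library, and it caches plain `*regexp.Regexp` values rather than Thanos or Prometheus matcher types.

```go
package main

import (
	"container/list"
	"regexp"
	"sync"
)

// entry is stored in each list element. The pattern is kept so the map
// key can be deleted when the element is evicted.
type entry struct {
	pattern string
	re      *regexp.Regexp
}

// regexLRU is a hypothetical size-bounded cache of compiled regexes.
type regexLRU struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // pattern -> element in order
	hits     int
	misses   int
}

func newRegexLRU(capacity int) *regexLRU {
	if capacity < 1 {
		capacity = 1
	}
	return &regexLRU{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// getOrCompile returns the cached regex for pattern, compiling it on a miss.
// For simplicity the lock is held while compiling; a real implementation
// might prefer to compile outside the lock.
func (c *regexLRU) getOrCompile(pattern string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if el, ok := c.items[pattern]; ok {
		c.order.MoveToFront(el) // mark as most recently used
		c.hits++
		return el.Value.(*entry).re, nil
	}
	c.misses++

	// Anchor the pattern so it must match the whole value, which is how
	// Prometheus-style regex matchers behave.
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err // invalid patterns are not cached
	}
	c.items[pattern] = c.order.PushFront(&entry{pattern: pattern, re: re})

	// Evict the least recently used entry once over capacity.
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).pattern)
	}
	return re, nil
}
```

Unlike a time-based expiration cache, an LRU bounds memory by entry count and keeps frequently repeated patterns resident regardless of how long ago they were first compiled, which suits workloads where the same few regex matchers arrive on most series calls.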

Changes

Verification