cortexproject / cortex

A horizontally scalable, highly available, multi-tenant, long term Prometheus.
https://cortexmetrics.io/
Apache License 2.0

Automated performance benchmarks for Cortex #5107

Open yeya24 opened 1 year ago

yeya24 commented 1 year ago

Is your feature request related to a problem? Please describe. Right now, we don't have a way to identify and prevent performance regressions in Cortex, because we have no way to track and compare performance metrics between different releases or commits.

Describe the solution you'd like The Prometheus community has https://github.com/prometheus/test-infra, which runs benchmarks for Prometheus against a specific PR or release and visualizes performance metrics via Grafana. It supports prombench, a macro benchmark, and funcbench, a micro benchmark built on Go benchmark tests. It can be triggered by a bot, and the bot will reply with more detailed information: https://github.com/prometheus/prometheus/pull/11833#issuecomment-1377125745.
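For anyone unfamiliar with funcbench, the micro benchmarks it runs are just standard Go benchmark tests. A minimal illustrative sketch (the benchmark name and body below are made up for this example, not existing Cortex code):

```go
package bench

import (
	"fmt"
	"testing"
)

// BenchmarkFormatSeriesLabels is a hypothetical micro benchmark; it only
// shows the shape of a Go benchmark that a funcbench-style tool can run.
func BenchmarkFormatSeriesLabels(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = fmt.Sprintf(`{job=%q, instance=%q}`, "cortex", "ingester-0")
	}
}
```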

It would be neat to have similar benchmarks in Cortex, and we could probably reuse some of that code.

Additional context Similar to https://github.com/thanos-io/thanos/issues/5764

yeya24 commented 1 year ago

We don't need to support all of prombench's functionality from the beginning. It would be nice to start with nightly benchmarks first and then make it more flexible later. A long data retention time for comparison would be great. A good UI is also something we want to have, for example https://perf.databend.rs/.
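As a rough sketch of the nightly idea (everything here is hypothetical, not an existing Cortex tool): a small driver could run Go benchmarks programmatically and emit timestamped results that a long-retention store and UI could later consume.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

// benchmarkSample stands in for whatever workload the nightly job measures.
func benchmarkSample(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = fmt.Sprintf("series-%d", i)
	}
}

func main() {
	// testing.Benchmark runs a benchmark outside of `go test` and returns
	// its result, which the nightly job could timestamp and persist.
	r := testing.Benchmark(benchmarkSample)
	fmt.Printf("%s ns_per_op=%d allocs_per_op=%d\n",
		time.Now().UTC().Format(time.RFC3339), r.NsPerOp(), r.AllocsPerOp())
}
```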

anonymousr007 commented 1 year ago

Hello @yeya24,

I am interested in this project. Can you please tell me the prerequisites other than Go and Kubernetes? I have intermediate experience in Go and beginner experience in Kubernetes.

Best Regards Rishabh Singh

AnirudhBot commented 1 year ago

Hey @yeya24, I find this particular issue really interesting as I have been exploring service monitoring lately. I would love to be a valued contributor to this issue as an LFX mentee, and eventually many more! Looking forward :)

yeya24 commented 1 year ago

@anonymousr007 Prometheus experience will be great to have!

anonymousr007 commented 1 year ago

@anonymousr007 Prometheus experience will be great to have!

Hello @yeya24, I don't have Prometheus experience yet, but I am ready to learn, test, and implement. Thank you.

Akash3121 commented 1 year ago

Hi @yeya24,

Prometheus experience will be great to have!

I'm willing to learn but haven't been able to find resources. Could you please suggest some? It would be really helpful. Thank you!

epompeii commented 1 year ago

I've been working on a continuous benchmarking tool to solve this exact problem, Bencher: https://github.com/bencherdev/bencher. Using it should be easier than trying to repurpose Prometheus's solution, and it seems to check all of the boxes you put forth, @yeya24.

yeya24 commented 1 year ago

Thanks @epompeii. I am aware of such tools, but they are for microbenchmarks only, which means macrobenchmarks are not covered.

epompeii commented 1 year ago

but they are for microbenchmarks only, which means macrobenchmarks are not covered.

I'm not sure what you mean by this. If it is in regard to the supported benchmark harnesses, then yes, you are correct that they tend to be for microbenchmarks. However, Bencher supports any benchmark, including macrobenchmarks, as long as the data can be output in the expected JSON format (https://bencher.dev/docs/explanation/adapters).
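To make the macrobenchmark point concrete, a custom benchmark can simply print its results as JSON and hand them to an adapter. The shape below is only illustrative; the exact format Bencher expects is described in the adapter docs linked above.

```go
package main

import (
	"encoding/json"
	"os"
)

func main() {
	// Hypothetical end-to-end measurement, e.g. remote-writing a block of
	// samples to a test Cortex cluster and querying them back.
	results := map[string]map[string]float64{
		"e2e/remote_write_then_query": {
			"latency_ms": 1250.0, // placeholder value, not a real measurement
		},
	}
	// Emit the results as JSON for an external adapter to ingest.
	_ = json.NewEncoder(os.Stdout).Encode(results)
}
```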

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.