elastic / apm-server

https://www.elastic.co/guide/en/apm/guide/current/index.html

Collect APM Server throughput numbers #7842

Closed simitt closed 1 year ago

simitt commented 2 years ago

The goal is to replace the current numbers on processing and performance for APM Server.

We want to start by collecting numbers for the following scenarios:

This would lead to a total of 12 benchmark tests (3 provider-templates x 4 sizes).

In addition, we want to collect numbers for OpenTelemetry (OTel) events. We should capture the performance on at least one of the cloud service providers (CSPs).

This is the groundwork for an updated sizing guide (https://github.com/elastic/apm-server/issues/7840).

simitt commented 1 year ago

Updated the description to contain a concrete setup for the benchmarks.

dmathieu commented 1 year ago

We are going to run those benchmarks in Jenkins, which allows us to trigger them in parallel without needing to keep a laptop from sleeping. The process and detailed results are described in this document.

simitt commented 1 year ago

Labeled this as blocked until after 8.9.0 has been released. The benchmarks should include the protobuf changes, as they have a relevant impact on performance. Tests should be carried out in a stable environment, so we prefer to run them in production. CFT regions do not exist for AWS and Azure, so we need to wait until the new version is publicly released to be able to test against all three cloud providers.

dmathieu commented 1 year ago

The benchmarks have been run, and the results are recorded in the document linked in my previous comment. The default system profiles have been updated to match good values here: https://github.com/elastic/apm-server/pull/11600

Once that PR is merged, I believe this issue can be closed.