Maybe also something we want to take a look at during the hackweek.
It makes sense, but we would only be optimizing for a certain setup, e.g. our benchmark; this doesn't necessarily benefit users. Users may still need to optimize for their own setup.
@falko do you have any input on this based on your experiences?
One round of benchmarks done with the default 32 kb payload @ 150 PI/s; Grafana link for all scenarios: Dashboard
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendsPerFollower=6 --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=64kb --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendsPerFollower=6 --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=128KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendsPerFollower=6 --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=16KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendsPerFollower=6 --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=32KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark
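For convenience, the same sweep could also be scripted as a single loop over the tested batch sizes; a minimal sketch, assuming the release in $BENCHMARK_NAME was already installed as in the commands above:

```sh
# Sketch: upgrade the existing benchmark release once per batch size under test.
# Let each configuration run long enough to collect metrics before the next upgrade.
for size in 16KB 32KB 64KB 128KB; do
  helm upgrade --reuse-values \
    --set zeebe.config.zeebe.broker.experimental.maxAppendsPerFollower=6 \
    --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize="$size" \
    --namespace "$BENCHMARK_NAME" "$BENCHMARK_NAME" zeebe-benchmark/zeebe-benchmark
done
```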
Backpressure and processing queue size improve with bigger batch sizes. All other metrics look stable (memory & GC as well). Messaging latency degrades a bit, but that's expected considering we are sending payloads 2x-4x bigger.
The smaller batch size (16KB) is the worst of all on many metrics, even though I expected it to behave similarly to 32KB because the payload in this scenario is 32KB.
It may be fruitful to run the realistic benchmark with these configurations (32KB, 64KB, 128KB) as well.
Based on this benchmark configuration, I would suggest bumping the default to 64KB or 128KB.
Started on 28/10/2025 at 09:50:
helm install --create-namespace --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark -f values-realistic-benchmark.yaml
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendsPerFollower=6 --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=32KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark -f values-realistic-benchmark.yaml
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendsPerFollower=6 --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=64KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark -f values-realistic-benchmark.yaml
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendsPerFollower=6 --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=128KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark -f values-realistic-benchmark.yaml
In this scenario as well, bigger batch sizes seem to improve "Backpressure" & "Processing queue size". It also seems that more processes are completed (even if it's quite noisy, so I'm not sure if it's real):
Based on this benchmark configuration, I would suggest bumping the default to 64KB or 128KB.
The above benchmarks were run before this fix was added to the helm charts: https://github.com/zeebe-io/benchmark-helm/pull/202, so it's probably better to rerun them.
Started on 30/10/2025 at 16:25: Dashboard
helm install --create-namespace --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark -f values-realistic-benchmark.yaml
NOTE: maxAppendsPerFollower is not needed anymore, as it's now the default value.
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=32KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark -f values-realistic-benchmark.yaml
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=64KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark -f values-realistic-benchmark.yaml
helm upgrade --reuse-values --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize=128KB --namespace $BENCHMARK_NAME $BENCHMARK_NAME zeebe-benchmark/zeebe-benchmark -f values-realistic-benchmark.yaml
Increasing the MAX_BATCH_SIZE increases throughput by around 20%. This comes at the expense of increased latency (as expected) and more CPU usage, as the broker appears to be processing more data.
Unfortunately, with the additional CPU usage the container starts throttling more frequently, exacerbating the latency degradation, so it would be better to repeat the run with more CPU dedicated to see how latency behaves.
For example, the process instance execution time increases a lot, even though the rate of completed process instances is higher than the baseline.
Another interesting metric is the amount of data written to disk, which looks much higher than before.
:question: Is the load on the system constant or does it depend on how much throughput zeebe is able to process?
If the latter is true, then we're not really comparing the same setup, so it's expected to see such different numbers.
:question: However, it still poses some questions on whether zeebe is handling the additional load gracefully or with a different behaviour (such as timing out instances?)
:thought_balloon: If zeebe is behaving correctly, increasing the buffer size to 64KB is probably sensible, while keeping in mind that in the future we may bump it to 128KB
None of these runs were able to complete process instances in time, so I'd expect there's an underlying issue that unfortunately invalidates the benchmark results.
That's what I feared. The baseline was having the same issue as well (even more so, actually).
Given these results I think it's even more important to repeat them.
@entangled90 please make sure to run helm repo update to get the latest changes (recent version with fixes).
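For example, a minimal sketch (the repo name is assumed from the chart reference used in the commands above):

```sh
# Refresh the local chart cache and check which chart version would be installed
helm repo update zeebe-benchmark
helm search repo zeebe-benchmark/zeebe-benchmark --versions | head -n 5
```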
Thanks @Zelldon, I was using an outdated version of the benchmark without your fix.
This time I've run them in parallel in different namespaces cs-max-batch-${batch-size}, roughly as sketched below.
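A minimal sketch of how such a parallel run could be launched (the exact namespace naming and the reuse of values-realistic-benchmark.yaml are assumptions based on the commands above):

```sh
# Sketch: one benchmark release per batch size, each in its own namespace, running in parallel.
for size in 32KB 64KB 128KB; do
  ns="cs-max-batch-$(echo "$size" | tr '[:upper:]' '[:lower:]')"  # e.g. cs-max-batch-32kb (assumed naming)
  helm install --create-namespace --namespace "$ns" "$ns" zeebe-benchmark/zeebe-benchmark \
    -f values-realistic-benchmark.yaml \
    --set zeebe.config.zeebe.broker.experimental.maxAppendBatchSize="$size"
done
```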
Here are the updated results:
I see that the two rates now match (50 PI/s per partition = 150 in total) and the load is handled in all three scenarios. However, when the buffer size is increased, I see that the number of jobs pushed & timed out is much higher.
:question: That may be related to the increased CPU usage & generally worse performance. I'm not sure whether this is a cause or an effect of the increased CPU usage, but I wonder if we expected this metric to increase?
:question: Could it be a situation where a timeout is now breached more frequently, causing an "avalanche" effect that adds more load, which then increases the number of timeouts?
I see that the number of requests handled by the gateway is completely off in the 64KB benchmark.
For the other two benchmarks the frequency of "Published Requests" is between:
After a review with @npepinpe, we noticed that the "Number of job push failed" metric is too high, even in the base case. It's probably better to reconfigure the workers' resources so that they are able to handle the load (in the base case); see the sketch below.
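A rough sketch of what that reconfiguration could look like; note that the worker.* value keys below are hypothetical and must be checked against the benchmark chart's values.yaml:

```sh
# Sketch: give the workers more headroom so they can keep up with the base load.
# NOTE: the worker.* keys are hypothetical placeholders, not confirmed chart values.
helm upgrade --reuse-values \
  --set worker.replicas=5 \
  --set worker.resources.limits.cpu=2 \
  --set worker.resources.requests.cpu=2 \
  --namespace "$BENCHMARK_NAME" "$BENCHMARK_NAME" zeebe-benchmark/zeebe-benchmark \
  -f values-realistic-benchmark.yaml
```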
Benchmarks will be rerun after https://github.com/zeebe-io/benchmark-helm/pull/204 is merged
Make sure to create a new release first :) https://github.com/zeebe-io/benchmark-helm/blob/main/RELEASE.md
Reran the benchmarks:
Benchmarks are now much more stable. Backpressure is almost always 0% for all configurations. Observations:
CPU usage increases slightly when increasing the buffer size.
We may look into this, as bigger batches can potentially be more "efficient", thus reducing CPU usage.
Processing queue size greatly improves when using 128KB buffers, while there's no noticeable difference between 32KB and 64KB.
Processing > number of records not exported: almost zero with 128KB, while always greater than 0 in the 32KB & 64KB configurations. ~~On the 64KB configuration it looks worse.
The Elasticsearch exporter flush duration is worse in the 64KB case compared to the other two. The CPU usage of the Elasticsearch pod is also worse.
:wrench: My expectation would be that it should not be that different from the other two values (considering 64KB sits in between).~~
Benchmark repeated for 64KB: it confirms that 64KB is a bit better than 32KB on processing queue size as well.
Memory GC > Bigger batch sizes slightly increase the frequency and duration of GC. This is consistent with the increased CPU usage.
In general I think bigger batches can improve performance with slight increases in CPU usage. I will repeat the test with 64KB, as it looks strange.
[!NOTE] The test was repeated and confirms the above conclusions.
:wrench: We may look into optimizing the CPU usage/memory allocation with bigger batches, to see if we can make it more efficient.
Looking at overall processing throughput and processing latency, I did not really notice an improvement in any of the benchmarks compared to the baseline. That said, it's possible we aren't pushing enough load to see if there are any improvements on those metrics.
While we see an improvement in the processing queue size, it did not really translate into actual improvements for the user. I also couldn't confirm the "number of records not exported is almost zero" part. It looks like there are still some? That would be a marked improvement for users, as it means less delay when it comes to seeing new data in the web apps.
If we look at section 1, we see this is true - close to 0. But then section 2 shows the normal behavior. Are these separate runs?
No, they are the same run. I have tried to understand what happens at 09:10, but I could not find an explanation.
Almost every metric has a dip at around 09:10:
Given that the improvements in the metrics are small and come with increased CPU usage, it's probably better to keep the current default value. It's possible that with a different workload a different value for this setting may be beneficial (if more CPU usage is acceptable for the customer, for example).
Description
The MAX_BATCH_SIZE is used to limit how much data is sent as a batch to a follower. Currently it is static and set to 32 kb. It is likely that this configuration can help to improve throughput. We should evaluate and benchmark different values to get a better understanding of how it impacts performance and what a good value would be.
Ideally we add new metrics which show how much data is sent per append, to see whether the max batch size is really reached or whether we already hit other limits.
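For reference, a minimal sketch of how this setting can be overridden on a broker (the environment-variable name is assumed from the usual mapping of zeebe.broker.experimental.maxAppendBatchSize and should be verified against the configuration docs):

```sh
# Sketch: override the experimental max append batch size for a single broker.
# Env var name assumed from the standard config-to-environment mapping; verify before use.
export ZEEBE_BROKER_EXPERIMENTAL_MAXAPPENDBATCHSIZE=64KB
# ...then start the broker as usual.
```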
Blocked by https://github.com/zeebe-io/zeebe/issues/5472