I suggest the following course:
Having these two behind us will make it much easier to decide what and how to benchmark. They can be worked on in parallel.
We have a growing suite of benchmarks and collect data from our testnets. I think I can close this issue.
We want Deku to be fast. But first, we have to define what that means.
Theoretical TPS
The theoretical maximum TPS depends heavily on the choice of consensus protocol (see #242), so TPS should be a strong consideration in that choice. This issue thus depends on at least settling on a design for #242.
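As a rough illustration of what "theoretical TPS" means here (the numbers below are placeholders, not measured or proposed Deku parameters):

```ocaml
(* Back-of-envelope only: both parameters are placeholder values,
   not proposed Deku parameters. *)
let theoretical_tps ~txs_per_block ~block_time_seconds =
  float_of_int txs_per_block /. block_time_seconds

let () =
  (* e.g. 10_000 transactions per block, one block every 2 s -> 5000 TPS *)
  Printf.printf "%.0f TPS\n"
    (theoretical_tps ~txs_per_block:10_000 ~block_time_seconds:2.0)
```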
But theoretical TPS is not everything. There are many other facets of performance. To name a few:
Some of these can be considered independently of the consensus protocol, but for the most part, to quote @AngryStitch, "everything parameterizes everything else".
Implementation Performance
Besides the theoretical maximum, the actual performance of the chain depends on its implementation. This can be measured with testing.
Microbenchmarks
https://github.com/janestreet/core_bench is a library for writing microbenchmarks: benchmarks of single functions executed many times to generate execution-time and memory-management statistics. After identifying our key bottleneck functions (from first principles, or with profiling tools such as `perf`), we should write benchmarks for those functions using `core_bench`. Additionally, we should set up CI to run them automatically and generate reports for us to analyze. I have done background work towards this end for https://gitlab.com/tezos/tezos/-/merge_requests/2439 that perhaps we can re-use.
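For illustration, a `core_bench` microbenchmark is just a small executable along these lines; `sign_operation` here is a hypothetical stand-in for whatever hot function we identify, not an actual Deku function:

```ocaml
open Core
open Core_bench

(* Hypothetical bottleneck function, purely for illustration. *)
let sign_operation payload = String.rev payload

let () =
  (* [Command_unix.run] is [Command.run] on older Core versions. *)
  Command_unix.run
    (Bench.make_command
       [ Bench.Test.create ~name:"sign_operation" (fun () ->
             sign_operation "example payload") ])
```

Running the executable prints a table of per-call time and allocation estimates, which is what we would want CI to collect over time.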
Trace benchmarking
It would be great to be able to generate execution traces and benchmark the replay of those traces. The reason is that microbenchmarks, while useful for optimizing specific functions, don't capture how those functions compose, and are thus not perfectly realistic.
When #116 is done, this will be a much easier task. Whether or not we can use `core_bench` for trace benchmarking depends on the length of traces and the performance of our interpreter - `core_bench` is specialized to very small executions.
Real node testing
Ideally, we would have automation for deploying networks of nodes and benchmarking their performance under varying conditions. This is the only way to approximate real-world performance, as isolated unit tests lose too many variables.