On a dockerized crux-quorum with 4 nodes.
Surprise: web3 turned out to be a huge bottleneck now!
When using direct RPC calls instead of web3 transaction calls, I see considerable TPS improvements (up from today's previous record of ~273 TPS):
initially over 450 TPS !!!
(But only during the first ~14,000 transactions; then it drops to ~270 TPS, mysteriously. Any ideas, anyone?)
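For reference, the two submission paths being compared look roughly like this. This is only a sketch: the endpoint URL, sender account, and gas value are assumptions, and the web3.py method name differs between versions (sendTransaction vs. send_transaction):

```python
# Sketch only: a web3.py call vs. a hand-built JSON-RPC call to the same node.
import json
import requests
from web3 import Web3

RPC_URL = "http://localhost:22000"      # assumed Quorum node endpoint
w3 = Web3(Web3.HTTPProvider(RPC_URL))
sender = w3.eth.accounts[0]

# Path 1: through the web3 convenience layer
w3.eth.sendTransaction({"from": sender, "to": sender, "gas": 21000, "value": 0})

# Path 2: raw JSON-RPC, skipping web3's formatting and validation overhead
payload = {"jsonrpc": "2.0", "id": 1, "method": "eth_sendTransaction",
           "params": [{"from": sender, "to": sender,
                       "gas": hex(21000), "value": "0x0"}]}
requests.post(RPC_URL, data=json.dumps(payload),
              headers={"Content-Type": "application/json"})
```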
Hey @drandreaskrueger, looks great. As to the drop-off at 14k txns, since you are already tinkering with the CLI options for geth, please look into these as well:
PERFORMANCE TUNING OPTIONS:
--cache value Megabytes of memory allocated to internal caching (default: 1024)
--cache.database value Percentage of cache memory allowance to use for database io (default: 75)
--cache.gc value Percentage of cache memory allowance to use for trie pruning (default: 25)
--trie-cache-gens value Number of trie node generations to keep in memory (default: 120)
These are from https://github.com/ethereum/go-ethereum/wiki/Command-Line-Options. Also, for the report, it might be good to keep track of queued txns as well.
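A rough way to keep track of queued txns during a run is to poll geth's txpool_status RPC method. A sketch (the endpoint URL is an assumption, and the txpool namespace must be exposed on the node's RPC interface):

```python
# Rough sketch: poll txpool_status once per second and print pending/queued counts.
import time
import requests

RPC_URL = "http://localhost:22000"   # assumed node endpoint

def txpool_status():
    payload = {"jsonrpc": "2.0", "id": 1,
               "method": "txpool_status", "params": []}
    result = requests.post(RPC_URL, json=payload).json()["result"]
    return int(result["pending"], 16), int(result["queued"], 16)

while True:
    pending, queued = txpool_status()
    print("pending: %6d   queued: %6d" % (pending, queued))
    time.sleep(1)
```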
Thanks a lot.
I have now tried
--cache 4096 --trie-cache-gens 1000
but no change in behavior. Sudden TPS drop around 14k transactions; look at TPS_current:
block 108 | new #TX 415 / 1000 ms = 415.0 TPS_current | total: #TX 9503 / 22.4 s = 424.9 TPS_average
block 109 | new #TX 437 / 1000 ms = 437.0 TPS_current | total: #TX 9940 / 23.3 s = 426.4 TPS_average
block 110 | new #TX 516 / 1000 ms = 516.0 TPS_current | total: #TX 10456 / 24.6 s = 425.7 TPS_average
block 111 | new #TX 509 / 1000 ms = 509.0 TPS_current | total: #TX 10965 / 25.2 s = 434.6 TPS_average
block 112 | new #TX 411 / 1000 ms = 411.0 TPS_current | total: #TX 11376 / 26.2 s = 434.3 TPS_average
block 113 | new #TX 480 / 1000 ms = 480.0 TPS_current | total: #TX 11856 / 27.4 s = 432.0 TPS_average
block 114 | new #TX 509 / 1000 ms = 509.0 TPS_current | total: #TX 12365 / 28.4 s = 435.4 TPS_average
block 115 | new #TX 381 / 1000 ms = 381.0 TPS_current | total: #TX 12746 / 29.1 s = 438.7 TPS_average
block 116 | new #TX 411 / 1000 ms = 411.0 TPS_current | total: #TX 13157 / 30.3 s = 434.3 TPS_average
block 117 | new #TX 482 / 1000 ms = 482.0 TPS_current | total: #TX 13639 / 31.3 s = 436.1 TPS_average
block 118 | new #TX 507 / 1000 ms = 507.0 TPS_current | total: #TX 14146 / 32.5 s = 434.7 TPS_average
block 119 | new #TX 250 / 1000 ms = 250.0 TPS_current | total: #TX 14396 / 33.2 s = 433.7 TPS_average
block 120 | new #TX 211 / 1000 ms = 211.0 TPS_current | total: #TX 14607 / 34.1 s = 427.9 TPS_average
block 121 | new #TX 282 / 1000 ms = 282.0 TPS_current | total: #TX 14889 / 35.4 s = 420.8 TPS_average
block 122 | new #TX 288 / 1000 ms = 288.0 TPS_current | total: #TX 15177 / 36.3 s = 417.7 TPS_average
block 123 | new #TX 294 / 1000 ms = 294.0 TPS_current | total: #TX 15471 / 37.0 s = 418.1 TPS_average
block 124 | new #TX 280 / 1000 ms = 280.0 TPS_current | total: #TX 15751 / 38.3 s = 411.6 TPS_average
block 125 | new #TX 256 / 1000 ms = 256.0 TPS_current | total: #TX 16007 / 39.2 s = 408.1 TPS_average
block 126 | new #TX 251 / 1000 ms = 251.0 TPS_current | total: #TX 16258 / 40.2 s = 404.4 TPS_average
block 127 | new #TX 282 / 1000 ms = 282.0 TPS_current | total: #TX 16540 / 41.2 s = 401.7 TPS_average
block 128 | new #TX 288 / 1000 ms = 288.0 TPS_current | total: #TX 16828 / 42.4 s = 396.6 TPS_average
block 129 | new #TX 220 / 1000 ms = 220.0 TPS_current | total: #TX 17048 / 43.4 s = 393.1 TPS_average
block 130 | new #TX 277 / 1000 ms = 277.0 TPS_current | total: #TX 17325 / 44.3 s = 391.0 TPS_average
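(For context, numbers like these can be derived by polling the node for new blocks and counting their transactions. The sketch below only illustrates that idea; it is not the chainhammer measurement code itself. The endpoint URL is an assumption, block timestamps are assumed to be whole seconds, and newer web3.py versions rename blockNumber/getBlock to block_number/get_block.)

```python
# Illustrative only: count transactions per new block and print running TPS.
import time
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:22000"))   # assumed endpoint

start = time.time()
total_txs = 0
last_block = w3.eth.blockNumber

while True:
    head = w3.eth.blockNumber
    for num in range(last_block + 1, head + 1):
        block = w3.eth.getBlock(num)
        parent = w3.eth.getBlock(num - 1)
        new_txs = len(block["transactions"])
        # timestamps assumed to be in seconds; guard against a zero interval
        blocktime = max(block["timestamp"] - parent["timestamp"], 1)
        total_txs += new_txs
        elapsed = time.time() - start
        print("block %d | new #TX %d / %d s = %.1f TPS_current | "
              "total: #TX %d / %.1f s = %.1f TPS_average"
              % (num, new_txs, blocktime, new_txs / blocktime,
                 total_txs, elapsed, total_txs / elapsed))
    last_block = head
    time.sleep(0.2)
```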
The same observation also holds in geth v1.8.13 (not only in Quorum).
Any new ideas about that?
You can now super-easily reproduce my results, in less than 10 minutes, with my Amazon AMI image:
https://gitlab.com/electronDLT/chainhammer/blob/master/reproduce.md#readymade-amazon-ami
@drandreaskrueger @fixanoid Any updates on why the TPS drop occurs around 14K?
Thanks :)
@drandreaskrueger Is this result for AWS consistent, or was it a one-time feat?
peak TPS_average is 536 TPS, final TPS_average is 524 TPS.
Last time I checked, the problem was still there.
But it seems to be caused upstream, because look at this:
https://github.com/ethereum/go-ethereum/issues/17447#issuecomment-431629285
It happens in geth too!
Perhaps you can help them to find the cause?
That's a good idea. We'll look into it too after the upgrade to 1.8.18.
Cool, thanks.
There will soon be a whole new version of chainhammer, with much more automation.
Stay tuned ;-)
@drandreaskrueger Is the AWS result with the web3 lib? Did you try with direct RPC calls (as you mentioned that web3 causes a lot of damage to the TPS)? If not, I will give it a try.
I had tried both, via web3 and via direct RPC calls. The latter was usually faster, so I have done all later measurements with RPC calls.
The old code is still there though, and the switch is here, so you can simply try it yourself: https://github.com/drandreaskrueger/chainhammer/blob/223fda085aad53c1cbf4c46c336ad04c2348da82/hammer/config.py#L40-L41
You can also read https://github.com/drandreaskrueger/chainhammer/blob/master/docs/FAQ.md; it links to the relevant code pieces.
@jpmsam
after the upgrade to 1.8.18.
Oh, oops - I have been missing a lot then. But why v1.8.18 - your release page talks about 2.2.1?
Still doing all my benchmarks with a Quorum version that calls itself Geth/v1.7.2-stable-d7e3ff5b/linux-amd64/go1.10.1
...
... because I am benchmarking Quorum via the excellent dockerized 4-node setup created by blk-io (see here), which is less heavyweight than your Vagrant/VirtualBox 7-node setup. I suggest you have a look at that dockerized version; perhaps you can publish something similar. Or do you have a dockerized Quorum setup by now?
For all my benchmarking, I could find dockerized versions of Geth, Parity, and Quorum - and blk-io/crux is the one I am using for Quorum.
I have just published a brand new version v55: https://github.com/drandreaskrueger/chainhammer/#quickstart
Instead of installing everything on your main work computer, it is better to use (a VirtualBox Debian/Ubuntu installation or) my Amazon AMI to spin up a t2.medium machine; see docs/cloud.md#readymade-amazon-ami.
Then all you need to do is:
networks/quorum-configure.sh
CH_TXS=50000 CH_THREADING="threaded2 20" ./run.sh "YourNaming-Quorum" quorum
and afterwards check results/runs/ to find an autogenerated results page with time series diagrams.
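(In case the CH_THREADING setting is unclear: conceptually it means many worker threads submitting transactions concurrently instead of one blocking loop. The sketch below only illustrates that idea and is not the actual chainhammer implementation; the endpoint URL and transaction fields are assumptions.)

```python
# Conceptual sketch of threaded transaction submission via raw JSON-RPC.
import queue
import threading
import requests

RPC_URL = "http://localhost:22000"   # assumed node endpoint
NUM_WORKERS = 20                     # mirrors the "20" in the command above
NUM_TXS = 50000                      # mirrors CH_TXS above

def rpc(method, params):
    payload = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    return requests.post(RPC_URL, json=payload).json().get("result")

sender = rpc("eth_accounts", [])[0]
jobs = queue.Queue()

def worker():
    while True:
        tx = jobs.get()
        if tx is None:
            break
        rpc("eth_sendTransaction", [tx])
        jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()

for _ in range(NUM_TXS):
    jobs.put({"from": sender, "to": sender, "gas": hex(21000), "value": "0x0"})

jobs.join()               # wait until all transactions have been submitted
for _ in threads:
    jobs.put(None)        # stop the workers
for t in threads:
    t.join()
```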
Hope that helps! Keep me posted please.
Looks great. What is the performance with 100 nodes?
What is the performance with 100 nodes?
Just try it out.
I am importing the /blk-io_crux/docker/quorum-crux project here:
https://github.com/drandreaskrueger/chainhammer/blob/49a7d78543b9f26e9839286c7f8c73851a18ca52/networks/quorum-configure.sh#L3-L12
If you look into their details, extending this from 4 nodes to 100 nodes looks doable, just tedious: https://github.com/blk-io/crux/blob/eeb63a91b7eda0180c8686f819c0dd29c0bc4d46/docker/quorum-crux/docker-compose-local.yaml
It would have to be a very large machine. And I would not expect huge changes. This type of distributed ledger technology doesn't get faster by plugging in more nodes, no?
Has anyone tried to test the latest geth version? https://www.reddit.com/r/ethereum/comments/fqk8vm/transaction_propagation_optimization_in_geth_1911/
IBFT seems to max out around 200 TPS when run in the 7-node example.
--> see these results
However, the original publication talks about 800 TPS with Istanbul BFT. How did they do it?
Any ideas how to get this faster?
Thanks!