Overview
This issue records several rounds of testing of Last Finalized State (LFS) pre-release versions. The tests are performance oriented, and some bug fixes and optimizations were applied between test setups.
Results are listed starting from the last setup. They show how sync speed increases on a stronger machine. This is due to the additional computation needed for verification of the Rholang state (#3145), which is visible as high CPU usage during sync.
Another important factor is disk speed. All tests were done on SSD disks, and on stronger machines faster disks reduce the overall sync duration.
Testing setup 4
This round is almost identical to setup 3, with an additional optimization in receiving blocks (#3243), which includes block sorting as part of the download, and with testing on a 32 vCPU machine.
RNode Docker image version: rchain/rnode:v0.9.26-rc (fully caught up)
NOTE: CPU and memory observed in testing are the same as in setup 3; only the overall duration is shorter.
[1] CPU, memory - 32vCPU 64GB (DigitalOcean)
Testing setup 3
The first two rounds of testing showed that the LFS source node's disk reading speed can make a significant difference in the overall speed at which nodes download the Rholang state. In this testing setup, to make reading from disk much faster, the RSpace folder on the source node was converted to a RAM disk, which reduced the read time from ~20 sec to ~1 sec per chunk of state sent over the network.
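The RAM disk conversion can be done with a plain tmpfs mount. A minimal sketch, assuming the RSpace data lives under /var/lib/rnode/rspace (the actual path depends on the node's data directory) and the node is stopped while the data is copied:

```sh
# Create a tmpfs mount large enough to hold the RSpace folder
# (the 16g size is an assumption; match it to the actual folder size).
sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=16g tmpfs /mnt/ramdisk

# Copy the on-disk RSpace data into RAM, then swap it in via a symlink.
sudo cp -a /var/lib/rnode/rspace /mnt/ramdisk/
sudo mv /var/lib/rnode/rspace /var/lib/rnode/rspace.bak
sudo ln -s /mnt/ramdisk/rspace /var/lib/rnode/rspace
```

Note that tmpfs contents are lost on reboot; that is acceptable here because the source node's state can be restored from the on-disk backup.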
RNode Docker image version: tgrospic/rnode:v0.9.26-rc (fully caught up)
NOTE: The bug with the memory leak on network errors is still not resolved; one test on an 8GB machine failed because of it and is not listed in the results.
test-3-cpx51-32g-full-ok-3.zip
test-3-cpx41-16g-full-ok-1.zip
test-3-cpx31-8g-full-ok-1.zip
Observations from testing
[1] CPU, memory - 16vCPU 32GB CPX51
[1] Direct buffer - 16vCPU 32GB CPX51
[2] CPU, memory - 8vCPU 16GB CPX41
[3] CPU, memory - 4vCPU 8GB CPX31
Survived network errors without crashing - 4vCPU 8GB CPX31
Testing setup 2
RNode Docker image version: tgrospic/rnode:v0.9.26-beta (fully caught up)
rnode-restart-error.zip (Docker on Hetzner cloud CPX31)
rnode-test-3.zip (Docker on Hetzner cloud CPX31)
rnode-test-4.zip (Docker on Hetzner cloud CX31)
Observations from testing
[3] After the node received LFS, it did not respond to requests for the tuple space (StateItems). After a restart it worked as expected.
[2] CPU, memory
[2] Direct buffer memory
Testing setup 1
~~Trie traversal is slow when the number of records is bigger than 3,000, and it rises faster than linearly.~~ Resolved in PR #3099.
Performance of LFS should be tested with this configuration.
RNode Docker image for all nodes: rchain/rnode:v0.9.26-alpha
1. Full node
2. Observer node with empty state (bootstrap from 1.)
3. Observer node with empty state (bootstrap from 2.)

~~Observer nodes with empty state (2. and 3.) must have a higher limit for direct buffer memory, 3GB: -XX:MaxDirectMemorySize=3g~~
The memory leak with direct buffers is fixed in the transport layer (#3239), so the limit can now be much lower: -XX:MaxDirectMemorySize=200m
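For a containerized node the limit is passed as a JVM option. A minimal sketch, assuming the image's entrypoint forwards the JAVA_OPTS environment variable to the JVM (this forwarding is an assumption; verify it against the entrypoint of the image you run):

```sh
# Run an observer node with a 200 MB direct buffer limit.
# ASSUMPTION: JAVA_OPTS is honored by the entrypoint; if it is not,
# the option has to be added to the launcher script instead.
docker run -d --name observer \
  -e JAVA_OPTS="-XX:MaxDirectMemorySize=200m" \
  rchain/rnode:v0.9.26-alpha run
```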
The first operation is to test syncing of LFS from the full node. When this is done, the second sync should use this node, with its trimmed state, as the source to sync the third node.
| Operation | Expected | Measured |
| --- | --- | --- |
| 1. -> 2. | 4-6 hours | 7.5 hours (one machine + 1 active node); 11 hours (one machine + 4 active nodes) |
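The two operations map to two bootstrap runs using RNode's --bootstrap option. A sketch; the node IDs and hostnames are placeholders:

```sh
# Operation 1. -> 2.: observer node 2 syncs LFS from full node 1.
rnode run --bootstrap "rnode://<node1-id>@node1.example?protocol=40400&discovery=40404"

# Operation 2. -> 3.: after node 2 has synced, observer node 3 uses
# node 2 (with its trimmed state) as the source.
rnode run --bootstrap "rnode://<node2-id>@node2.example?protocol=40400&discovery=40404"
```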