sandreim opened 1 year ago
@sandreim Deployment is being coordinated here: https://github.com/paritytech/devops/issues/2567
Testing scenario 1

Observations

[chart: `reconstructed_data_matches_root`]

Note that we don't really time `reconstructed_data_matches_root` separately; it is part of a heavy call that is used on all paths - https://github.com/paritytech/polkadot/pull/7409 (`reconstructed_data_matches_root` is called as part of it).
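Since that means the reconstruction cost and the root check show up as one number, here is a minimal instrumentation sketch (a hypothetical helper, not code from the PR) that would split them into separate measurements:

```rust
use std::time::Instant;

// Hypothetical helper: run reconstruction and the root check under separate
// timers so the two costs can be reported as distinct measurements.
fn timed_recovery<T>(
    reconstruct: impl FnOnce() -> T,
    matches_root: impl FnOnce(&T) -> bool,
) -> (T, bool) {
    let start = Instant::now();
    let data = reconstruct();
    let reconstruct_time = start.elapsed();

    let start = Instant::now();
    let ok = matches_root(&data);
    let check_time = start.elapsed();

    // In a real deployment these would feed Prometheus histograms
    // rather than stderr.
    eprintln!("reconstruct: {reconstruct_time:?}, root check: {check_time:?}");
    (data, ok)
}
```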
Initial conclusions
What hardware are these validators running on?
That also plays into how much time the glutton consumes, since it calibrates itself against something like an i7-7700K.
Looking at https://github.com/paritytech/polkadot/pull/7409:
There is a reconstruct benchmark (`cargo bench -p polkadot-erasure-coding`) that reports 74 ms for 200 validators on an i9-13900K when the proof is changed to 2.5 MiB.
Maybe you can try that on the validator hardware to compare?
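For a back-of-the-envelope view of what that benchmark exercises, a small sketch of the chunk math at 200 validators and a 2.5 MiB proof (my own helper, assuming the usual f = (n - 1) / 3 fault tolerance; not the crate's code):

```rust
/// Sketch of the availability chunk math (assumption: the erasure code
/// tolerates f = (n - 1) / 3 faulty validators, so f + 1 chunks suffice
/// to reconstruct the data).
fn recovery_threshold(n_validators: usize) -> usize {
    (n_validators - 1) / 3 + 1
}

fn main() {
    let n_validators = 200;
    let needed = recovery_threshold(n_validators); // 67 chunks
    let proof_size = 5 * 1024 * 1024 / 2; // 2.5 MiB

    // Each chunk is roughly proof_size / needed (plus Reed-Solomon
    // padding), i.e. ~38 KiB per chunk here.
    println!(
        "{needed} of {n_validators} chunks needed, ~{} bytes each",
        proof_size / needed
    );
}
```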
PS: Just saw https://github.com/paritytech/reed-solomon-novelpoly/pull/2, no idea if that has any chance of landing.
We are using https://cloud.google.com/compute/docs/general-purpose-machines#n2d_machines AFAIK. CC @PierreBesson to confirm. We try to use the reference hardware recommended at https://wiki.polkadot.network/docs/maintain-guides-how-to-validate-polkadot#reference-hardware .
What hardware are we running the weights benchmarks on these days?
It was still done on the old reference hardware (i7-7700K), but we recently updated to the recommendation from the Wiki: https://github.com/paritytech/polkadot/pull/7342. For Cumulus the same change is in the pipeline: https://github.com/paritytech/cumulus/pull/2712.
The CPU name is always at the top of the weight files to give a rough indication (old vs new):
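For illustration, a made-up header of the kind the benchmark CLI generates (hostname, date, and values here are hypothetical; only the `CPU:` field matters for the old-vs-new comparison):

```rust
//! Autogenerated weights for `pallet_balances`
//!
//! DATE: 2023-06-19, STEPS: `50`, REPEAT: `20`
//! HOSTNAME: `bm2`, CPU: `Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz`
```

After the reference-hardware switch, the `CPU:` line shows the new machine instead.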
So the consumption should be closer once the Cumulus PR is merged. You could also set the CPU burn to 0% to just measure the overhead (don't know if that applies in this case); a sketch of what that could look like is below.
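A hedged sketch of dialing Glutton down to pure overhead. The call names follow `pallet-glutton`'s admin extrinsics as I remember them, and the parameter type has changed between versions (Perbill early on, FixedU64 later), so treat all of this as an assumption and check the deployed runtime:

```rust
use sp_runtime::Perbill;

// Hypothetical sketch: zero out Glutton's load so only the framework
// overhead remains. In practice these are root-origin extrinsics
// (e.g. submitted via polkadot-js); the exact parameter type depends
// on the pallet version.
fn glutton_idle_params() -> (Perbill, Perbill) {
    let compute = Perbill::from_percent(0); // burn 0% CPU per block
    let storage = Perbill::from_percent(0); // inflate 0% storage/PoV
    // Glutton::set_compute(RuntimeOrigin::root(), compute)
    // Glutton::set_storage(RuntimeOrigin::root(), storage)
    (compute, storage)
}
```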
30 vrf modulo samples
Is this meant to come from the computation we discussed in
We should keep https://github.com/paritytech/polkadot-sdk/issues/640 somewhat in mind here:

relayVrfModuloSamples = E * num_cores / num_validators = 100 (needed approvals) * 40 (parachains) / 200 (validators) = 20 < 30 vrf modulo samples
That's a fine deviation. Also, I guess needed approvals is set so high because we want to account for some misbehavior. Yet, we also care about the case of maybe 120 parachains with 30 needed approvals; the sketch below works out both.
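Spelling the sizing rule out for both setups (a toy helper for the arithmetic above, not runtime logic):

```rust
// Toy helper for the sizing rule quoted above: each validator takes about
// E * num_cores / num_validators tranche-0 VRF modulo samples so that
// every core collects roughly E approvals in tranche 0.
fn relay_vrf_modulo_samples(needed_approvals: u32, num_cores: u32, num_validators: u32) -> u32 {
    needed_approvals * num_cores / num_validators
}

fn main() {
    // Current test setup: 100 needed approvals, 40 parachains, 200 validators.
    assert_eq!(relay_vrf_modulo_samples(100, 40, 200), 20); // < 30 configured
    // Alternative sizing: 30 needed approvals, 120 parachains, 200 validators.
    assert_eq!(relay_vrf_modulo_samples(30, 120, 200), 18);
}
```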
Or maybe this type of configuration is a nice way to check the load of just the approvals system without running as many collators?
Yes, the purpose is to generate more load with fewer validators and parachains.
After https://github.com/paritytech/parachains-utils/pull/1 is merged we are ready to deploy Glutton on all cores on Versi. It will be the first time we collect data from Versi load testing with big parachain blocks that burn CPU and cause high network I/O.
As a testing strategy, we will be doing 3 types of tests:
Testing environment: 300 parachain validators and 50 parachains. If it doesn't break and we would benefit from collecting data at a higher scale, we might dial the numbers up to 500 validators and 70 parachains.
For each strategy we should follow an incremental approach, which will allow us to observe metrics and logs.