paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.com/

Use Glutton to stress test parachain consensus on Versi #620

Open sandreim opened 1 year ago

sandreim commented 1 year ago

After https://github.com/paritytech/parachains-utils/pull/1 is merged, we are ready to deploy Glutton on all cores on Versi. This will be the first time we collect data from Versi load testing with big parachain blocks that burn CPU and cause high network I/O.

As a testing strategy we will be doing 3 types of tests:

Testing environment: 300 parachain validators and 50 parachains. If nothing breaks and we would benefit from collecting data at higher scale, we may dial the numbers up to 500 validators and 70 parachains.

For each strategy we should follow an incremental approach, which will allow us to observe metrics and logs as the load increases.

NachoPal commented 1 year ago

@sandreim Deployment is being coordinated here: https://github.com/paritytech/devops/issues/2567

sandreim commented 1 year ago

Testing scenario 1

Observations

Initial conclusions

ggwpez commented 1 year ago

What hardware are these validators running on?
That also plays into how much time the Glutton consumes, since it calibrates itself against something like an i7-7700K.

Looking at https://github.com/paritytech/polkadot/pull/7409: there is a reconstruct benchmark (cargo bench -p polkadot-erasure-coding) that reports 74 ms for 200 validators on an i9-13900K when changed to a 2.5 MiB proof.
Maybe you can try that on the validator hardware to compare?
PS: Just saw https://github.com/paritytech/reed-solomon-novelpoly/pull/2, no idea if that has any chance of landing.
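The 74 ms figure above is a single data point, but it can serve for a rough extrapolation. This is a back-of-envelope sketch only: it assumes reconstruct time grows roughly linearly with proof size on fixed hardware and a fixed validator count, which is an assumption of mine, not something measured in this thread.

```python
# Linear extrapolation from the one measured point in the thread:
# ~74 ms to reconstruct a 2.5 MiB proof with 200 validators on an i9-13900K.
# Assumption (not measured): reconstruct time scales ~linearly with proof size.

MEASURED_MS = 74.0   # i9-13900K, 200 validators, 2.5 MiB proof
MEASURED_MIB = 2.5

def estimated_reconstruct_ms(proof_mib: float) -> float:
    """Rough estimate of reconstruct time for a given proof size, in ms."""
    return MEASURED_MS * proof_mib / MEASURED_MIB

for mib in (1.0, 2.5, 5.0):
    print(f"{mib:4.1f} MiB -> ~{estimated_reconstruct_ms(mib):.0f} ms")
```

Running the actual `cargo bench -p polkadot-erasure-coding` on the validator hardware, as suggested, would replace this guesswork with a real number.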

sandreim commented 1 year ago

We are using https://cloud.google.com/compute/docs/general-purpose-machines#n2d_machines AFAIK. CC @PierreBesson to confirm. We try to use the reference hardware recommended at https://wiki.polkadot.network/docs/maintain-guides-how-to-validate-polkadot#reference-hardware.

What hardware are we running the weights benchmarks on these days?

ggwpez commented 1 year ago

It was still done on the old reference hardware (i7-7700K), but we recently updated to the recommendation from the Wiki in https://github.com/paritytech/polkadot/pull/7342. For Cumulus the same change is in the pipeline here: https://github.com/paritytech/cumulus/pull/2712.

The CPU name is always at the top of the weight files to give a rough indication (old vs new). So the consumption should be closer once the Cumulus MR is merged. You could also set the CPU burn to 0% to just measure the overhead (I don't know if that applies in this case).

burdges commented 1 year ago

30 vrf modulo samples

Is this meant to come from the computation we discussed in

We should keep https://github.com/paritytech/polkadot-sdk/issues/640 in mind somewhat here

relayVrfModuloSamples = E * num_cores / num_validators = 100 needed approvals * 40 parachains / 200 validators = 20 < 30 vrf modulo samples

That's a fine deviation. Also, I guess the needed approvals figure is so high because we want to think in terms of some misbehavior. Yet we also care about, say, 120 parachains with 30 needed approvals.
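The approximation above is just arithmetic, so it can be written out directly. A minimal sketch, using only the numbers stated in this thread (100 needed approvals, 40 parachains, 200 validators); the function name is mine, not an identifier from the codebase:

```python
# Worked version of the formula quoted above:
#   relayVrfModuloSamples = E * num_cores / num_validators
# where E is the number of needed approvals.

def relay_vrf_modulo_samples(needed_approvals: int, num_cores: int,
                             num_validators: int) -> float:
    """Approximate tranche-0 sample count needed to cover E approvals per core."""
    return needed_approvals * num_cores / num_validators

samples = relay_vrf_modulo_samples(100, 40, 200)
print(samples)  # 20.0 -- below the configured 30 samples, hence "a fine deviation"
```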

Or maybe this type of configuration is a nice way to check the load of just the approvals system without running as many collators?

sandreim commented 1 year ago

Yes, the purpose is to generate more load with fewer validators and parachains.