Closed alindima closed 5 months ago
There are a couple of ways to implement this:
Now, the caveat of both approaches is that they are optimisations that are only effective while not all validators are upgraded. Once all validators are upgraded, the code becomes redundant and could send or record unnecessary events. Moreover, production networks rarely do chunk recovery for now. Most of the time they simply fetch the full data from backers (since most PoVs are smaller than 128 KiB when compressed).
In the worst case, with a mixed validator set (half upgraded, half non-upgraded), the upgraded nodes will make an extra round trip when fetching chunks from non-upgraded nodes.
I measured this in practice and the cost is negligible relative to the total PoV recovery time.
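To make the worst case above concrete, here is a minimal sketch of where the extra round trip comes from. It assumes the extra round trip is caused by an upgraded node trying a newer request protocol first and retrying with the older one when the peer does not answer. All names (`ChunkProtocol`, `Peer`, `fetch_chunk`, `total_round_trips`) are hypothetical and only model the behaviour; they are not the polkadot-sdk API.

```rust
/// Hypothetical chunk-request protocol versions (illustrative names only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChunkProtocol {
    V1,
    V2,
}

/// A simulated peer: upgraded peers answer both protocols, others only V1.
#[derive(Debug, Clone, Copy)]
pub struct Peer {
    pub upgraded: bool,
}

impl Peer {
    /// Returns `true` if the peer answers a request made with `protocol`.
    fn answers(&self, protocol: ChunkProtocol) -> bool {
        match protocol {
            ChunkProtocol::V1 => true,
            ChunkProtocol::V2 => self.upgraded,
        }
    }
}

/// Which protocol finally succeeded and how many round trips it took.
#[derive(Debug, PartialEq, Eq)]
pub struct FetchOutcome {
    pub protocol: ChunkProtocol,
    pub round_trips: u32,
}

/// Assumed behaviour of an upgraded requester: try the newer protocol
/// first and fall back to the older one if the peer does not answer.
pub fn fetch_chunk(peer: &Peer) -> FetchOutcome {
    let mut round_trips = 0;
    for protocol in [ChunkProtocol::V2, ChunkProtocol::V1] {
        round_trips += 1;
        if peer.answers(protocol) {
            return FetchOutcome { protocol, round_trips };
        }
    }
    unreachable!("every simulated peer answers V1")
}

/// Total round trips needed to request one chunk from each peer.
pub fn total_round_trips(peers: &[Peer]) -> u32 {
    peers.iter().map(|peer| fetch_chunk(peer).round_trips).sum()
}
```

In this model, fetching from an upgraded peer costs one round trip and fetching from a non-upgraded peer costs two, so a half-and-half peer set needs 1.5 round trips per chunk request on average.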
Measuring this with subsystem-bench (with an extra latency of 100ms for the second request):
The first half simulates all nodes making 2 round trips for all chunk requests.
I also measured this on Versi, with 50 validators, 9 glutton parachains and PoVs of 2.5 MiB.
The average PoV recovery time with all nodes non-upgraded is 528 ms. The average PoV recovery time with half upgraded and half non-upgraded nodes is 674 ms.
As you can see, the largest consumer of recovery time is Reed-Solomon.
Considering all of the above, I'll close this issue and conclude that this small optimisation is not worth implementing.
Prerequisite: https://github.com/paritytech/polkadot-sdk/pull/1644
See https://github.com/paritytech/polkadot-sdk/pull/1644#issuecomment-1916468621 for details of the improvements