shawntabrizi opened this issue 3 years ago
IIUC, instead of generating a "full storage" before running benchmarks, we now prefer the runtime to provide, for each storage item, the depth of the node holding the values and the maximum encoded size of the value (related issue tracker: https://github.com/paritytech/substrate/issues/8719).
I think the runtime can provide a storage description, and benchmarks can make use of it to make a proper estimation of the PoV size for calls. The storage description could look like this:
```rust
struct NodeDescription {
    /// The maximum size of the value of the node.
    max_value_size: usize,
    /// The depth of the node in the trie.
    max_node_depth: usize,
}

struct StorageDescription {
    /// Associates a node description with all keys starting with a specific prefix.
    // E.g. vec![
    //     (
    //         twox128(System) ++ twox128(Account),
    //         NodeDescription {
    //             max_value_size: BoundedEncodedLen::of(AccountId),
    //             max_node_depth: log16(number_of_pallet_in_runtime)
    //                 + log16(number_of_storage_in_pallet)
    //                 + log16(number_of_key_in_account_storage),
    //         },
    //     )
    // ]
    prefix_description: Vec<(Prefix, NodeDescription)>,
    /// Associates a node description with a specific key.
    // E.g. for the ":code:" key.
    key_description: Vec<(Key, NodeDescription)>,
}
```
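A hedged sketch of how such a description might be consumed: assuming an upper bound on the encoded size of a branch node (16 child hashes of 32 bytes plus some overhead; this constant and the function name below are illustrative, not part of any existing API), the worst-case PoV contribution of a single read is roughly depth × node size + value size:

```rust
/// The per-node description proposed above.
struct NodeDescription {
    max_value_size: usize,
    max_node_depth: usize,
}

/// Assumed bound on an encoded branch node: 16 child hashes of 32 bytes
/// each, plus header overhead. An assumption, not a measured value.
const MAX_NODE_SIZE: usize = 16 * 32 + 32;

/// Upper bound on the PoV bytes contributed by a single storage read:
/// every node on the path to the leaf, plus the value itself.
fn estimate_read_pov(desc: &NodeDescription) -> usize {
    desc.max_node_depth * MAX_NODE_SIZE + desc.max_value_size
}

fn main() {
    let account = NodeDescription { max_value_size: 128, max_node_depth: 6 };
    // 6 * 544 + 128 = 3392
    println!("{}", estimate_read_pov(&account));
}
```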
So we need a way to give the number of keys in a storage item, probably helped by the pallet macro with a new attribute, `#[pallet::max_size(N)]` or something like this.

EDIT: probably not needed, as we can overestimate a bit and adjust once the transaction is processed. We can improve on this in the future.
Or maybe we want something more precise than `max_node_depth: log16(number_of_pallet_in_runtime) + log16(number_of_storage_in_pallet) + log16(number_of_key_in_account_storage)`, like:

```rust
depth_before_prefix: log16(number_of_pallet_in_runtime) + log16(number_of_storage_in_pallet)
depth_after_prefix: log16(number_of_key_in_account_storage)
```

So that if the storage is queried multiple times, the size related to `depth_before_prefix` is not added again, and the size related to `depth_after_prefix` can be amortised.
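To illustrate the split, here is a minimal sketch (field names follow the proposal above; the node-size constant is an assumption): nodes above the prefix are proved once, nodes below it once per read, ignoring the further amortisation possible among the per-key paths.

```rust
/// Split of the trie depth into the part shared by every key under a
/// prefix and the part specific to each key.
struct SplitDepth {
    depth_before_prefix: usize,
    depth_after_prefix: usize,
}

/// Assumed bound on an encoded branch node (16 * 32-byte child hashes
/// plus overhead).
const MAX_NODE_SIZE: usize = 544;

/// Conservative PoV bound for `n` reads under one prefix: the shared
/// prefix path is counted once, the per-key paths once per read.
fn estimate_n_reads(d: &SplitDepth, value_size: usize, n: usize) -> usize {
    d.depth_before_prefix * MAX_NODE_SIZE
        + n * (d.depth_after_prefix * MAX_NODE_SIZE + value_size)
}

fn main() {
    let d = SplitDepth { depth_before_prefix: 2, depth_after_prefix: 4 };
    // 2 reads: 2*544 + 2*(4*544 + 128) = 1088 + 4608 = 5696
    println!("{}", estimate_n_reads(&d, 128, 2));
}
```

Without the split, the same two reads would each pay for the full depth of 6 nodes, so the shared prefix nodes would be double-counted.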
Maybe to allow an even more precise description we should have something nested: the `twox128(System)` prefix has a depth, and contains multiple possible suffixes (one for each storage item), one of them being `twox128(AccountId)`. That way, if you query 2 values from the System pallet but from 2 different storage items, you don't need to count the nodes leading to the `twox128(System)` prefix twice.
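A nested description could be sketched as a tree in which each prefix segment records its own depth, so a segment shared by several queried keys contributes to the bound exactly once. Type and field names here are assumptions, not an existing API:

```rust
/// One segment of the key space, e.g. a pallet prefix or a storage-item
/// prefix below it.
struct PrefixNode {
    /// Trie depth consumed by this prefix segment alone.
    depth: usize,
    /// Child prefixes, e.g. one per storage item of a pallet.
    children: Vec<(Vec<u8>, PrefixNode)>,
    /// Value-size bound for keys terminating at this prefix, if any.
    max_value_size: Option<usize>,
}

/// Assumed bound on an encoded branch node.
const MAX_NODE_SIZE: usize = 544;

/// PoV bound when every leaf of the tree is queried: each shared segment
/// contributes its depth exactly once.
fn shared_pov(node: &PrefixNode) -> usize {
    node.depth * MAX_NODE_SIZE
        + node.max_value_size.unwrap_or(0)
        + node.children.iter().map(|(_, c)| shared_pov(c)).sum::<usize>()
}

fn main() {
    // A "System"-like prefix with two storage items below it.
    let leaf = |v| PrefixNode { depth: 2, children: vec![], max_value_size: Some(v) };
    let system = PrefixNode {
        depth: 1,
        children: vec![(b"Account".to_vec(), leaf(128)), (b"Events".to_vec(), leaf(128))],
        max_value_size: None,
    };
    // 1*544 + 2*(2*544 + 128) = 544 + 2432 = 2976; the System segment is
    // counted once, not once per queried storage item.
    println!("{}", shared_pov(&system));
}
```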
> we need a way to give the number of keys in a storage. probably helped by the pallet macro with a new attribute `#[pallet::max_size(N)]` or something like this.

I believe we only need that for `StorageMap`, `StorageDoubleMap`, `StorageNMap`, etc. Spitballing earlier, we had the idea that, for backwards compatibility, if a `max_size` attribute is not included, we could use a default value of 300_000. This should ease the migration.
> maybe we want something more precise than
> `max_node_depth: log16(number_of_pallet_in_runtime) + log16(number_of_storage_in_pallet) + log16(number_of_key_in_account_storage)`
I think it's probably not worth getting too elaborate with the estimate. Given that it's cheap and easy to know the actual PoV size and refund the weight once a tx has been processed, we can probably stick with a simple upper bound for PoV size estimates.
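The overestimate-then-refund approach amounts to a one-line settlement once the actual proof size is known. A minimal sketch (the function name and the 300_000 default from the earlier comment are illustrative):

```rust
/// Weight to hand back after dispatch: the conservative up-front estimate
/// minus the measured proof size. Clamped to zero in case the "upper
/// bound" was ever exceeded, which would indicate a bug in the estimate.
fn pov_refund(estimated: u64, actual: u64) -> u64 {
    estimated.saturating_sub(actual)
}

fn main() {
    // Charged a conservative 300_000-byte estimate up front; the actual
    // proof turned out to be 12_345 bytes.
    println!("{}", pov_refund(300_000, 12_345)); // prints 287655
}
```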
This is a meta issue to track the things needed to add benchmarking support for PoV size, critical for launching Parachains on Polkadot.
- Add `MaxEncodedLen` requirement to storage (meta issue: https://github.com/paritytech/substrate/issues/8719): https://github.com/paritytech/substrate/pull/8735
- Add `BoundedVec` to Storage Primitives: https://github.com/paritytech/substrate/pull/8556
- `Vec` to `BoundedVec` (meta issue: https://github.com/paritytech/polkadot-sdk/issues/323)
- `(computation_weight, pov_weight)` and abstractions: https://github.com/paritytech/polkadot-sdk/issues/256