Closed ordian closed 7 months ago
I can confirm this: the collator has been producing 1MB+ proofs for as long as I can see in the collator logs (2 weeks). Do you have historical data about PoV size?
I can only say it was still the case in July, but I don't have historical data from before that.
Might be related: https://github.com/paritytech/polkadot-sdk/issues/1498
@cheme do you have a radix 2 flavor of https://github.com/paritytech/trie or whatever somewhere?
I had this branch from 2020, https://github.com/paritytech/trie/pull/84, but today it is a long way behind the current trie crate (probably 1 or 2 days of work to update).
It looks like we are attaching the runtime wasm code to every block proof.
This is an example list of the trie nodes we are building the proof from:
[
''... 2538366 more characters,
'5f08dcde934c658227ee1dfafcd6e16903050108dc4d79aad5a9d01a359995838830a80733a0bff7e4eb087bfc621ef1873fec49be4f21c56d926b91f020b5071f14935cb93f001f1127c53d3eac6eed23ffea64',
'5f0a42f33323cb5ced3b44dd825fda9fcc804545454545454545454545454545454545454545454545454545454545454545',
'5f0e0621c4869aa60c02be9adcc98a0d1d050108dc4d79aad5a9d01a359995838830a80733a0bff7e4eb087bfc621ef1873fec49be4f21c56d926b91f020b5071f14935cb93f001f1127c53d3eac6eed23ffea64',
'764704b568d21667356a5a050c118746b4def25cfda6ef3a00000000804545454545454545454545454545454545454545454545454545454545454545',
'7d0bce545fb382c34570e5dfbf338f5e4e7b9012096b41c4eb3aaf947f6ea429080000',
'7e0f0c53fa332d4d9712c66fd92efcb64e7b9012096b41c4eb3aaf947f6ea429080000',
'7e1467a096bcd71a5b6a0c8155e208104e7b9012096b41c4eb3aaf947f6ea429080000',
'7e3237373ffdfeb1cab4222e3b520d6b4e7b9012096b41c4eb3aaf947f6ea429080200',
'7e323df7cc47150b3930e2666b0aa3134e7b9012096b41c4eb3aaf947f6ea429080200',
...
]
The first one contains our runtime wasm code:
🗜 Compressed: Yes, 78.78%
✨ Reserved meta: OK - [6D, 65, 74, 61]
🎁 Metadata version: V14
🔥 Core version: hydradx-178 (hydradx-0.tx1.au1)
🗳️ system.setCode hash: 0x64c439e579c3bfff9f4ebb8be01ca8a33f5c6f565c42531b46011974a9f79c93
🗳️ authorizeUpgrade hash: 0xa059f2c663f68b95f2e72ad34e2ff34569706ebee1c6fe74c519e847eb5dab3a
#️⃣ Blake2-256 hash: 0x32dc435cbda2592facebf36852feb2ec411f7b77cd33a9ec8ba109cb579a7cb9
📦 IPFS: https://www.ipfs.io/ipfs/QmU5Lw394PxSziP6vMNH7B2UdFn4XZfXVKQrG2hXG4NELk
Why do we attach it to the storage proof?
Well, you are probably using trie_version 0 (it may be called state_version). With version 1, the value is not attached to the node and is only included if it is accessed by the runtime. WARNING: switching requires a migration (otherwise warp sync will be broken).
So if you are using state_version 0, maybe you query an entry close to the key ":code", e.g. ":codex", which would include the node at ":code" (and its value) in your proof. So any query for a key starting with ":code", and some inserts or removals of keys close to ":code" (which may change the node prefix at ":code" and thus touch the node), would include the wasm in the proof.
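To make the difference between the two versions concrete, here is a toy model of how many bytes end up in a proof when a key next to ":code" is read. It is only a sketch under loud assumptions: the trie is reduced to "every stored key that is a prefix of the accessed key is a node on the path", and `NODE_OVERHEAD`, `INLINE_THRESHOLD` and the ":codex" key are made up for illustration. It is not the real trie encoding.

```python
HASH_LEN = 32          # a node stores a 32-byte hash instead of a large value
INLINE_THRESHOLD = 32  # assumed cut-off: larger values are hashed under version 1
NODE_OVERHEAD = 8      # assumed fixed per-node encoding overhead, illustration only


def proof_size(storage, accessed_key, state_version):
    """Estimate proof bytes for reading `accessed_key` in a toy prefix trie.

    Simplification: every stored key that is a prefix of `accessed_key`
    counts as one node on the path from the root.
    """
    total = 0
    for key, value in storage.items():
        if not accessed_key.startswith(key):
            continue  # node is not on the path to the accessed key
        total += NODE_OVERHEAD + len(key)
        hashed = state_version == 1 and len(value) > INLINE_THRESHOLD
        if not hashed:
            total += len(value)      # value is embedded in the node itself
        else:
            total += HASH_LEN        # node only carries the value hash
            if key == accessed_key:
                total += len(value)  # value was actually read, so it is included
    return total


storage = {
    b":code": bytes(1_000_000),  # stand-in for the runtime wasm blob
    b":codex": b"\x01",          # hypothetical neighbouring key
}
v0 = proof_size(storage, b":codex", state_version=0)
v1 = proof_size(storage, b":codex", state_version=1)
print(v0, v1)
```

In this model, reading ":codex" under version 0 drags the whole blob into the proof, while under version 1 only a hash of it is carried along.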
1st one appears to contain wasm file
Yes, the wasm is in the top trie at key ':code' (UTF-8 encoded). It would make sense to trace all runtime accesses during block processing and try to find which key queries can touch the ':code' trie node. I don't remember how it can be done, but maybe with try-runtime and some logging (I cannot check right now); if traces are missing, they can be added to the sp-io storage functions or to the sp-state-machine trie accesses directly.
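Once such a trace exists, scanning it could look roughly like the sketch below. The input format (a plain list of hex-encoded storage keys) is an assumption, since the actual trace output is not shown in this thread, and `keys_touching_code` is a hypothetical helper name.

```python
CODE_KEY = b":code"  # storage key of the runtime wasm, 0x3a636f6465 in hex


def keys_touching_code(accessed_hex_keys):
    """Return the accessed keys whose trie path passes through the ':code' node."""
    hits = []
    for hex_key in accessed_hex_keys:
        digits = hex_key[2:] if hex_key.startswith("0x") else hex_key
        if bytes.fromhex(digits).startswith(CODE_KEY):
            hits.append(hex_key)
    return hits


# Hypothetical trace: ':code', ':codex' and ':heappages'.
trace = ["0x3a636f6465", "0x3a636f646578", "0x3a686561707061676573"]
print(keys_touching_code(trace))
```

This only catches reads of keys that start with ":code". Inserts or removals that restructure the node at ":code" would need the trie-level traces instead.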
With state_version set to 1.
2023-09-18 16:40:12 [Parachain] PoV size { header: 0.1787109375kb, extrinsics: 2.865234375kb, storage_proof: 4.306640625kb }
2023-09-18 16:40:12 [Parachain] Compressed PoV size: 6.330078125kb
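For anyone who wants to track these numbers over time, a small sketch that pulls the per-component sizes out of the "PoV size" log line quoted above. The regular expression is only matched against this one line format, and `parse_pov_sizes` is a hypothetical helper name.

```python
import re

LINE = (
    "2023-09-18 16:40:12 [Parachain] PoV size { header: 0.1787109375kb, "
    "extrinsics: 2.865234375kb, storage_proof: 4.306640625kb }"
)


def parse_pov_sizes(line):
    """Extract the per-component PoV sizes (in kb) from a collator log line."""
    return {name: float(size) for name, size in re.findall(r"(\w+): ([\d.]+)kb", line)}


sizes = parse_pov_sizes(LINE)
total_kb = sum(sizes.values())
proof_share = sizes["storage_proof"] / total_kb
print(sizes, total_kb, proof_share)
```

Running the same parser over older collator logs would give the historical PoV data asked about earlier in the thread.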
So I guess we should consider migrating.
Do you have some details on how to do that?
There is a link to the md guide and a lot of links to the progress on the relay chains at https://github.com/paritytech/devops/issues/1508#issuecomment-1271565180 . Note that the migration process for a parachain may seem more complicated than for a relay chain (it progresses by manually adding an extrinsic in each block). This is to avoid going over the block size: with the automatic process used on the relay chain, there is always a risk that the content of the chain includes a very big value on top of an already big proof. But from my point of view, a parachain could still carefully audit its content and assert that such a scenario will not happen (possibly even running some value migrations ahead of time and skipping them afterwards in the automatic migration process, but this is not currently coded in the state-migration pallet). Generally, the migration process is not complicated: it just requires that every key-value pair in the runtime storage gets written again once (if you look at the automatic state migration process, we just store a progress key and advance by a few values at the start of every block).
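The "progress key plus a few values per block" idea can be sketched as a toy simulation. This is not the state-migration pallet's API: `migrate_step`, `max_items` and `max_bytes` are hypothetical names, and storage is modelled as a plain dictionary.

```python
def migrate_step(keys, cursor, values, max_items, max_bytes):
    """Re-write up to `max_items` values, stopping before `max_bytes` is exceeded.

    Returns the new cursor, i.e. the index of the next key to migrate.
    """
    used = 0
    migrated = 0
    while cursor < len(keys) and migrated < max_items:
        size = len(values[keys[cursor]])
        if migrated > 0 and used + size > max_bytes:
            break  # leave this value for the next block
        # Writing the value back unchanged stands in for re-encoding its node.
        values[keys[cursor]] = values[keys[cursor]]
        used += size
        migrated += 1
        cursor += 1
    return cursor


values = {b"a": bytes(10), b"b": bytes(10), b"c": bytes(500), b"d": bytes(10)}
keys = sorted(values)
cursor, blocks = 0, 0
while cursor < len(keys):  # one iteration per block
    cursor = migrate_step(keys, cursor, values, max_items=2, max_bytes=100)
    blocks += 1
print(blocks)
```

Note that the 500-byte value is still migrated on its own even though it exceeds `max_bytes`. That is the kind of oversized value a parachain would want to audit for before relying on an automatic process.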
There is a link to the md guide and lot of link to different progress on relay chain https://github.com/paritytech/devops/issues/1508#issuecomment-1271565180 .
Sorry, this link is broken for me. Is it in a private repo?
:facepalm: Yes, it is a private one. The link to the guide was https://hackmd.io/JagpUd8tTjuKf9HQtpvHIQ (a post referring to it: https://forum.polkadot.network/t/state-trie-migration/852).
Solved by #799 and running migration on wasm
gm
I wonder if you're aware of the PoV sizes of the blocks produced by both Basilisk on Kusama and HydraDX on Polkadot. Our scraping services show that the PoV contribution of each of your parachains is around 8GB per day, which is more than 1MB per PoV block on average (this is way above any other parachain).
Are you aware of this, or do you know what might be contributing to the PoV size? (You can probably also check the collator logs to confirm.)
Given that there are not enough extrinsics in the blocks to explain that, is it possible that you, for example, do some iteration in every block, reading some state? If not, that could indicate a bug in the state proof recorder that makes it include some data it shouldn't.
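As a quick sanity check on the "around 8GB per day" figure, the per-block average can be worked out as below. The 12-second block time is an assumption not stated in the thread, and 8GB is read as decimal gigabytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BLOCK_TIME_S = 12            # assumption: one parachain block every 12 seconds
DAILY_POV_BYTES = 8 * 10**9  # "around 8GB per day", read as decimal gigabytes

blocks_per_day = SECONDS_PER_DAY // BLOCK_TIME_S
avg_pov_bytes = DAILY_POV_BYTES / blocks_per_day
print(blocks_per_day, round(avg_pov_bytes))
```

Under these assumptions the average comes out slightly above 1MB per block, consistent with the claim above.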