paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/
1.66k stars 593 forks source link

Custom child trie #2605

Open tdelabro opened 7 months ago

tdelabro commented 7 months ago

Hey,

After watching this talk by @shawntabrizi, I was under the impression that Substrate offered a smooth API in order to implement our own Trie logic. Like a trait I would be able to implement and that will be injected into the rest of the execution logic. The core feature being the way the root hash would be computed.

After looking through the code I'm now thinking that it is not possible without forking Substrate. The code is designed to only work with the specific struct used and not some abstracted interface. We can see it here: https://github.com/paritytech/substrate/blob/367dab0d4bd7fd7b6c222dd15c753169c057dd42/primitives/state-machine/src/trie_backend_essence.rs#L497 and here

```rust
pub enum ChildType {
/// If runtime module ensures that the child key is a unique id that will
/// only be used once, its parent key is used as a child trie unique id.
ParentKeyId = 1,
}

where there can be only one type of ChildTrie.

What is the situation for this feature. Is it something parity plan on implementing at some point? What is my best path if I need something like this right now?

burdges commented 7 months ago

We clearly need other storage types. You might check out Tuxedo, which should presumably be using both the substrate one for messaging, and something optimized for UTXOs. cc @JoshOrndorff

bkchr commented 7 months ago

Is it something parity plan on implementing at some point? What is my best path if I need something like this right now?

Maybe you could start with explaining what you need. Currently this is not supported and the code base is currently more evolving into a different direction. However, please explain what you need and then we can discuss it.

tdelabro commented 7 months ago

Maybe you could start with explaining what you need.

Sure. I'm working on Madara, which uses Substrate to build a Starknet chain, in the same way frontier does for Ethereum. Like them we have a wrapped block that we create by running tx in a VM inside a pallet, and push to the substrate digest.

Starknet specification also ask yo use a modified MPT for the multiple different storages we have. Here they are: https://docs.starknet.io/documentation/architecture_and_concepts/Network_Architecture/starknet-state/#merkle_patricia_tree.

This trie is of an other type than the Substrate one. Which mean that I have to build it myself at some other point in order to get it's root hash. So what I would like is to be able to tell to substrate that I want to use a substrie of this specific type and that when it computes it's root, it should use the implementation I provided. Then substrate can keep using it's default way to compute it's top level storage root. But at least it would spare the execution of a root I have no interest in for this specific subtree.

Also the spec is that the contracts storage is a tree, with each leave being the root hash of each single individual contract own trie. Meaning I would also need nested-child-tries in order to achieve that, which I believe are not implemented either.

So right now, what I will do is compute those roots values asynchronously, outside of the runtime. What is your opinion about storing all of my storage in a single ValueStorage, this way less time is lost navigating and computing the substrate MPT?

bkchr commented 7 months ago

What is your opinion about storing all of my storage in a single ValueStorage, this way less time is lost navigating and computing the substrate MPT?

Yeah, depending on the size of the individual trees, this is also something I would have proposed.

A proper way forward would be something like this: https://github.com/paritytech/polkadot-sdk/issues/245 So, you would store the individual items, but then build the trie in the runtime as you would do it with your one storage value right now, but hopefully more performant.

tdelabro commented 7 months ago

A proper way forward would be something like this: https://github.com/paritytech/polkadot-sdk/issues/245

Thanks for the resource. When is this planed for release?

burdges commented 7 months ago

It doesn't require anything from the relay chain. It's doable now by manually using static muts and unsafe, or spin::Mutex. We'll do this for batch verifiers for cryptography for example. https://github.com/Off-Narrative-Labs/Tuxedo/pull/105#issuecomment-1741538046

burdges commented 7 months ago

In this case, you should ask how parachains provide their own hostcalls which cannot be confused with relay chain PVF host calls.

tdelabro commented 7 months ago

@burdges I looked at the link and the PR but this is unclear to me. @bkchr solution seems to indicate that I have to create new host functions for my specific storage, which seems logic to me. Are you saying that no additional host function are needed?

Also I don't plan to be a parachain. Our settlement is done in an other way, using the provable nature of our VM execution.