paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/

Metadata V16: Introduce a `Core_hash` runtime API for hashing extrinsics #5468

Closed: lexnv closed this issue 1 week ago

lexnv commented 3 weeks ago

While the collection of associated types in metadata V16 will expose information like "what is the name of the hasher", the metadata is not capable of exposing code functionality yet. At the moment, the WASM blob approach of https://github.com/paritytech/polkadot-sdk/issues/4714 looks complicated, and it is unclear whether it will make the cut for v16 or v17.

It would be beneficial (and easier) for users like PAPI and Subxt to call into a runtime API function that hashes the given bytes with the same hasher the chain uses.
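As a rough illustration of the idea (not a committed design), such an API could be declared with `sp_api::decl_runtime_apis!`. The `HashingApi` trait name and the `H256` output below are assumptions for the sketch; a real version might instead extend the existing `Core` API, as the title suggests:

```rust
// Hypothetical sketch: a runtime API with a single method that hashes
// arbitrary bytes with the chain's configured hasher.
sp_api::decl_runtime_apis! {
    pub trait HashingApi {
        /// Hash `data` with the same hasher the runtime uses for extrinsics.
        fn hash(data: Vec<u8>) -> sp_core::H256;
    }
}
```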

First mentioned in: https://github.com/paritytech/polkadot-sdk/issues/4520#issuecomment-2236855735

Thanks for the suggestion and feedback @josepot and @bkchr 🙏

xlc commented 3 weeks ago

I am not sure this is a good idea, nor whether this problem should be addressed at the runtime API or the metadata level. In any case, it will be super inefficient.

josepot commented 3 weeks ago

> it will be super inefficient

It's meant to be used as a fallback, just in case the library (papi, subxt, pjs, etc.) doesn't support the "name of the hasher" described by the metadata. IMO it shouldn't be that inefficient if the DApp is built using the light client, because the WASM of the runtime should already be compiled and running, either in a worker or in the same thread...
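A minimal sketch of that fallback logic, in Rust for illustration; the hasher name strings and the injected `fallback` closure (standing in for the hypothetical runtime API call) are assumptions:

```rust
use sp_core::hashing::{blake2_256, keccak_256};

// Hash extrinsic bytes locally when the metadata names a hasher the library
// knows; otherwise delegate to the runtime, which by construction always
// agrees with itself.
fn hash_extrinsic(
    hasher_name: &str,
    bytes: &[u8],
    fallback: impl Fn(&[u8]) -> [u8; 32],
) -> [u8; 32] {
    match hasher_name {
        // Illustrative names; the exact strings would come from metadata V16.
        "Blake2-256" => blake2_256(bytes),
        "Keccak-256" => keccak_256(bytes),
        _ => fallback(bytes),
    }
}
```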

xlc commented 3 weeks ago

ok, I can see where this could be useful. I'm still unsure whether the amount of change required is justifiable for the use cases, but I won't complain if someone decides to do this.

burdges commented 3 weeks ago

Are we talking about hashing for Merkle trees or for leaves, or both? Are you looking at blocks or state, or both?

We'll need abstraction for sure, but I'm not sure when we'll understand the model, or how well PAPI and Subxt can even explore the stranger applications.

In the future, our regular Merkle trees should switch to bare compression functions, like `blake3_compress([u8; 64], tree_pos) -> [u8; 32]`, which you'd usually invoke in parallel batches for SSLE. See NOMT.

Any optimized conventional tree abandons the internal nodes and adopts radix 2, so that's a pretty different Merkle proof. We've discussed a flavor that employs an off-chain unmerkleized index and gives extremely flat trees, but costs three Merkle proofs during new insertions, which alters the Merkle proofs even further.
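For concreteness, here is a toy radix-2 root computation over a two-to-one compression function. `compress` is a stand-in (`blake2_256` keeps the sketch runnable, where a real tree would use something like blake3's compression function), and promoting an odd trailing node unchanged is just one of several conventions:

```rust
use sp_core::hashing::blake2_256;

// Stand-in for a bare two-to-one compression function.
fn compress(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    let mut buf = [0u8; 64];
    buf[..32].copy_from_slice(left);
    buf[32..].copy_from_slice(right);
    blake2_256(&buf)
}

// Fold a layer of leaves up to a single radix-2 root.
fn merkle_root(mut layer: Vec<[u8; 32]>) -> [u8; 32] {
    assert!(!layer.is_empty(), "need at least one leaf");
    while layer.len() > 1 {
        layer = layer
            .chunks(2)
            .map(|pair| match pair {
                [l, r] => compress(l, r),
                // Odd trailing node promoted unchanged.
                [l] => *l,
                _ => unreachable!(),
            })
            .collect();
    }
    layer[0]
}
```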

Zcash-like Pedersen hashes would be extremely expensive, but zk stuff favors Poseidon these days, which uses arithmetic mod a 256-bit prime and always has radix 4 or 8. It only gets stranger after that.

In the future, some blocks would incorporate batching of transaction details, like signatures, but maybe Merkle proofs using stranger schemes. This means users cannot verify some aspects of a block without downloading the whole block, although you could always do before & after state proofs.

Maybe you want some higher-level interface here?

bkchr commented 3 weeks ago

I also don't see this as necessary. I know where you are coming from; I was there 5 years ago. The truth is that people don't use any kind of niche/unknown hashing function.

Using the name of the hashing function will be good enough.

josepot commented 3 weeks ago

sorry, I was under the wrong impression that exposing the hashing function through a runtime API was going to be a super easy change. If that's not the case, then I can totally live with having a list of supported hashing functions (Keccak-256 and Blake2-256 for now) and making it grow if need be, no problem. As long as I can introspect it from a property of the metadata, I'm good.

bkchr commented 3 weeks ago

> sorry, I was under the wrong impression that exposing the hashing function through a runtime API was going to be a super easy change.

I mean, we could generate it with the macros or whatever. However, then you would always need to call into WASM to hash anything. We could still do this in the future if there end up being many different hashing functions, but I doubt that will happen.
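If it were generated, the runtime side would presumably just delegate to `frame_system`'s configured hasher, roughly like this fragment inside `impl_runtime_apis!` (reusing the hypothetical `HashingApi` sketched above, and assuming the runtime's `Hash` type is `H256`):

```rust
impl_runtime_apis! {
    impl crate::HashingApi<Block> for Runtime {
        fn hash(data: Vec<u8>) -> sp_core::H256 {
            use sp_runtime::traits::Hash;
            // Delegate to the hasher the runtime is configured with, so
            // clients and runtime can never disagree.
            <Runtime as frame_system::Config>::Hashing::hash(&data)
        }
    }
}
```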

lexnv commented 1 week ago

Thanks for the input everyone! Closing this in favor of: https://github.com/paritytech/polkadot-sdk/issues/4519, feel free to reopen if needed 🙏