paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/

Signed Metadata #291

Closed Slesarew closed 2 months ago

Slesarew commented 2 years ago

Currently the Signer needs a trusted authority to verify the correctness of the metadata used. We could make this trustless if we ~require~ allow (edit 1: the Signer should not be allowed to sign transactions with a trivial blank metadata hash then) every transaction to include the metadata hash in the signed payload - this way the chain will constantly verify metadata validity instead of a trusted party.

This adds very little additional cost, deprecates a lot of complex and confusing logic in the Signer, and makes automated generation of updates possible. The metadata portal could be reduced to a simple untrusted page with a generator script, since any attack on it would only lead to a DoS of the Signer, which is the desired behavior.

bkchr commented 2 years ago

This is about creating a signed extension that passes the metadata hash as additional signed data to the signature creation.

The signed extension should be fairly simple to implement; the problem for now is how to get the metadata hash in an efficient way.
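To make the mechanism concrete, here is a minimal, self-contained sketch (not the actual SignedExtension trait; toy_hash and sign are stand-ins for blake2_256 and a real signature scheme): the metadata hash goes into the bytes that are signed but is never transmitted, and the runtime rebuilds the same bytes with its own hash, so a mismatch invalidates the signature.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for a real 256-bit hash such as blake2_256.
fn toy_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// What the wallet actually signs: the call data plus additional signed data
/// (here only the metadata hash; the real payload also carries spec_version,
/// genesis hash, mortality, etc.).
fn signing_payload(call: &[u8], metadata_hash: u64) -> Vec<u8> {
    let mut payload = call.to_vec();
    payload.extend_from_slice(&metadata_hash.to_le_bytes());
    payload
}

/// Stand-in for a real signature: hash the payload together with a secret.
fn sign(payload: &[u8], secret: &[u8]) -> u64 {
    let mut data = payload.to_vec();
    data.extend_from_slice(secret);
    toy_hash(&data)
}

fn main() {
    let secret = b"alice";
    let call = b"Balances::transfer(BOB, 10)";

    // Offline signer: hash of the metadata it used to decode and display the call.
    let signer_metadata_hash = toy_hash(b"metadata shown to the user");
    let signature = sign(&signing_payload(call, signer_metadata_hash), secret);

    // Runtime: rebuilds the payload with the hash of *its* metadata. The hash is
    // never sent with the transaction, so it only matches if both sides agree.
    let runtime_metadata_hash = toy_hash(b"metadata shown to the user");
    let expected = sign(&signing_payload(call, runtime_metadata_hash), secret);

    assert_eq!(signature, expected, "a different metadata hash breaks the signature");
    println!("signature is only valid when signer and runtime used the same metadata");
}
```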

kirushik commented 2 years ago

In general I like the idea; this would also probably supersede the transaction_version field.

What worries me, though, is how this is going to work "at the margins" for transactions around the moment a runtime upgrade is enacted: will we invalidate the whole transaction pool immediately on each minor runtime upgrade? Will we have a grace period around this, accepting both? Would we allow issuing transactions for a not-yet-accepted runtime into the pool (probably waiting for the relevant upgrade to be enacted)?

Slesarew commented 2 years ago

We could match the metadata to the block referenced in the mortality for mortal transactions. Not for immortal ones though. Are immortal ones important?

bkchr commented 2 years ago

In general I like the idea; this would also probably supersede the transaction_version field.

Not really; a runtime upgrade, aka changing the metadata, doesn't mean the transaction version changed.

What worries me, though, is how this is going to work "at the margins" for transactions around the moment a runtime upgrade is enacted: will we invalidate the whole transaction pool immediately on each minor runtime upgrade? Will we have a grace period around this, accepting both? Would we allow issuing transactions for a not-yet-accepted runtime into the pool (probably waiting for the relevant upgrade to be enacted)?

We are already rejecting all transactions from a previous runtime version. CheckSpecVersion is doing that.

xlc commented 2 years ago

The real question is how to have the signer verify the metadata hash trustlessly. Does it need to run a light client? But I am not sure if that is possible for an air-gapped wallet.

We also need to commit the metadata hash to on-chain state, which sounds like a complicated task to me. You will want to do it with on_runtime_upgrade instead of a separate tx, so you will need to build a runtime, get the metadata hash, and embed it into the runtime.

bkchr commented 2 years ago

The real question is how to have the signer verify the metadata hash trustlessly. Does it need to run a light client? But I am not sure if that is possible for an air-gapped wallet.

The point of this is that the signer doesn't need to be able to verify the metadata at all. It will build the tx based on the metadata that the user provided. The signature will then include the metadata hash, similar to what we do for spec_version. The runtime will then also add the metadata hash, and if that is a different hash, the signature verification will fail.

We also need to commit the metadata hash to on-chain state, which sounds like a complicated task to me. You will want to do it with on_runtime_upgrade instead of a separate tx, so you will need to build a runtime, get the metadata hash, and embed it into the runtime.

We could just include the metadata hash as part of the build process. Build the runtime once, get the metadata, generate the hash, build the runtime again and include the hash.
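A hedged sketch of what such a two-pass build could look like, assuming a hypothetical build.rs, a placeholder metadata path and a toy hash in place of the real one (e.g. blake2_256); none of this is the actual polkadot-sdk tooling:

```rust
// Hypothetical build.rs: read the metadata produced by a previous build pass,
// hash it, and emit a constant that the next build pass compiles into the
// runtime. The path, the toy hash and the constant name are illustrative only.
use std::{env, fs, path::Path};

fn main() {
    // Placeholder: SCALE-encoded metadata exported by the first build pass.
    let metadata = fs::read("target/runtime-metadata.scale").unwrap_or_default();

    // Stand-in for a real 256-bit hash such as blake2_256.
    let hash = metadata
        .iter()
        .fold(0u64, |acc, b| acc.wrapping_mul(31).wrapping_add(*b as u64));

    // The runtime could then `include!(concat!(env!("OUT_DIR"), "/metadata_hash.rs"))`.
    let out = Path::new(&env::var("OUT_DIR").unwrap()).join("metadata_hash.rs");
    fs::write(out, format!("pub const METADATA_HASH: u64 = {hash};")).unwrap();
}
```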

kianenigma commented 2 years ago

I'm kinda inclined to make this happen on the runtime and give a push to the ecosystem (Polkadot and Kusama included) to catch up. I think this will instigate more novel ways to distribute the metadata as well. Seems like a solvable problem, just stuck in a stalemate.

jak-pan commented 2 years ago

Is there any progress on this?

It looks like this will greatly enhance UX for everybody, including developers and users. I'm currently working on the implementation for HydraDX and Basilisk and we would love to see more widespread use of Parity Signer so users can seamlessly use it between the parachains.

jsdw commented 1 year ago

In subxt we generate "shape based hashes" of metadata, down to individual calls, constants, storage entries etc (but also at the level of pallets and the entire metadata). See https://github.com/paritytech/subxt/blob/e711dc15829adf59b72aa7f3900ba8305eaf76fb/metadata/src/utils/validation.rs#L354.

These hashes are "shape based" because they ignore the specific type IDs of things, meaning that hashing e.g. a pallet will keep returning the same result even if you strip a bunch of other stuff out of the metadata (doing so would shift around all of the type IDs).

The hashing we do in Subxt doesn't need to be quite so security conscious; it just exists to do a rough check that the code Subxt generates to interact with a node actually lines up with the node's current metadata (hence we use e.g. the twox hasher rather than blake2, for speed).

The advantage of this per-pallet hashing is that the metadata can be stripped down, and only the relevant parts of it distributed to signers or whatever, all while the hashes generated remain consistent and can be verified.
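As a toy illustration of the "shape based" idea (Subxt's real implementation in the linked validation.rs covers the full metadata type graph and uses twox; the Shape enum and DefaultHasher here are only stand-ins): the hash is computed from the resolved structure of a type rather than its numeric ID, so re-numbering the registry leaves it unchanged.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// A tiny stand-in for scale-info's type registry: a type is either a
// primitive or a composite referring to other types by numeric ID.
enum Shape {
    U32,
    Composite(Vec<u32>),
}

fn shape_hash(id: u32, registry: &[(u32, Shape)]) -> u64 {
    let (_, shape) = registry.iter().find(|(i, _)| *i == id).expect("type exists");
    let mut h = DefaultHasher::new();
    match shape {
        Shape::U32 => "u32".hash(&mut h),
        Shape::Composite(fields) => {
            "composite".hash(&mut h);
            // Hash the *shapes* of the fields, never their numeric IDs.
            for field in fields {
                shape_hash(*field, registry).hash(&mut h);
            }
        }
    }
    h.finish()
}

fn main() {
    // The same shapes registered under completely different type IDs.
    let a = [(0, Shape::U32), (1, Shape::Composite(vec![0, 0]))];
    let b = [(7, Shape::U32), (9, Shape::Composite(vec![7, 7]))];
    assert_eq!(shape_hash(1, &a), shape_hash(9, &b));
    println!("same shape, different type IDs, same hash");
}
```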

So the rough approach using this sort of hashing would be:

If this is desirable, my team could:

Other notes:

From there, hopefully writing a signed extension isn't too tricky (we have no experience doing that yet but might be able to help there too). And then libraries like Subxt would just lean on this code rather than using their own logic, and Niklas also suggested that such hashes could be part of e.g. the polkadot changelog or whatever if we wanted, so that people could compare and see at a high level what's changed shape between updates.

Does that sound like an approach worth pursuing here?

niklasad1 commented 1 year ago

Niklas also suggested that such hashes could be part of e.g. the polkadot changelog or whatever if we wanted, so that people could compare and see at a high level what's changed shape between updates

Yeah, my point is that the CHANGELOG in polkadot doesn't show whether a pallet had breaking changes, because the pallets are not following semver versioning; a metadata hash per pallet could help with that.

It would help us to detect breaking changes in tools such as staking-miner-v2, and help others as well I reckon. Not sure how others deal with it; there is also subwasm which can be used, but it would be much more user-friendly to have a list of the breaking changes in the CHANGELOG.

bkchr commented 1 year ago

I still imagine that we put the metadata into a merkle tree. We would need to find out which information we want to have as part of this tree, or if we just dump the entire metadata there. The root of this tree can be put into the wasm file as the "metadata-root". This metadata root will then later be used to ensure that the online wallet has shown the correct data to the user before giving the info to the offline signer, as the offline signer would include the metadata root in the tx. Then the runtime can just reject extrinsics that use some unknown metadata root.

When the online wallet creates some extrinsic, it would need to have some kind of "recorder" running that records all the accesses to the metadata tree. This recording is then used as a proof and sent to the offline signer. The offline signer would use this proof to decode the extrinsic data and show it to the user. The root of this proof is the metadata root that is then included in the extrinsic.

All of this requires some experimentation and work with some wallets to find out what metadata info they need and whether we can build such a proof. How big would such a proof be? Would the proof be small enough to always send it to the offline signer? (This would for example remove the requirement to scan the entire metadata from the offline signer, but it is only worth it if the transmission speed is fast enough to do it always.) How would we merkelize the metadata?
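A very condensed illustration of the recorder idea, under simplifying assumptions (a flat commitment instead of a real merkle tree, std hashes instead of blake2): the hot device ships only the metadata entries it actually touched plus the hashes of everything else, and the cold device can still recompute the root the chain expects.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for a real cryptographic hash such as blake2_256.
fn h(bytes: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    bytes.hash(&mut s);
    s.finish()
}

/// One item of the "proof": either a full metadata entry (it was accessed
/// while decoding the extrinsic) or just that entry's hash.
enum ProofItem {
    Entry(Vec<u8>),
    HashOnly(u64),
}

/// The commitment both sides compute; a real design would be a merkle tree,
/// so unaccessed entries collapse into a few sibling hashes.
fn root(items: &[ProofItem]) -> u64 {
    let mut s = DefaultHasher::new();
    for item in items {
        match item {
            ProofItem::Entry(entry) => h(entry).hash(&mut s),
            ProofItem::HashOnly(hash) => hash.hash(&mut s),
        }
    }
    s.finish()
}

fn main() {
    // The full metadata, known to the hot device (placeholder entries).
    let metadata: Vec<Vec<u8>> = vec![
        b"Balances::transfer metadata".to_vec(),
        b"Staking::bond metadata".to_vec(),
        b"System::remark metadata".to_vec(),
    ];

    // Hot device: decoding a Balances::transfer only touched entry 0, so the
    // recorder keeps that entry in full and reduces the rest to hashes.
    let proof: Vec<ProofItem> = metadata
        .iter()
        .enumerate()
        .map(|(i, e)| if i == 0 { ProofItem::Entry(e.clone()) } else { ProofItem::HashOnly(h(e)) })
        .collect();

    // Cold device / chain: both know the root of the full metadata, so a
    // tampered or omitted entry in the proof would change the recomputed root.
    let full: Vec<ProofItem> = metadata.iter().map(|e| ProofItem::Entry(e.clone())).collect();
    assert_eq!(root(&proof), root(&full));
    println!("the proof commits to the same metadata root as the full metadata");
}
```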

Slesarew commented 1 year ago

If the offline signer does not show the metadata to the user (by actually receiving it, using it to decode the transaction, then hashing it in its entirety to prove that the same metadata that is confirmed by network consensus was used for display), this feature is useless or harmful - here we assume that the online wallet is compromised and could show one transaction to the user while requesting the signing of another.

We need to pass the whole metadata to the offline signer for this to make sense. We can then cache it (and get rid of the messy verifier mechanism, as authenticity is checked in a trustless manner!) so that we do not need to send it each time, which is a reasonable approach for the Signer app or whatever it is called now, or actually make the effort to make sending fast enough, as in the Kampela project, where memory limitations do not allow storing many metadata instances simultaneously. Yes, this takes some effort to implement, but otherwise this is a false sense of security.

bkchr commented 1 year ago

We need to pass the whole metadata to the offline signer for this to make sense.

I don't see any reason why we need to pass the entire metadata when you only want to be able to decode this one transaction. You would check that the small part of the metadata is correct by using the "metadata storage proof". You get all the data from the online device, but you include the "metadata root" in the extrinsic and that makes it secure, as the chain will reject the transaction if this root doesn't match.

where memory limitations do not allow storing many metadata instances simultaneously

Which is even more of an argument for not needing to send the entire metadata.

Slesarew commented 1 year ago

But then if you do any checks of this kind on the hot device, you could have your metadata tampered with.

I think the solution could be a unified protocol for extracting the metadata subset required for decoding that could be repeated by both the hot device and chain nodes; we would not need to reference the tree root in this case, but we would have a proof that the hot device did its job honestly. Can we pre-define all branches of this structure ahead of time? I doubt it, considering at least that we have batch calls that make metadata addressing quite arbitrary.

This could save us a lot of headache actually.

But any protocol that relies on the honesty of the hot device in metadata chopping, without actually checking it, would break security.

bkchr commented 1 year ago

But then if you do any checks of this kind on the hot device, you could have your metadata tampered with.

I did not speak about any checks on the hot device. All the things I said above are secure and would ensure that the cold device used the correct metadata to decode the extrinsic, as otherwise the chain would reject the extrinsic if it didn't contain the correct metadata root (which declares the metadata and its content). This is basically the same way the storage of each block is secured.

Can we pre-define all branches of this structure ahead of time? I doubt it, considering at least that we have batch calls that make metadata addressing quite arbitrary.

We don't need to pre-define the branches. The hot device would record the branches it uses and send these to the cold device.

But any protocol that relies on the honesty of the hot device in metadata chopping, without actually checking it, would break security.

This doesn't rely on the hot device being honest. The metadata tree would be pre-defined by the metadata root and the chain knows this root, meaning that if the hot device omitted something, the root wouldn't match. Just read up on storage proofs and how they work, as this is exactly the same mechanism here.
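For readers unfamiliar with storage proofs, here is the mechanism in miniature as a toy binary merkle proof (std hashes and a hand-built four-leaf tree standing in for the real trie): the cold device only needs the leaf it cares about plus the sibling hashes along the path, and any tampering or omission changes the recomputed root.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn leaf(data: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    data.hash(&mut s);
    s.finish()
}

fn node(left: u64, right: u64) -> u64 {
    let mut s = DefaultHasher::new();
    left.hash(&mut s);
    right.hash(&mut s);
    s.finish()
}

/// Recompute the root from one leaf and its sibling hashes; `is_left[i]`
/// says whether the current node is the left child at level i.
fn verify(leaf_data: &[u8], siblings: &[u64], is_left: &[bool], expected_root: u64) -> bool {
    let mut current = leaf(leaf_data);
    for (sibling, left) in siblings.iter().zip(is_left) {
        current = if *left { node(current, *sibling) } else { node(*sibling, current) };
    }
    current == expected_root
}

fn main() {
    // A tiny four-leaf tree, built once by whoever holds the full metadata.
    let data: [&[u8]; 4] = [b"leaf a", b"leaf b", b"leaf c", b"leaf d"];
    let l: Vec<u64> = data.iter().copied().map(leaf).collect();
    let (n01, n23) = (node(l[0], l[1]), node(l[2], l[3]));
    let root = node(n01, n23);

    // Proof for "leaf c": its sibling "leaf d", then the hash of the left subtree.
    assert!(verify(b"leaf c", &[l[3], n01], &[true, false], root));
    assert!(!verify(b"tampered", &[l[3], n01], &[true, false], root));
    println!("the proof verifies only for data committed under the known root");
}
```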

Slesarew commented 1 year ago

Ok, then I probably misunderstood; this sounds good. Any ETA on this? This is a really important question that will influence the Kampela release time.

bkchr commented 1 year ago

No ETA, sadly. I would have liked to have this yesterday already, to remove the need for this metadata portal.

All of this requires some experimentation and work with some wallets to find out what metadata info they need and whether we can build such a proof. How big would such a proof be? Would the proof be small enough to always send it to the offline signer? (This would for example remove the requirement to scan the entire metadata from the offline signer, but it is only worth it if the transmission speed is fast enough to do it always.) How would we merkelize the metadata?

It requires someone doing what I have outlined here.

Polkadot-Forum commented 1 year ago

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/faq-ledger-apps-a-safe-generic-app-design/3150/6

lucasvo commented 1 year ago

I wanted to write a quick update on a conversation that happened on Matrix (in Kampela's channel: #Kampela:matrix.zymologia.fi) and say that there's some momentum to start experimenting with and implementing this. It would be great to get a technical contact on each wallet/signing team (@jacogr, Nova, Talisman, Polkadot Vault team) to jump in here on this thread so that any development can be coordinated and this migration can be done quickly.

Polkadot-Forum commented 1 year ago

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/the-ledger-app-debate-united-we-stand-divided-we-fall/3177/8

rphmeier commented 1 year ago

I still imagine that we put the metadata into a merkle tree. We would need to find out which information we want to have as part of this tree, or if we just dump the entire metadata there. The root of this tree can be put into the wasm file as the "metadata-root". This metadata root will then later be used to ensure that the online wallet has shown the correct data to the user before giving the info to the offline signer, as the offline signer would include the metadata root in the tx. Then the runtime can just reject extrinsics that use some unknown metadata root.

@jsdw Have you seen this?

The first step we need is to figure out the proper format for merklizing the metadata. Just to start the discussion, how about computing the metadata root hash based on the following key-value mapping: pallet_index ++ "C" ++ call_index => call metadata

We'd also want the merkle trie to contain some relevant subset of the type registry.
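A rough, purely illustrative sketch of that key scheme (the pallet/call indices and the tag byte are assumptions, not a finalized format), building the flat key-value view that a merkle structure would then commit to:

```rust
use std::collections::BTreeMap;

/// Key for a call's metadata: pallet_index ++ "C" ++ call_index.
/// (Other tag bytes, e.g. "S" for storage entries, would be an obvious
/// extension but are not part of the suggestion above.)
fn call_key(pallet_index: u8, call_index: u8) -> Vec<u8> {
    let mut key = vec![pallet_index];
    key.extend_from_slice(b"C");
    key.push(call_index);
    key
}

fn main() {
    // Flat key-value view; the values would be the SCALE-encoded call
    // metadata, here replaced by readable placeholders.
    let mut kv: BTreeMap<Vec<u8>, Vec<u8>> = BTreeMap::new();
    kv.insert(call_key(5, 0), b"Balances::transfer(dest, value)".to_vec());
    kv.insert(call_key(5, 3), b"Balances::transfer_keep_alive(dest, value)".to_vec());

    for (key, value) in &kv {
        println!("{:02x?} => {}", key, String::from_utf8_lossy(value));
    }
}
```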

Any thoughts? Is this being discussed anywhere else?

Slesarew commented 1 year ago

@rphmeier https://polkadot.polkassembly.io/post/1830 - yes it is, we'll submit an actionable proposal soon.

The trick is to trim the trie properly to avoid messy things like batch_call collecting the whole metadata, and to future-proof the whole thing against inflating too much. Merkle is actually not the only promising solution; we plan to dig into it soon.

rphmeier commented 1 year ago

@Slesarew Excellent! I will take a look over it

May I suggest opening an RFC at https://github.com/polkadot-fellows/RFCs with the proposal when ready?

bkchr commented 1 year ago

I have already started on some format. Basically I'm thinking about using some kind of binary tree where we keep the key outside of the trie. We would collect all the leaf nodes in some BTreeMap with Key => Node and then collect these into a binary tree. The tree would be automatically balanced in this case. The hash of a leaf node is H(Key, Data).

The data of a leaf node then depends on what we are encoding, aka more high level types or more low level types. We will need some kind of compression. A call that, for example, is only composed of u32, u32, u32 should be compressed into one leaf node. Basically we should check if a type id is reused somewhere and, if not, we can merge it into the type. Fundamental types are always suited to be merged as they can be represented using a u8/enum.

The root node should probably be the ExtrinsicMetadata plus the hashes of the children. The extrinsic metadata is the basic information that we will always require, so it makes sense to put it into the root.

Stuff like docs should be optional. By optional I mean they are in their own leaf node, so that certain implementations can fetch them and others can leave them out (like Ledger).

These are some rough ideas that we can look into. Especially we need to find out how the proof size scales.
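A minimal sketch of the construction described above, under stated assumptions (std hashes instead of a real 256-bit hash, placeholder strings for the encoded type blobs): leaf hashes are H(key, data), leaves are paired up level by level into a balanced binary tree, and the ExtrinsicMetadata is folded into the root.

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for a real cryptographic hash over a list of byte strings.
fn h(parts: &[&[u8]]) -> u64 {
    let mut s = DefaultHasher::new();
    for p in parts {
        p.hash(&mut s);
    }
    s.finish()
}

/// Build a balanced binary tree over the leaves; the key stays outside the
/// tree and only enters via the leaf hash H(key, data).
fn merkle_root(leaves: &BTreeMap<Vec<u8>, Vec<u8>>) -> u64 {
    let mut level: Vec<u64> =
        leaves.iter().map(|(k, v)| h(&[k.as_slice(), v.as_slice()])).collect();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                if let [l, r] = pair {
                    let (lb, rb) = (l.to_le_bytes(), r.to_le_bytes());
                    h(&[lb.as_slice(), rb.as_slice()])
                } else {
                    pair[0] // odd node is carried up unchanged
                }
            })
            .collect();
    }
    level[0]
}

fn main() {
    let mut leaves = BTreeMap::new();
    leaves.insert(b"Balances/transfer".to_vec(), b"<encoded call metadata>".to_vec());
    leaves.insert(b"Staking/bond".to_vec(), b"<encoded call metadata>".to_vec());
    leaves.insert(b"System/remark".to_vec(), b"<encoded call metadata>".to_vec());

    // The always-required ExtrinsicMetadata is hashed into the root node.
    let extrinsic_metadata = b"<encoded extrinsic metadata>";
    let root_hash = merkle_root(&leaves).to_le_bytes();
    let metadata_root = h(&[extrinsic_metadata.as_slice(), root_hash.as_slice()]);
    println!("metadata root: {metadata_root:#x}");
}
```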

rphmeier commented 1 year ago

We may also save space by using other kinds of cryptographic accumulators (e.g. RSA accumulators). Merkle tries are not the most space efficient accumulators, but are typically used in blockchain settings only because of the ease of update, removal in particular. Since the metadata accumulator needs to only be produced ahead of time (when producing the new runtime blob), we should be able to avoid doing any merkle trie traversals in the signing hardware.

I assume this is what @Slesarew meant by

Merkle is actually not the only promising solution, we plan to dig into it soon.

Some more details on

The trick is to trim the trie properly to avoid messy things like batch_call

would be interesting too. I suppose to avoid duplicating type-encoding information across many calls? This seems doable.

Slesarew commented 1 year ago

Yes, I meant accumulators. Something like this gives some general idea.

I suppose to avoid duplicating type-encoding information across many calls?

That is really not the trickiest part; batch_call encodes all calls as its enum variants; to fit into Ledger, we'll have to trim enums within calls. I don't see why we would run into any serious problems here, but it would have to be done carefully. There are many other calls with quite large variant sets, not as large of course, but still not fitting well in small spaces.

bkchr commented 1 year ago

That is really not the trickiest part; batch_call encodes all calls as its enum variants; to fit into Ledger, we'll have to trim enums within calls.

I don't get the problem. The argument to batch_call is basically the same information that you find in the ExtrinsicMetadata that I linked above. And for sure we don't want to put the entire Call enum into one blob; more like the outer enum with all its variants and the sub calls being in one blob. But each variant of the sub call (aka pallet call) links to a different blob that contains this information. So, for batch_call it will look like this:

[Call(.., Utility(ID(1), ..), ..), ID(0)], [Batch_All(Vec(ID(0))), ID(1)]

The type with ID(0) is the outer enum and the type with ID(1) is the batch_all variant. Stuff like Vec should be made a primitive type as well. This should save us quite some space/indirection.

We may also save space by using other kinds of cryptographic accumulators (e.g. RSA accumulators). Merkle tries are not the most space efficient accumulators

Yeah that sounds reasonable. In the end the accumulator should not be that important? And we can just see what gives the best results for us.

rphmeier commented 1 year ago

In the end the accumulator should not be that important?

It probably is, if we are talking about highly memory-constrained devices such as the Nano S. We could chew through that much memory pretty quickly, unless the implementation does a streaming traverse and discards the path.

Though RSA accumulators could easily require more memory, and bilinear accumulators require some trusted setup.

bkchr commented 1 year ago

I expressed myself badly. By "should not be that important" I meant for now, where we first need to figure out the data structures to express the metadata in small pieces. The type of accumulator should then be almost "data structure agnostic" and just needs to be chosen based on criteria like small memory usage etc.

jsdw commented 1 year ago

The root node should probably be the ExtrinsicMetadata plus the hashes of the children.

Just a small thing: given that the payload that needs signing is call_data + signed_extra_params + additional_params, should the metadata for the extra/additional stuff (https://github.com/paritytech/frame-metadata/blob/80ca5c2170928035e2adec9d2942a719086130fa/frame-metadata/src/v14.rs#L113) also be part of the thing that's hashed, to ensure tip/mortality etc. align?

bkchr commented 1 year ago

That is already part of the ExtrinsicMetadata, so it will be included.

richcsy commented 1 year ago

This conversation is continuing on the milestone level in the project created for it: Milestone 1.1: Publish refined requirements for protocol paritytech/substrate#1

Worth noting that the timeline view is provisional as there are unknowns which will be addressed as the project progresses. You can check it out here.

kianenigma commented 5 months ago

I believe this, and all related issues, can be closed now as https://github.com/polkadot-fellows/RFCs/blob/bkchr-merkelized-metadata/text/0078-merkleized-metadata.md is formalized.