polkadot-fellows / RFCs

Proposals for change to standards administered by the Fellowship.
https://polkadot-fellows.github.io/RFCs/
Creative Commons Zero v1.0 Universal
111 stars, 51 forks

Metadata for offline signers #46

Closed Slesarew closed 4 months ago

Slesarew commented 8 months ago

Add a digest of metadata to signed extensions as mentioned here https://github.com/paritytech/polkadot-sdk/issues/2016

carlosala commented 8 months ago

It'd be great to define how lemmas and indices are sorted when computing ShortMetadata. Then we could implement other shorteners (and proof generators) in any language with the same output, and every signer could modify the data and transmit it to the device as needed, without caring about who created the hashes. In the current implementation, indices are sorted by the blake3 hash of their associated entry (compared byte by byte), and lemmas are sorted by their index in descending order. I propose sorting indices in ascending order (matching the registry order).
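The two orderings under discussion can be sketched as follows. This is an illustrative sketch only: `hash_of` stands in for "blake3 hash of the associated entry", and all names here are assumptions, not the RFC's actual API.

```rust
// Current implementation: indices ordered by the hash of their entry,
// compared byte by byte ([u8; 32] compares lexicographically in Rust).
fn sort_indices_current(indices: &mut [u32], hash_of: impl Fn(u32) -> [u8; 32]) {
    indices.sort_by_key(|&i| hash_of(i));
}

// Proposal: indices in ascending order, matching the registry order.
fn sort_indices_proposed(indices: &mut [u32]) {
    indices.sort_unstable();
}

// Current implementation: lemmas ordered by their index, descending.
fn sort_lemmas_current(lemmas: &mut [(u32, Vec<u8>)]) {
    lemmas.sort_by(|a, b| b.0.cmp(&a.0));
}
```

The appeal of the proposed ascending order is that it requires no hashing to reproduce: any implementation that knows the registry can emit the same sequence.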

Slesarew commented 8 months ago

It'd be great to define how lemmas and indices are sorted when computing ShortMetadata.

Isn't it outside the RFC's scope? This is not related to chain behaviour; it's purely on the cold-protocol side. We can spec it out elsewhere.

carlosala commented 8 months ago

Isn't it outside the RFC's scope? This is not related to chain behaviour; it's purely on the cold-protocol side. We can spec it out elsewhere.

Yep, it's probably outside of the chain's scope. Let's keep it in mind, though; it's actually important to have a standardized way of transmitting data.

carlosala commented 8 months ago

@Slesarew still missing the changes proposed above!

xlc commented 7 months ago

To confirm my understanding is correct: an algorithm is described to chunk FRAME metadata and generate a binary merkle tree from it. The root of this metadata merkle tree, the spec version, the spec name, and other information are hashed together to create a versioned digest. The extrinsic format is updated to include this versioned digest as part of the signing payload. The signer is expected to receive a small set of metadata chunks sufficient to decode the extrinsic; it uses the received chunks and a merkle proof to reconstruct the versioned digest and includes it in the signing payload. The runtime is expected to verify that the signing payload is valid.
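The digest-construction step of that flow can be sketched roughly as below. Everything here is an assumption for illustration: the real scheme presumably hashes SCALE-encoded data with blake3, whereas this sketch uses a stdlib stand-in hash so it runs anywhere.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in hash so the sketch runs with the standard library only;
// NOT the hash the RFC actually specifies.
fn placeholder_hash(data: &[u8]) -> [u8; 32] {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    let word = hasher.finish().to_le_bytes();
    let mut out = [0u8; 32];
    out[..8].copy_from_slice(&word);
    out
}

// The digest commits to the metadata merkle root plus chain identity.
// A signer that reconstructs the same root from chunks + proof can
// recompute the identical digest and put it in the signing payload;
// the runtime recomputes it independently to verify.
fn versioned_digest(metadata_root: &[u8; 32], spec_version: u32, spec_name: &str) -> [u8; 32] {
    let mut preimage = Vec::new();
    preimage.extend_from_slice(metadata_root);
    preimage.extend_from_slice(&spec_version.to_le_bytes());
    preimage.extend_from_slice(spec_name.as_bytes());
    placeholder_hash(&preimage)
}
```

The key property is that any change to the metadata, spec version, or spec name changes the digest, so a transaction signed against the wrong metadata fails runtime verification.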

tomaka commented 7 months ago

I've now re-read this RFC multiple times. I know what a trie is, I know what a Merkle proof is and the details of the format of a Merkle proof in Substrate, and I know what the metadata is and what is found inside. Yet as I've already commented, I don't see how one could possibly implement either generating or parsing this compact metadata by reading this RFC alone.

This RFC uses a lot of terminology whose meaning I don't know (either generally or in this context), such as "token unit String", "modularization", "merging protocol", or "shortening". All these terms should really be introduced or explained.

In the middle of the RFC the word "leaves" suddenly appears, but I don't think the RFC ever introduces the fact that there is a trie. I understand that the metadata is split into a trie, but only because I followed the development of this feature in the past; I don't think it's even mentioned in the RFC.

It would also have been nice to give a brief overview of what the metadata contains, rather than assume the reader is familiar with it, because I don't think there is a single document online that describes the metadata in detail. This would make it possible to review what is kept and what is stripped.

It seems that this RFC is just a collection of notes written down while the implementation was taking place, rather than an actual presentation and explanation of the design. It seems to assume that the reader is already familiar with the implementation, and in general it makes no effort to be accessible.

It's consequently almost impossible for me to properly review this.

bkchr commented 7 months ago

I support what @tomaka is saying. IMO it requires much more information, such as the actual data structures in pseudocode, etc.

It would also have been nice to give a brief overview of what the metadata contains, rather than assume the reader is familiar with it, because I don't think there is a single document online that describes the metadata in detail. This would make it possible to review what is kept and what is stripped.

I would say that the actual metadata is out of scope. However, we should explain what it is, as Pierre said, and then explain the structures that are put into the trie, etc.

Slesarew commented 6 months ago

I still miss a real explanation of the types that are part of the shortened metadata, as they are crucial for the protocol being described here.

Just realized this was the main question. The types are exactly the same as they are defined in frame-metadata, so that we don't have to re-implement every single decoding tool. The only difference is that the docs are left blank.

bkchr commented 6 months ago

Just realized this was the main question. The types are exactly the same as they are defined in frame-metadata, so that we don't have to re-implement every single decoding tool. The only difference is that the docs are left blank.

But the types in frame-metadata probably change per version. There should be a fixed representation of the types being used in this RFC.

Slesarew commented 6 months ago

Just realized this was the main question. The types are exactly the same as they are defined in frame-metadata, so that we don't have to re-implement every single decoding tool. The only difference is that the docs are left blank.

But the types in frame-metadata probably change per version. There should be a fixed representation of the types being used in this RFC.

OK, I've mentioned that we imitate MetadataV14, and I've also added the structure.

bkchr commented 6 months ago

I already posted this yesterday internally in some chat; I also want to make it public here. I'm still not convinced that you can implement this RFC by reading the RFC alone. To actually implement it you will need to look up all the types that are not mentioned here in the SCALE codec crate. I would personally also lay out a more customized format for the types that would, for example, remove the docs field everywhere. Currently the docs are emptied everywhere, but that means a byte is still used for each encoded type's information.

Slesarew commented 6 months ago

I already posted this yesterday internally in some chat; I also want to make it public here. I'm still not convinced that you can implement this RFC by reading the RFC alone. To actually implement it you will need to look up all the types that are not mentioned here in the SCALE codec crate.

Do you suggest we just place the whole frame-metadata documentation here? We are trying to reuse the types, as has been mentioned many times, because this way we reuse old decoding code and speed up adoption.

I would personally also lay out a more customized format for the types that would, for example, remove the docs field everywhere. Currently the docs are emptied everywhere, but that means a byte is still used for each encoded type's information.

And this would break exactly that idea, for what? It's typically 20 chunks, at most 50 for heavy XCM calls, so we would save around 50 bytes out of a few kB of chunks and proof data. That data never goes anywhere near the chain; it is just transferred to the cold device and consumed there, together with auxiliary data, erasure overheads, and error correction codes (or flashed momentarily at runtime build time to generate the digest constant).
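For context on the byte being argued over: if the emptied docs field is a `Vec<String>` in SCALE encoding (my assumption here), an empty vector still encodes as a single compact length prefix byte, `0x00`. A minimal sketch of compact length encoding for small values shows this:

```rust
// Minimal sketch of SCALE compact-length encoding for values < 2^14,
// enough to show that an emptied docs vector (length 0) still costs
// one byte per encoded type entry. Larger modes are omitted.
fn compact_len(n: u32) -> Vec<u8> {
    match n {
        // Single-byte mode: value shifted left 2, mode bits 0b00.
        0..=63 => vec![(n as u8) << 2],
        // Two-byte mode: (value << 2) | 0b01, little-endian u16.
        64..=16383 => {
            let v = (n << 2) | 0b01;
            vec![v as u8, (v >> 8) as u8]
        }
        _ => unimplemented!("four-byte and big-integer modes omitted in this sketch"),
    }
}
```

So dropping the docs field from the format entirely, as bkchr suggests, would save exactly that one `0x00` byte per type, which is the trade-off Slesarew weighs against keeping frame-metadata type compatibility.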

Slesarew commented 6 months ago

We've also discussed this privately and will soon change the format slightly to better adhere to metadata V15. This will remain backward-compatible with V14, as well as with non-shortened metadata itself. I'll commit the update soon.

archtuxfan commented 4 months ago

No update since Jan 30?

Slesarew commented 4 months ago

No update since Jan 30?

My fellow archer, I've been informed that the Fellowship intends to redo this work internally based on this proposal; thus this (which is certainly a finished text as far as I can tell) will now be superseded by another PR. The latter is supposed to be voted on and merged quite soon, as far as I understand, so that's good news. I think this PR would just be closed afterwards in case of success.

archtuxfan commented 4 months ago

No update since Jan 30?

My fellow archer, I've been informed that the Fellowship intends to redo this work internally based on this proposal; thus this (which is certainly a finished text as far as I can tell) will now be superseded by another PR. The latter is supposed to be voted on and merged quite soon, as far as I understand, so that's good news. I think this PR would just be closed afterwards in case of success.

My understanding is that this needs to pass before the Ledger app gets released. Wondering if anyone has a sense of the ETA?

ggwpez commented 4 months ago

Isn't this superseded by https://github.com/polkadot-fellows/RFCs/pull/78?

bkchr commented 4 months ago

Yes.