Open tomaka opened 2 years ago
Relevant comment in smoldot: https://github.com/paritytech/smoldot/blob/7f6c36960c5524a85fb218f039c7c67126a58ddd/src/author/runtime.rs#L720-L726
As we already had this topic multiple times, the Substrate extrinsic format is already prefixed with the length, aka like a Vec<u8>
.
If you just use the opaque extrinsic, you can decode this without any problems. Other locations like apply_extrinsic
also use the same semantic.
the Substrate extrinsic format is already prefixed with the length
No, the Polkadot, Kusama, Westend and Rococo runtimes have an extrinsic format prefixed with their length, but this is a leak in the abstraction layers that isn't guaranteed to be true and relies literally on one line in each runtime.
This is no leak in abstraction layers. We have opaque types for the client and the normal types in the runtime. You just need to know the type that is being used.
And you are right, that everybody could change this in its Substrate based blockchain, but that is fine because you then need to specify this type.
Shouldn't we try to remove usage of runtime-specific types in the client, so that the runtimes can properly update without the client needing an update as well?
Right now if we changed type Extrinsic = ;
to a different type, all non-up-to-date clients would break.
I just don't understand. We know it's a flaw, we know that it can backfire at the worst possible moment, we know it's easy to fix, but we intentionally don't fix it.
Shouldn't we try to remove usage of runtime-specific types in the client, so that the runtimes can properly update without the client needing an update as well? Right now if we changed
type Extrinsic = ;
to a different type, all non-up-to-date clients would break.
The transaction encoding, as the hash type or block number are things that you need to choose before and can not easily change, aka forking or writing complex migrations.
The transaction format is already chosen in a way that it is prefixed with the length of the tx, aka it is already scale encoded. This is exactly the same discussion as we had multiple times regarding transaction pool, because there we use the same format. If we would change here the return type to a new format, we would need to update the db, BlockBuilderApi
, TransactionPoolApi
and everything else that uses the transaction type.
This is exactly the same discussion as we had multiple times regarding transaction pool, because there we use the same format
This is not the same discussion.
The return value of inherent_extrinsics()
is the only single place where a client needs to know how to decode an extrinsic.
Everything else, be it the networking, the transactions pool, JSON-RPC, etc. can be implemented by passing around encoded extrinsics in an opaque way.
But you can't possible parse the value returned by inherent_extrinsics()
without having to decode extrinsics.
If we would change here the return type to a new format, we would need to update the db, BlockBuilderApi, TransactionPoolApi and everything else that uses the transaction type.
No, all I'm saying is that inherent_extrinsics()
alone should return the encoded extrinsics. Then the client can decode them if it really wants to. Nothing else needs to be changed.
It's purely about being able to understand the value that inherent_extrinsics()
writes in the memory of the Wasm.
What is with https://github.com/paritytech/substrate/blob/master/client/network/src/transactions.rs#L353-L359
The transaction pool's API uses decoded extrinsics, but we could easily refactor it to keep them in their opaque encoded form. Nothing in the client requires knowing what an extrinsic actually is. It's only the runtime that ever interprets extrinsics.
But these lines in the transaction pool gossipping protocol uses Vec<Extrinsic>
, aka it needs to know the format of Extrinsic
to decode this vector.
Nothing in the client requires knowing what an extrinsic actually is. It's only the runtime that ever interprets extrinsics.
Yes, that is right. However, if we want to change this we need to refactor quite a lot of code.
But these lines in the transaction pool gossipping protocol uses Vec
, aka it needs to know the format of Extrinsic to decode this vector.
Ah, right. In the past I changed the protocol in order to send transactions one by one (so the length of the vec would always be 1), which is better for the substreams multiplexing we're doing, but this has been changed back at some point (not by me).
The
inherent_extrinsics
function returns aVec<Extrinsic>
.The SCALE encoding of a
Vec<Extrinsic>
consists in the number of elements followed by each element one by one. This means that if you don't know how to decode anExtrinsic
, you can't find, within the encoding of aVec<Extrinsic>
, where each extrinsic ends and the next one starts, makingVec<Extrinsic>
undecodable.In order to more de-couple the client from the runtime, I would suggest that
inherent_extrinsics
returns some sort ofVec<EncodedExtrinsic>
whereEncodedExtrinsic
wraps around aVec<u8>
and represents a SCALE-encoded extrinsic.