paritytech / substrate

Substrate: The platform for blockchain innovators
Apache License 2.0
8.39k stars 2.65k forks source link

Extrinsic ID (hash) of currently executing extrinsic should be accesible in the runtime #5153

Open Sushisource opened 4 years ago

Sushisource commented 4 years ago

Someone submitting an extrinsic can easily obtain the id/hash of that extrinsic before submitting it to a validator by hashing it. (Side note: Why isn't this exposed as .hash() on UncheckedExtrinsic while it is on Block and Header?)

However, from within the runtime, one cannot easily determine the hash of the extrinsic currently being processed. This is important if, for example, some external system refers to extrinsics by these IDs. In our case, some extrinsics create a pending state, which is later resolved by another extrinsic. We could create our own identifiers for this purpose (and is what we do currently), but given extrinsics already have a unique ID it would be desirable to re-use this. However, I can't create a map of <extrinsic id, data> in state without being able to access the ID, hence the need.

If there's some workaround for getting this now, I would love to know. Alternatively, if it can't be done, I'd be happy to look into making a PR for the functionality.

tomusdrw commented 4 years ago

Extrinsics can be encoded and hashed within the runtime if needed, we are encoding them already here: https://github.com/paritytech/substrate/blob/3ca7ccef985407105d525b3ca2d1a79a820f3318/frame/executive/src/lib.rs#L280

Indeed from a pallet perspective, there is no way to tell - on that level we are operating on Call already rather than transactions.

Note also that hashes of extrinsics are not necessarily unique - there are no assumptions in the entire client code that the same extrinsic cannot be included twice (it's up to the runtime to decide). Even with FRAME when the account drops below existential deposit some extrinsics can be re-executed after the account is topped up (since on account reaping the nonce will be reset). So overall I'd be careful with relying on generic extrinsic hash as a unique identifier. That said we use that in transaction pool, as at the same time there can only be one such transaction.

For in-block extrinsics I think (blockhash, extrinsic index) is a better identifier or addressing scheme.

Sushisource commented 4 years ago

@tomusdrw In this case we would explicitly be preventing duplicates. And, unfortunately, the other suggested identifier cannot work because we have the need for the user submitting the transaction to be able to know its id before it is submitted (hence neither blockhash or index may be known).

If I made a PR that piped the encoded extrinsic such that it was available within a pallet, would that be a change that would be welcome?

Tangential question: Doesn't the account index and the pubkey being part of the extrinsic guarantee uniqueness? I'm just curious about the possibility of non-uniqueness among extrinsic IDs you mentioned, seems like something that probably should be mentioned in https://substrate.dev/docs/en/conceptual/node/extrinsics

gavofyork commented 4 years ago

@tomusdrw In this case we would explicitly be preventing duplicates. And, unfortunately, the other suggested identifier cannot work because we have the need for the user submitting the transaction to be able to know its id before it is submitted (hence neither blockhash or index may be known).

If I made a PR that piped the encoded extrinsic such that it was available within a pallet, would that be a change that would be welcome?

Tangential question: Doesn't the account index and the pubkey being part of the extrinsic guarantee uniqueness? I'm just curious about the possibility of non-uniqueness among extrinsic IDs you mentioned, seems like something that probably should be mentioned in https://substrate.dev/docs/en/conceptual/node/extrinsics

No. As @tomusdrw said, accounts whose balances drop too low can be reaped, reseting the account index.

And no the change isn't especially welcome. Extrinsics can be huge (some are 1.5MB +). Casually copying them around into new bits of storage is not a change to be made lightly given the resource-constrained environment. This is especially true if you (a) only care about getting the hash anyway (it could be determined and passed in prior to decode, or, at least, calculated while the extrinsic is still in scope, reducing the need to copy anything into temporary storage); and (b) it's not at all clear that the hash would be useful for your needs given that it is not guaranteed to be unique.

If you need a unique ID for some specific dispatch or on-chain operation, then the extrinsic hash is NOT what you are looking for. Even if extrinsic hashes were unique, there are many other ways to dispatch a call within the same extrinsic execution. Utility.batch would be one way, smart-contract execution would be another. There could be many more as the world of pallets are expanded.

If you need a unique ID, it's best to make it yourself.

Sushisource commented 4 years ago

@gavofyork Thanks for the info. I understand the non-uniqueness claim now (although we have reaping handled differently in our system so I'm not sure it applies, but I'd still rather follow best practices).

A bigger question I'm curious about here is what substrate implementers are expected to use for a unique transaction identifier. Is the expectation simply that, if you have a need for one, you should create your own scheme? Is there a more canonical way to obtain a unique transaction identifier? It's not really that I'm looking to identify a specific operation, I'm just looking to identify the initiating extrinsic/transaction. IE: I don't care about tracking Calls, but I do care about tracking Extrinsics - namely a unique ID per "thing that was submitted to the chain", which is my understanding of an extrinsic.

This rings as pretty important to me to call out in the docs. We proceeded under the assumption that the extrinsic hash was unique as this is lightly implied in a handful of places. For example in this rpc api here -- or in polkascan lookups which link by extrinsic hash: https://polkascan.io/pre/kusama/extrinsic/0x7cd7c1b59575c0a0920cd748dccd48d667933a29c33c7b052c6038fa6276bf21

Thanks again!