Closed tomaka closed 9 months ago
I've found a pretty big issue with this proposal, which is that the storageTrieRootHash
value that can be queried can't be trusted, and thus that the genesis hash of the chain can't be trusted.
I initially just wanted to add a note in the RFC saying that the calculated genesis hash must not be used for purposes other than determining the networking protocol names.
Unfortunately, the JSON-RPC API has functions to query the genesis hash, and so if a light client connects to a parachain in a chain-spec-less way (through just a relay chain and para_id
), it has no way to implement the JSON-RPC API properly.
As a side note, if you connect in a chain-spec-less way the JSON-RPC API can't be fully implemented anyway, as the chainSpec_unstable_properties and chainSpec_unstable_chainName functions can't be implemented. But that's already covered by the JSON-RPC spec, which allows some namespaces to not be implemented. In this case, you wouldn't implement the functions in the chainSpec
namespace.
We can maybe simply remove the JSON-RPC functions that allow querying the genesis hash, except that when you submit a transaction you need to include the genesis hash in it.
I can see two ways of solving this problem, I'm not sure right now which way is best:
Give up for now on the idea of connecting to a parachain without a chain spec. Remove the stateTrieRootHash
and forkId
field from the new networking protocol, as they would be loaded from the chain spec. The RFC would simply make it possible to find the bootnodes of a parachain, but not to connect to said parachain without a chain spec.
Double down on the idea of connecting to a parachain without a chain spec, by removing the JSON-RPC functions that allow knowing the genesis hash, and instead JSON-RPC clients should read System::BlockHash[0]
(or alternatively introduce a new item in the runtime) in order to know the genesis hash. Given that the runtime verifies the genesis hash when submitting a transaction, we know that this genesis hash is always available in storage and just needs to be queried in a way or the other.
I ended up going for the first option, but I have also opened https://github.com/paritytech/json-rpc-interface-spec/pull/61 in order to make it future-proof.
Give up for now on the idea of connecting to a parachain without a chain spec. Remove the stateTrieRootHash and forkId field from the new networking protocol, as they would be loaded from the chain spec. The RFC would simply make it possible to find the bootnodes of a parachain, but not to connect to said parachain without a chain spec.
It seems that light clients will always need some form of chain spec, but that it only needs a few piecse of information:
Given the weakly synchronous model it would also make sense for light clients to have additional information such as a recent (~1 month) finalized block hash, for maximum deterrence of long range attacks.
It seems that light clients will always need some form of chain spec, but that it only needs a few piecse of information:
Para-ID Genesis Hash Fork ID
Also needed is the properties
field (which PolkadotJS uses to know the token name and number of decimals (I've always thought that this should be moved on chain)).
Given the weakly synchronous model it would also make sense for light clients to have additional information such as a recent (~1 month) finalized block hash, for maximum deterrence of long range attacks.
I don't understand why? Isn't it enough to look into the relay chain which parablock is finalized?
Properties could definitely be placed in on-chain metadata. Para ID could too, no?
Para ID could too, no?
The light client looks into the relay chain (and thus needs to know the para ID) in order to know which of the parachain blocks is canonical. But if we put the para ID on chain, then in order to read the para ID from the storage, the light client first needs to known which parachain block is canonical. In other words it's a chicken and egg problem.
I don't understand why? Isn't it enough to look into the relay chain which parablock is finalized?
Technically, validator sets can "go back in time" after their stake is withdrawn and create an alternative chain. This puts an expiration date on finality's trustworthiness, and is the case in all PoS blockchains.
Most of the time this is not a problem in practice, but in theory the correct thing to do is make sure that the node has some reference block for finality (which it either gets by synchronizing at least once every month, or has hardcoded in its chain spec).
Earlier I said it'd be a parachain block hash, but it could just as well be a relay chain block hash to use as a starting point for finality.
I suppose we can agree that light clients need some minimal chain spec?
Earlier I said it'd be a parachain block hash, but it could just as well be a relay chain block hash to use as a starting point for finality.
That's what I'm currently going for in smoldot. It seems easier to me to update the relay chain specification with an up-to-date block from time to time, rather than update the chain specs of every parachains.
Substrate-connect hardcodes the 4 relay chains (Polkadot, Kusama, Westend, Rococo), and is meant to publish a new version at a regular interval containing an up-to-date chain spec for these four relay chains (although the publishing doesn't always happen in practice because it's currently a semi-manual process and the maintainers forget). With this solution, dApps/uApps don't need to deal with this problem.
I suppose we can agree that light clients need some minimal chain spec?
* Para ID * Genesis Hash * Relay Chain Spec * Properties (could be moved on-chain for later) * Finalized relay-chain block hash (nice-to-have)
The finalized relay chain block hash would be contained the relay chain spec, and you're missing the forkId
, but otherwise yes.
Could we not store the genesis_hash
in the response
as well? We only need it to connect to the networking protocols. The genesis_hash
is case of FRAME
is stored on chain and we can fetch it from there and ensure that it is the one we got in the initial response
.
Could we not store the
genesis_hash
in theresponse
as well? We only need it to connect to the networking protocols. Thegenesis_hash
is case ofFRAME
is stored on chain and we can fetch it from there and ensure that it is the one we got in the initialresponse
.
What you suggest is correct, but in practice it's very complicated to implement.
The problem is: what to do if multiple parachain bootnodes return multiple different genesis_hash
es?
It is in principle possible to write code so that it uses a separate networking protocol name for each parachain bootnode, which avoids the problem, but that's pretty complicated.
The way I was going with this RFC is to start with a minimal version, and then maybe add the genesis_hash
and fork_id
to the response later, but I'm not super confident in that decision and we could indeed maybe just add them now even if smoldot isn't capable of making use of them just yet.
Also, while the genesis_hash
can be verified at some point, the fork_id
can't.
This means that it's possible for multiple different parachain bootnodes to return multiple different fork_id
s, and to all be successfully reachable through their own individual fork_id
.
This complicates the code, but isn't a problem by itself. What is a problem, however, is which fork_id
do you use when you discover the other parachain nodes through Kademlia? The answer is obviously to use the same fork_id
as the node that was queried, but since the same node can be discovered multiple times, that means that the same node can have multiple different fork_id
s, and it basically becomes a mess.
I've pushed another commit that adds back genesis_hash
and fork_id
, as the cost of adding them seems minimal to me, and adding them now removes the cost of going through another RFC and implementation+upgrade phase in the future (in particular, parachain builders can be pretty slow to upgrade their Substrate version).
I would like to submit this for voting, but there's still the unresolved question of what to do when it comes to obtaining randomness. Should we use the BabeApi_currentEpoch
and BabeApi_nextEpoch
runtime APIs (and later their Sassafras equivalents), as the RFC proposes? Or should it be separate new functions?
I would 'aye' this in its current state
/rfc-propose
Hey @tomaka, here is a link you can use to create the referendum aiming to approve this RFC number 0008.
It is based on commit hash d8298f6dbd96da490047e27634e89cc2a87f7251.
The proposed remark text is: RFC_APPROVE(0008,6978a0b277edba30e5c5a8d955554d842eede08cb180fea640dc7bf9b46f982c)
.
I can't seem to be able to create a referendum ("bad origin" error), guessing it's because I'm not in the fellowship.
I wouldn't mind some help or guidance with this administrative process
Just wanted to repeat that I can't find any proper documentation or explanation about how to proceed, and this PR will be stuck for as long as nobody helps me with this bureaucracy.
/rfc help
The RFC action aims to help with the creation of on-chain RFC referenda and with handling the RFC PRs.
The main commands are /rfc propose
and /rfc process
.
See usage instructions.
/rfc process 0x8bfe18c1c9c892515872a551b9f2f71216a522e2bc1a81e26a2d03e167208aa4
Unable to find the referendum confirm event in the given block.
/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264
@tomaka Handling the RFC command failed :( You can open an issue here.
/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264
@rzadp Handling the RFC command failed :( You can open an issue here.
/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264
The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.
The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.
The bot seems to be missing permissions to merge the PR, although I cannot find out what exactly is missing and what needs to be changed. There were no changes to the bot since the last time it merged a PR.
/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264
The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.
/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264
/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264
The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.
/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264
The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.
The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.
/rfc process 0x39fbc57d047c71f553aa42824599a7686aea5c9aab4111f6b836d35d3d058162
Unable to find the referendum confirm event in the given block.
/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264
The on-chain referendum has approved the RFC.
Rendered.