Store parachain bootnodes in relay chain DHT

tomaka commented 1 year ago

I've found a pretty big issue with this proposal, which is that the storageTrieRootHash value that can be queried can't be trusted, and thus that the genesis hash of the chain can't be trusted.

I initially just wanted to add a note in the RFC saying that the calculated genesis hash must not be used for purposes other than determining the networking protocol names. Unfortunately, the JSON-RPC API has functions to query the genesis hash, and so if a light client connects to a parachain in a chain-spec-less way (through just a relay chain and para_id), it has no way to implement the JSON-RPC API properly.

As a side note, if you connect in a chain-spec-less way the JSON-RPC API can't be fully implemented anyway, as the chainSpec_unstable_properties and chainSpec_unstable_chainName functions can't be implemented. But that's already covered by the JSON-RPC spec, which allows some namespaces to not be implemented. In this case, you wouldn't implement the functions in the chainSpec namespace.

We can maybe simply remove the JSON-RPC functions that allow querying the genesis hash, except that when you submit a transaction you need to include the genesis hash in it.

I can see two ways of solving this problem, I'm not sure right now which way is best:

Give up for now on the idea of connecting to a parachain without a chain spec. Remove the stateTrieRootHash and forkId field from the new networking protocol, as they would be loaded from the chain spec. The RFC would simply make it possible to find the bootnodes of a parachain, but not to connect to said parachain without a chain spec.
Double down on the idea of connecting to a parachain without a chain spec, by removing the JSON-RPC functions that allow knowing the genesis hash, and instead JSON-RPC clients should read System::BlockHash[0] (or alternatively introduce a new item in the runtime) in order to know the genesis hash. Given that the runtime verifies the genesis hash when submitting a transaction, we know that this genesis hash is always available in storage and just needs to be queried in a way or the other.

tomaka commented 1 year ago

I ended up going for the first option, but I have also opened https://github.com/paritytech/json-rpc-interface-spec/pull/61 in order to make it future-proof.

rphmeier commented 1 year ago

Give up for now on the idea of connecting to a parachain without a chain spec. Remove the stateTrieRootHash and forkId field from the new networking protocol, as they would be loaded from the chain spec. The RFC would simply make it possible to find the bootnodes of a parachain, but not to connect to said parachain without a chain spec.

It seems that light clients will always need some form of chain spec, but that it only needs a few piecse of information:

Para-ID
Genesis Hash
Fork ID

Given the weakly synchronous model it would also make sense for light clients to have additional information such as a recent (~1 month) finalized block hash, for maximum deterrence of long range attacks.

tomaka commented 1 year ago

It seems that light clients will always need some form of chain spec, but that it only needs a few piecse of information:

Para-ID Genesis Hash Fork ID

Also needed is the properties field (which PolkadotJS uses to know the token name and number of decimals (I've always thought that this should be moved on chain)).

Given the weakly synchronous model it would also make sense for light clients to have additional information such as a recent (~1 month) finalized block hash, for maximum deterrence of long range attacks.

I don't understand why? Isn't it enough to look into the relay chain which parablock is finalized?

gavofyork commented 1 year ago

Properties could definitely be placed in on-chain metadata. Para ID could too, no?

tomaka commented 1 year ago

Para ID could too, no?

The light client looks into the relay chain (and thus needs to know the para ID) in order to know which of the parachain blocks is canonical. But if we put the para ID on chain, then in order to read the para ID from the storage, the light client first needs to known which parachain block is canonical. In other words it's a chicken and egg problem.

rphmeier commented 1 year ago

I don't understand why? Isn't it enough to look into the relay chain which parablock is finalized?

Technically, validator sets can "go back in time" after their stake is withdrawn and create an alternative chain. This puts an expiration date on finality's trustworthiness, and is the case in all PoS blockchains.

Most of the time this is not a problem in practice, but in theory the correct thing to do is make sure that the node has some reference block for finality (which it either gets by synchronizing at least once every month, or has hardcoded in its chain spec).

Earlier I said it'd be a parachain block hash, but it could just as well be a relay chain block hash to use as a starting point for finality.

I suppose we can agree that light clients need some minimal chain spec?

Para ID
Genesis Hash
Relay Chain Spec
Properties (could be moved on-chain for later)
Finalized relay-chain block hash (nice-to-have)

tomaka commented 12 months ago

Earlier I said it'd be a parachain block hash, but it could just as well be a relay chain block hash to use as a starting point for finality.

That's what I'm currently going for in smoldot. It seems easier to me to update the relay chain specification with an up-to-date block from time to time, rather than update the chain specs of every parachains.

Substrate-connect hardcodes the 4 relay chains (Polkadot, Kusama, Westend, Rococo), and is meant to publish a new version at a regular interval containing an up-to-date chain spec for these four relay chains (although the publishing doesn't always happen in practice because it's currently a semi-manual process and the maintainers forget). With this solution, dApps/uApps don't need to deal with this problem.

I suppose we can agree that light clients need some minimal chain spec?
* Para ID

* Genesis Hash

* Relay Chain Spec

* Properties (could be moved on-chain for later)

* Finalized relay-chain block hash (nice-to-have)

The finalized relay chain block hash would be contained the relay chain spec, and you're missing the forkId, but otherwise yes.

bkchr commented 11 months ago

Could we not store the genesis_hash in the response as well? We only need it to connect to the networking protocols. The genesis_hash is case of FRAME is stored on chain and we can fetch it from there and ensure that it is the one we got in the initial response.

tomaka commented 11 months ago

Could we not store the genesis_hash in the response as well? We only need it to connect to the networking protocols. The genesis_hash is case of FRAME is stored on chain and we can fetch it from there and ensure that it is the one we got in the initial response.

What you suggest is correct, but in practice it's very complicated to implement.

The problem is: what to do if multiple parachain bootnodes return multiple different genesis_hashes? It is in principle possible to write code so that it uses a separate networking protocol name for each parachain bootnode, which avoids the problem, but that's pretty complicated.

The way I was going with this RFC is to start with a minimal version, and then maybe add the genesis_hash and fork_id to the response later, but I'm not super confident in that decision and we could indeed maybe just add them now even if smoldot isn't capable of making use of them just yet.

tomaka commented 11 months ago

Also, while the genesis_hash can be verified at some point, the fork_id can't.

This means that it's possible for multiple different parachain bootnodes to return multiple different fork_ids, and to all be successfully reachable through their own individual fork_id. This complicates the code, but isn't a problem by itself. What is a problem, however, is which fork_id do you use when you discover the other parachain nodes through Kademlia? The answer is obviously to use the same fork_id as the node that was queried, but since the same node can be discovered multiple times, that means that the same node can have multiple different fork_ids, and it basically becomes a mess.

tomaka commented 11 months ago

I've pushed another commit that adds back genesis_hash and fork_id, as the cost of adding them seems minimal to me, and adding them now removes the cost of going through another RFC and implementation+upgrade phase in the future (in particular, parachain builders can be pretty slow to upgrade their Substrate version).

I would like to submit this for voting, but there's still the unresolved question of what to do when it comes to obtaining randomness. Should we use the BabeApi_currentEpoch and BabeApi_nextEpoch runtime APIs (and later their Sassafras equivalents), as the RFC proposes? Or should it be separate new functions?

rphmeier commented 11 months ago

I would 'aye' this in its current state

tomaka commented 10 months ago

/rfc-propose

github-actions[bot] commented 10 months ago

Hey @tomaka, here is a link you can use to create the referendum aiming to approve this RFC number 0008.

Instructions

1. Open the [link](https://polkadot.js.org/apps/?rpc=wss%3A%2F%2Fpolkadot-collectives-rpc.polkadot.io#/extrinsics/decode/0x3d003e01015901000049015246435f415050524f564528303030382c36393738613062323737656462613330653563356138643935353535346438343265656465303863623138306665613634306463376266396234366639383263290100000000). 2. Switch to the `Submission` tab.

3. Adjust the transaction if needed (for example, the proposal Origin). 4. Submit the Transaction

It is based on commit hash d8298f6dbd96da490047e27634e89cc2a87f7251.

The proposed remark text is: RFC_APPROVE(0008,6978a0b277edba30e5c5a8d955554d842eede08cb180fea640dc7bf9b46f982c).

tomaka commented 10 months ago

I can't seem to be able to create a referendum ("bad origin" error), guessing it's because I'm not in the fellowship.

tomaka commented 10 months ago

I wouldn't mind some help or guidance with this administrative process

tomaka commented 10 months ago

Just wanted to repeat that I can't find any proper documentation or explanation about how to proceed, and this PR will be stuck for as long as nobody helps me with this bureaucracy.

tomaka commented 9 months ago

/rfc help

github-actions[bot] commented 9 months ago

The RFC action aims to help with the creation of on-chain RFC referenda and with handling the RFC PRs.

The main commands are /rfc propose and /rfc process.

See usage instructions.

tomaka commented 9 months ago

/rfc process 0x8bfe18c1c9c892515872a551b9f2f71216a522e2bc1a81e26a2d03e167208aa4

github-actions[bot] commented 9 months ago

Unable to find the referendum confirm event in the given block.

Instructions to find the block hash

Here is one way to find the corresponding block hash. 1. Open the referendum on Subsquare. 2. Switch to the `Timeline` tab.

--- 3. Go to the details of the `Confirmed` event.

--- 2. Go to the details of the block containing that event.

--- 2. Here you can find the block hash.

tomaka commented 9 months ago

/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264

github-actions[bot] commented 9 months ago

@tomaka Handling the RFC command failed :( You can open an issue here.

rzadp commented 9 months ago

/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264

github-actions[bot] commented 9 months ago

@rzadp Handling the RFC command failed :( You can open an issue here.

tomaka commented 9 months ago

/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264

github-actions[bot] commented 9 months ago

The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.

rzadp commented 9 months ago

The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.

The bot seems to be missing permissions to merge the PR, although I cannot find out what exactly is missing and what needs to be changed. There were no changes to the bot since the last time it merged a PR.

rzadp commented 9 months ago

/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264

github-actions[bot] commented 9 months ago

The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.

rzadp commented 9 months ago

/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264

bkchr commented 9 months ago

/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264

paritytech-rfc-bot[bot] commented 9 months ago

The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.

bkchr commented 9 months ago

/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264

paritytech-rfc-bot[bot] commented 9 months ago

The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.

paritytech-rfc-bot[bot] commented 9 months ago

The on-chain referendum has approved the RFC, however I was not able to merge the PR automatically.

rzadp commented 9 months ago

/rfc process 0x39fbc57d047c71f553aa42824599a7686aea5c9aab4111f6b836d35d3d058162

paritytech-rfc-bot[bot] commented 9 months ago

Unable to find the referendum confirm event in the given block.

Instructions to find the block hash

Here is one way to find the corresponding block hash. 1. Open the referendum on Subsquare. 2. Switch to the `Timeline` tab.

--- 3. Go to the details of the `Confirmed` event.

--- 2. Go to the details of the block containing that event.

--- 2. Here you can find the block hash.

rzadp commented 9 months ago

/rfc process 0x56bfa1b49e69991617d8aed0c880101f5506e0595b06a6b97b9d5a6cc014b264

paritytech-rfc-bot[bot] commented 9 months ago

The on-chain referendum has approved the RFC.

polkadot-fellows / RFCs

Store parachain bootnodes in relay chain DHT #8