Open theo-zil opened 1 year ago
Upon further discussion with @DrZoltanFazekas, the shared PoS security model, which requires shards to be validated by a majority of the network, makes me question my assumptions about how exactly shards should be run. Specifically, regarding question 1 above, "what is the communication method": if we assume separate nodes/clients, that's great for PoA, but even just 100 PoS shards would become extremely cumbersome if validators need to run 100 different clients.
We'd need to find a way to run as many nodes as possible concurrently and efficiently, so that validators can participate in as many shards as possible. That probably means running them in-process, for starters, and may need even more thought to make feasible.
This sounds like it'll likely add significant complexity over the PoA-only shards.
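To make "in-process" a bit more concrete, here is a minimal sketch assuming each shard's consensus loop can be expressed as a cooperative task. `run_shard` and `run_all` are hypothetical names, and `asyncio.sleep(0)` is only a stand-in for the real network and consensus I/O; this is an illustration of the shape, not a proposal for the actual node architecture.

```python
import asyncio
from typing import List, Tuple


async def run_shard(shard_id: int, slots: int, log: List[Tuple[int, int]]) -> None:
    """Hypothetical per-shard loop: one iteration per slot."""
    for slot in range(slots):
        # Stand-in for awaiting network messages / votes; yields to other shards.
        await asyncio.sleep(0)
        log.append((shard_id, slot))


async def run_all(num_shards: int, slots: int) -> List[Tuple[int, int]]:
    """Run every shard loop concurrently inside a single process."""
    log: List[Tuple[int, int]] = []
    await asyncio.gather(*(run_shard(i, slots, log) for i in range(num_shards)))
    return log


if __name__ == "__main__":
    events = asyncio.run(run_all(num_shards=100, slots=2))
    print(len(events), "slot events processed in one process")
```

Whether one process can actually keep up with 100 shards depends on per-shard CPU and bandwidth cost, which this sketch does not model.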
Great summary above @theo-zil
I have a comment on "Connecting to the Main Shard itself may or may not be optional":
I assume connecting to the main shard will be mandatory in order to retrieve the validator sets of the bridged-from shards, which is necessary to check whether the block headers of those shards are valid (i.e. co-signed by 2/3 of those shards' validators).
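As a rough illustration of that check, here is a sketch assuming the validator set has already been fetched from the main shard and that the signatures on the header have already been verified individually. `header_has_quorum` is a hypothetical helper. It counts validators equally and uses "at least 2/3" as the threshold; a real implementation may weight by stake or require strictly more than 2/3.

```python
from typing import Iterable, Set


def header_has_quorum(signer_ids: Iterable[str], validator_set: Set[str]) -> bool:
    """Return True if at least 2/3 of the shard's validators co-signed the header.

    Assumes signer_ids only contains identities whose signatures were
    already verified; signers outside the validator set are ignored.
    """
    if not validator_set:
        return False
    valid_signers = set(signer_ids) & validator_set
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return 3 * len(valid_signers) >= 2 * len(validator_set)


if __name__ == "__main__":
    validators = {"v1", "v2", "v3"}  # hypothetical set retrieved from the main shard
    print(header_has_quorum(["v1", "v2"], validators))
    print(header_has_quorum(["v1", "unknown"], validators))
```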
That's a great point, edited my summary.
1.a is a tradeoff between latency of cross-shard messages and liveness of the shard's consensus, as mentioned in the RFC. By introducing a delay (i.e. the block proposer on shard S includes a block h' from shard S' that was finalized a few slots ago instead of the "newest" block h'' that was finalized in the last slot of shard S') we can give all validators more time to get notified about block h' and vote on the proposal referring to it on shard S.
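The delay can be sketched as a simple selection rule. This assumes the proposer on shard S keeps an ordered list of the blocks it has seen finalized on shard S', oldest first; `pick_reference_block` and `delay_slots` are hypothetical names, not anything from the RFC.

```python
from typing import List, Optional


def pick_reference_block(finalized: List[str], delay_slots: int) -> Optional[str]:
    """Pick the S' block to reference in a proposal on S.

    finalized: block ids finalized on S', oldest first, one per slot.
    delay_slots: how many slots behind the newest finalized block to stay.
    Returns None if not enough blocks have been finalized yet.
    """
    if delay_slots < 0:
        raise ValueError("delay_slots must be non-negative")
    if len(finalized) <= delay_slots:
        return None
    return finalized[-1 - delay_slots]


if __name__ == "__main__":
    seen = ["h", "h'", "h''"]
    # delay 0 references the newest block: lowest latency, highest liveness risk.
    print(pick_reference_block(seen, 0))
    # delay 1 references an older block that more validators have likely seen.
    print(pick_reference_block(seen, 1))
```

A larger `delay_slots` adds that many slots of latency to every cross-shard message, which is the tradeoff described above.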
If validators do not run a full node of the bridged-over x-shard but only a light client which connects to an untrusted full node, the notifications about the subscribed transactions must carry Merkle proofs of those transactions. This is part of the light client protocols because light clients only know the block headers, which contain a Merkle root, and must verify that the transactions the full node notifies them about are indeed included in a block, based on the Merkle root in that block and the Merkle proof received along with the transaction.
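For reference, the inclusion check a light client performs looks roughly like this. The sketch assumes a plain binary Merkle tree over SHA-256 leaf hashes, with the proof given as sibling hashes from leaf to root; the actual tree layout and hash function of a given shard may well differ (e.g. a Merkle Patricia trie), so treat `verify_merkle_proof` as illustrative only.

```python
import hashlib
from typing import List, Tuple


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def verify_merkle_proof(tx: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Check that tx is included under root.

    proof: (sibling_hash, sibling_is_left) pairs, ordered from leaf to root.
    root: the Merkle root taken from a block header the light client trusts.
    """
    node = sha256(tx)
    for sibling, sibling_is_left in proof:
        # Concatenation order matters: the left child always goes first.
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root


if __name__ == "__main__":
    # Build a 4-leaf tree by hand and prove inclusion of the third transaction.
    leaves = [sha256(t) for t in (b"tx0", b"tx1", b"tx2", b"tx3")]
    n01 = sha256(leaves[0] + leaves[1])
    n23 = sha256(leaves[2] + leaves[3])
    root = sha256(n01 + n23)
    proof = [(leaves[3], False), (n01, True)]
    print(verify_merkle_proof(b"tx2", proof, root))
```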
Indeed that's true, the light client won't trust the full nodes. Your x-shard validator will trust the light client so your x-shard can still eschew all proofs, you just gotta make sure the light client verifies them.
Note that another option is running a light node, which effectively follows consensus but doesn't participate in voting and doesn't store any history except block headers. It will directly observe the votes for all new blocks, hold a mempool of transactions, and observe those transactions getting included into blocks, so it doesn't need separate proofs. In our case that will be quite convenient since x-shards only care about transactions in the latest block, so a node that can receive all new blocks from the p2p network and notify the validator of new transactions, without using any storage or needing any stake/authority, will work perfectly fine. The advantage is that it avoids having to configure the "light client" to connect to specific full nodes and trust them: the light node can observe the consensus happening directly, so no individual node can censor transactions from it.
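Roughly what I have in mind for the light node, as a sketch: it receives each finalized block from the p2p layer, keeps only the header, and forwards matching transactions to whoever subscribed. `Block`, `LightNode`, and the dict-shaped transactions are all hypothetical; vote verification and the mempool are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Tx = Dict[str, str]


@dataclass
class Block:
    height: int
    header_hash: str
    transactions: List[Tx]


class LightNode:
    """Follows finalized blocks, stores headers only, notifies subscribers."""

    def __init__(self) -> None:
        self.headers: List[Tuple[int, str]] = []  # the only state that is kept
        self._subs: List[Tuple[Callable[[Tx], bool], Callable[[int, Tx], None]]] = []

    def subscribe(self, predicate: Callable[[Tx], bool],
                  callback: Callable[[int, Tx], None]) -> None:
        self._subs.append((predicate, callback))

    def on_finalized_block(self, block: Block) -> None:
        # Assumes the caller already checked the votes for this block.
        self.headers.append((block.height, block.header_hash))
        for tx in block.transactions:
            for predicate, callback in self._subs:
                if predicate(tx):
                    callback(block.height, tx)
        # Transactions are dropped here; nothing but the header is stored.


if __name__ == "__main__":
    node = LightNode()
    node.subscribe(lambda tx: tx.get("to") == "bridge",
                   lambda height, tx: print("notify validator:", height, tx["id"]))
    node.on_finalized_block(Block(7, "0xabc", [{"id": "t1", "to": "bridge"},
                                               {"id": "t2", "to": "other"}]))
```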
Ok, in my definition light client = lightweight node. It only keeps track of block headers but not the transactions in the blocks (otherwise it would be a full node and not light node). It needs to request the transactions (via subscriptions) from either a trusted full node without Merkle proofs or a trustless full node with Merkle proofs.
I'm not sure if a lightweight node maintaining the transaction pool of every bridged-over x-shard is light enough, or whether it's effectively already a full node, since it holds not only the block headers but the transactions too. Btw, a full node is not necessarily a validator, i.e. it does not have to participate in the consensus.
Bear in mind, FWIW, that not every X-Shard will be trustable, so just because your X-Shard says "transfer USDC$1m to Fred" doesn't mean you should do it. A corollary is that bridges between X-Shards are untrusted.
I don't anticipate PoS shared-security XShards being of much use. I would think most XShards will be secured by their own independent staking or PoA - I'll write an RFC on this, so I don't think many validators will be handling very many XShards at the same time - that said, validators will tend to want to do this up to their performance limit so as to claim the rewards for doing so.
Would be great to have some clarity on this because the way I see it, whether we want shared PoS or not might significantly affect the complexity required. If we go for independent security, and eschew shared PoS, then my list of tasks above should lead to a working deployable PoC and feels very achievable in maybe a couple of months or so.
Way I see it, which security model we use will depend more or less entirely on the business cases for our chain, so @rrw-zilliqa I assume you'd have the most visibility on what we really need here. Looking forward to reading your thoughts in the RFC.
Yes - sorry; you are quite right. Annoyingly, I am schmoozing for most of the rest of the day, but will write up ASAP.
The take-away will be:
So basically, the rules about validation for XShards are programmable. You could implement full security staking as @DrZoltanFazekas suggests, or PoA, or base the shard's security on how many clown costumes each validator owns...
(whilst killing a Wendigo, obv)
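One way to picture "programmable" validation rules is as a pluggable function that decides whether a set of signers is sufficient. This is a sketch under my own assumptions: `ValidationRule`, `poa_rule`, and `stake_rule` are hypothetical names, the 2/3 thresholds are placeholders, and nothing here reflects the actual XShard design.

```python
from typing import Callable, Dict, Set

# A rule takes the set of (already signature-checked) signers and says yes or no.
ValidationRule = Callable[[Set[str]], bool]


def poa_rule(authorities: Set[str]) -> ValidationRule:
    """PoA: at least 2/3 of a fixed authority set must sign (assumed threshold)."""
    def rule(signers: Set[str]) -> bool:
        return bool(authorities) and 3 * len(signers & authorities) >= 2 * len(authorities)
    return rule


def stake_rule(stakes: Dict[str, int]) -> ValidationRule:
    """Staking: signers must hold at least 2/3 of total stake (assumed threshold)."""
    total = sum(stakes.values())

    def rule(signers: Set[str]) -> bool:
        signed = sum(stakes.get(s, 0) for s in signers)
        return total > 0 and 3 * signed >= 2 * total
    return rule


def accept_block(signers: Set[str], rule: ValidationRule) -> bool:
    """The shard only ever calls the rule; how it decides is up to the shard's owner."""
    return rule(signers)


if __name__ == "__main__":
    print(accept_block({"a", "b"}, poa_rule({"a", "b", "c"})))
    print(accept_block({"b", "c"}, stake_rule({"a": 10, "b": 1, "c": 1})))
```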
The agreed sharding strategy is as follows:
There are three outstanding questions here:
At a high level, the following then needs to be implemented to enable at least PoA sharding:
Timeline: I estimate that at least the subscription mechanism should be completed in the near future, so we can start using an actual sharded architecture. On the face of it, my very rough estimate is that all four steps should be doable at least in a PoC shape before testnet in September, assuming we don't discover that we require significant additional complexity.