eigerco / polka-storage

The Polka Storage Parachain Project
https://eigerco.github.io/polka-storage-book
Apache License 2.0

RFC: Collator Power Economy #61

Closed th7nder closed 5 months ago

th7nder commented 5 months ago

Collator Power Economy

Proposal

Collators are separate from Storage Providers. They are mostly sponsored by Polkadot's treasury and governed by the Fellowship. There is a fixed list of Invulnerables selected by governance and, on top of that, a fraction of Collator slots that anyone can fill by staking DOT and running a Collator node. This is no different from other System Parachains (e.g. AssetHub, BridgeHub). As an extension to that, Storage Providers can also stake their DOT on Collators to help them win a slot in a session. Other network participants should not be able to influence the Collator selection.
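
The selection rule proposed above can be sketched roughly as follows. This is a minimal Python illustration, not the parachain's implementation: the `Candidate` type, the `select_collators` function and the `desired_candidates` parameter are hypothetical names, and the sketch assumes that a candidate's weight is its own bond plus whatever registered Storage Providers have staked on it, with stake from any other account ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A permissionless collator candidate (hypothetical type for illustration)."""
    account: str
    bond: int  # DOT bonded by the candidate itself
    backing: dict = field(default_factory=dict)  # backer account -> staked DOT


def select_collators(invulnerables, candidates, storage_providers, desired_candidates):
    """Return the collator set for a session.

    Invulnerables are always included. The remaining slots go to the
    candidates with the highest total stake, where only backing from
    registered Storage Providers is counted (assumption of this sketch).
    """
    def total_stake(c):
        sp_backing = sum(
            amount for backer, amount in c.backing.items()
            if backer in storage_providers
        )
        return c.bond + sp_backing

    eligible = [c for c in candidates if c.account not in invulnerables]
    # Sort by stake descending; break ties by account name for determinism.
    ranked = sorted(eligible, key=lambda c: (-total_stake(c), c.account))
    elected = [c.account for c in ranked[:desired_candidates]]
    return list(invulnerables) + elected
```

In this sketch, stake placed by an account that is not a Storage Provider simply does not count, which is one way to express that other participants should not influence the selection.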

The Collator Pool should approximately have:

Context

Current System Parachain Collators are sponsored by the treasury and Polkadot's governance, as there is little to no economic incentive to run a Collator node: there is no inflationary reward system. They always run at a loss, because they're expensive and resource-intensive, and the fees earned from collating blocks are not substantial. Our system parachain is no exception. In addition to the Collators sponsored by the Polkadot Treasury, anyone can become a Collator by bonding funds to participate in the Candidates election.

The usual reasons for setting up a non-treasury-backed Collator node still apply, so we can imagine a scenario where a Storage Provider wants to have a say in whether their transactions are included in blocks. As running a Storage Provider is expensive enough already, they should be able to stake their funds to support a Collator they trust. The main incentive is not earning rewards from block candidate production, but making sure there is a Collator on their side which is not censoring transactions.

Alternatives

NOTE: each of these approaches assumes that the amount of storage space a Storage Provider provides DOES NOT directly affect staking or collator selection (in contrast to Crust's GPoS and Filecoin's consensus). It can affect it indirectly, e.g. when a Storage Provider has lots of deals, hence lots of tokens, which they can stake on a Collator.

1. There are Invulnerables, but only Collators can stake tokens; they cannot be nominated.

If Storage Providers weren't able to stake tokens and didn't trust the network's Invulnerables, they'd have no other option but to run their own Collator Node. There is a limited number of slots for Collators, so Storage Providers would compete with each other and need to stake lots of tokens for little to no guarantee.

2. There are no Invulnerables, collators are run by the community.

It would be charity work; no one would want to become a Collator in this model. If Storage Providers want to earn money, they'd need to spin up Collator Nodes themselves to make sure the network is functioning, with all of the drawbacks of alternative 3.

3. There are no Invulnerables, each Storage Provider runs as Collator.

There is no economic incentive for a Storage Provider's Collator Node to be honest and include everyone's transactions. Running a Collator Node is expensive and the rewards for that are minimal, so each Storage Provider would only include their own transactions. The biggest Storage Providers would centralize the power.

FAQ

1. Who is a Collator?

Collators are network participants that produce block candidates to be backed by validators. Collators run both a relay chain full node and a parachain full node. They do not concern themselves with finality, i.e. the decision whether a block will definitely be included in the blockchain. Their only role is to produce parachain block candidates. The security is delegated to relay chain validators. Producing parachain block candidates means gathering the transactions gossiped across the parachain nodes and collating them into a block.

The decision about which collator produces a block and communicates with the relay chain boils down to the block authoring algorithm. Examples of block authoring algorithms are BABE and AuRa. They address the distributed consensus problem of selecting the node that will produce a given block; each block may have a different author. However, block authoring algorithms require a set of validators (collators) to be known before they start the election. There needs to be a pallet that feeds that set into the algorithm so that it can select the block producers.

Those algorithms do NOT detect misbehaviour in the network, e.g. an attempt to include a malicious block containing a double spend. They cannot know that, as they only select a block producer. The selected block producer (Collator) forwards the block to the relay chain's validators for validation.

A validator validates a block by running the parachain's state transition function (runtime) on its own, executing the block's transactions and confirming that the business logic contained in the block is sound, valid and consistent with the Proof of Validity. If the block is correct, it must then be backed by a majority of validators and approved. If the block is incorrect, it won't pass the validators' backing and approval and won't be included as an available parablock. It's simply discarded, so the next block won't use it as a parent.

If relay chain validators back a malicious block, they are slashed by the majority of validators, losing part of their stake and eventually being kicked out.

We'd like to prevent DoS attacks on the network in which the parachain cannot progress or censors transactions because some Collators are constantly elected thanks to their high stakes. Other System Parachains solve this with a list of governance-elected Collators, the Invulnerables, which are trusted to behave correctly. A different mechanism, used by some parachains with an inflationary reward system, is indirect but effective: they keep track of when a Collator last authored a block, and if that did not happen in the last x hours, the Collator is kicked out and 1% of its DOT is slashed.
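
The inactivity check described above could look roughly like this. It is a minimal sketch under stated assumptions: `kick_inactive`, its parameters and the dictionary layout are hypothetical, time is measured in plain seconds, and a Collator that has never authored a block is treated as idle. Invulnerables are not modelled here.

```python
def kick_inactive(collators, last_authored, now, max_idle_seconds, slash_percent=1):
    """Split collators into those kept and those kicked for inactivity.

    collators: dict of account -> staked amount
    last_authored: dict of account -> timestamp (seconds) of last authored block
    Returns (kept, slashed): kept maps account -> stake, slashed maps each
    kicked account to the amount taken from its stake.
    """
    kept, slashed = {}, {}
    for account, stake in collators.items():
        last = last_authored.get(account)
        # A collator that never authored a block is treated as idle.
        idle = last is None or now - last > max_idle_seconds
        if idle:
            slashed[account] = stake * slash_percent // 100
        else:
            kept[account] = stake
    return kept, slashed
```

For example, with `max_idle_seconds=3600` a Collator whose last block is 1000 seconds old stays in the set, while one with no recorded block is removed and loses 1% of its stake.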

2. Why is running too many collators a bad idea?

  1. To produce a block candidate, each Collator needs to gather transactions. If there are a lot of transactions and the network is congested, many Collators may not receive some of them. They would then disagree on what the chain should look like, which would slow the network down until it settles on the valid chain after receiving blocks from the relay chain. This can be mitigated with collator selection logic: at any point in time there may be a set of Candidates from which a Collator set is selected for each session, which reduces the number of active Collators.

  2. We assume there will be lots of Storage Providers in the parachain, so there would be lots of Collators. If each Storage Provider ran a Collator, they would need very powerful machines, which might be a blocker, as Collators on their own are resource intensive. It would complicate things for our clients; to handle the load they might need to run a fleet of nodes, not just one. We cannot know at this stage [7, 8].

3. What's the difference between BABE and AuRa?

Both BABE and AuRa are block authoring algorithms.

AuRa is an algorithm that has a set of validators and selects them in round-robin fashion. The set of validators must be known before each session starts; time is then divided into slots, and a block producer is selected for each slot.
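
The round-robin rule can be written in a couple of lines. This is a minimal sketch, not the AuRa pallet itself: the function names `slot_at` and `aura_author` are made up for illustration, and the authority set is assumed to be a plain ordered list fixed for the session.

```python
def slot_at(timestamp_ms, slot_duration_ms):
    """Map a timestamp to a slot number by dividing time into fixed slots."""
    return timestamp_ms // slot_duration_ms


def aura_author(authorities, slot):
    """Return the authority expected to author the block in the given slot."""
    if not authorities:
        raise ValueError("authority set must not be empty")
    # Round-robin: the slot number wraps around the ordered authority list.
    return authorities[slot % len(authorities)]
```

With three authorities, slots 0, 1, 2, 3 map to the first, second, third and again the first authority, so every member of the set gets an equal share of slots.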

BABE also has a set of validators, but additionally each validator is assigned a weight, and this weight is combined with a VRF to decide whether that validator produces a block.
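
A rough sketch of such a weighted lottery is shown below. Several things here are assumptions made for illustration only: the threshold formula `1 - (1 - c) ** (weight / total_weight)` follows my understanding of BABE's primary slot rule, `c` is treated as a tunable constant, and `pseudo_vrf` is a keyed hash standing in for a real VRF (it yields no publicly verifiable proof). All function names are hypothetical.

```python
import hashlib


def primary_threshold(c, weight, total_weight):
    """Probability that an authority with the given weight wins a slot."""
    return 1.0 - (1.0 - c) ** (weight / total_weight)


def pseudo_vrf(secret, epoch_randomness, slot):
    """Stand-in for a VRF output, mapped to a float in [0, 1).

    NOT a real VRF: a keyed hash gives a secret-dependent value, but it
    produces no proof that others can verify against a public key.
    """
    data = epoch_randomness + slot.to_bytes(8, "little")
    digest = hashlib.blake2b(data, key=secret, digest_size=8).digest()
    return int.from_bytes(digest, "little") / 2**64


def may_author(secret, epoch_randomness, slot, c, weight, total_weight):
    """True if this authority's lottery ticket for the slot is a winner."""
    ticket = pseudo_vrf(secret, epoch_randomness, slot)
    return ticket < primary_threshold(c, weight, total_weight)
```

Unlike round-robin, each authority evaluates its own ticket privately, so a slot can have zero, one or several winners, and a higher weight only raises the chance of winning.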

Overall both work; BABE is advertised as more scalable and secure. However, we get security and scalability from the relay chain. Honestly, it's not a blocker, as they can be swapped at any time and are considered separate from the rest of the mechanisms. We can go with AuRa (as AssetHub, another system parachain, does) and replace it later.

4. What does the --collator option on a parachain node do?

--collator implies --validator. When we run a node in --collator mode, its role is set to authority. It runs a full node for the relay chain and a full node for the parachain. When the node runs as an authority (collator), it runs a Collator Service, which works as a proxy between the parachain and the relay chain.

5. Why should only Storage Providers be able to stake tokens on Collators?

Storage Providers may have a reason to stake on Collators: making sure their Storage Deals are included. There is no economic incentive to do that, as the rewards from blocks are not substantial. The only reason other network participants would want to nominate Collators, rather than run their own node, is to introduce some kind of malicious Collator. Allowing that doesn't make much sense, and restricting staking to Storage Providers is an additional safeguard.

cernicc commented 5 months ago

I think that mistake is thinking that the amount of storage should not affect the result of consensus. The biggest players of the systems should be the ones being constantly checked. As an addition there should be some number of small guy in the group so that they have incentive to not lie because the small guys would always validate their actions. By that I think we could say that each storage provider is also a collator. The specific collator selection could be made on some specific interval. Based on the size of the storage and quality of it provided to the network in previous iterations.

The idea behind adding a set of smaller guys in the set is because you assume that there is a possible way for the big guys to communicate in advance to lie and cheat the system. If you add some set of random smallest guys, the big guys action is not impactful to the system, In our case that the big guys are the ones bringing largest amounts of storage. You don't want to incentivize bringing no storage and a lot of funds. With that approach you get a slow and unreliable storage network.

You basically need storage power because you want an incentive for storage providers to provide as much storage as possible and that storage needs to be reliable and fast.

It's great that we are building the greatest storage marketplace on polka, but it would be even better if we could create a marketplace that knows how to score the performance of the provided storage and, as a result, of the corresponding providers 😄 You want to get rid of the bad storage in the network if you want the network to have a good reputation and be used 😄

If you link the power to the funds, you can also make it so that each of the providers can be a collator. And if the biggest are always checked, their storage quality and price/size should be the best; if not, they won't be the biggest for long 😄

Hmm. I think I finally understand Polkadot. Relay chain is only validating if the collators of some parachain are trying to move their internal state truthfully

EDIT: Sorry. Removed those --- :D

serg-temchenko commented 5 months ago

...A decision about which collator produces a block and communicates with the relay chain boils down to block authoring algorithm. Example block authoring algorithms are: BABE, AuRa....

Again, we are coming back to the same confusion. As described in the official documentation:

BABE (Blind Assignment for Blockchain Extension) is the block production mechanism that runs between the validator nodes and determines the authors of new blocks.

The important part here is "validator nodes", which are on the relay chain side. We need to disregard BABE in the context of the collator selection mechanism, since BABE is a block production mechanism used on the Polkadot Relay Chain to determine which validators are responsible for producing new blocks.

serg-temchenko commented 5 months ago

Aura also is something we can't use. Since Aura uses a round-robin mechanism for block production, where collators produce blocks in a fixed cyclic order, which means equal participation among collators, which is not what we need. We need custom consensus, which might combine pieces of any of them, but follow our custom logic.

serg-temchenko commented 5 months ago

There is a way to slash validators, as we'd like to prevent DoS attacks on the network, without slashing

Super confused with this "There is a way to slash validators ... without slashing". Also, how we, parachain can slash validator?

serg-temchenko commented 5 months ago

IMO we need to get back to this section again. It references this Substrate section and this collator selection pallet, which we can adjust to our needs and use (this pallet was actually used in our POC with some adjustments). This pallet is also recommended by Substrate:

Stake voting The Cumulus collator-selection pallet is a practical example on implementing stake voting to select collators.

We can't use AuRa or BABE, since they don't fit our needs and we can't adjust them to fit.

serg-temchenko commented 5 months ago

I agree that the chances are low for a Storage Provider to also be a Collator. You reminded me that it would require running the full relay chain, which could amount to terabytes of data.

th7nder commented 5 months ago

Aura also is something we can't use. Since Aura uses a round-robin mechanism for block production, where collators produce blocks in a fixed cyclic order, which means equal participation among collators, which is not what we need. We need custom consensus, which might combine pieces of any of them, but follow our custom logic.

Actually, we can. It's the way of the parachains:

The practical example linked (collator-selection-pallet), works like that. On each session collator-selection-pallet selects top 5 most staking collators, and then puts them into AuRa, which produces blocks in round-robin fashion. It doesn't matter much whether we use AuRa or BABE for this, they work similarly.
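
As a rough illustration of that two-stage flow (a sketch only, not the actual collator-selection pallet or AuRa code): one function picks the session's authority set by stake, and another hands out slots round-robin. The names `session_authorities` and `author_schedule`, the stake dictionary, and the default of 5 are assumptions for the example.

```python
def session_authorities(candidate_stakes, max_candidates=5):
    """Pick the authority set for one session: the top stakers."""
    # Sort by stake descending; break ties by account name for determinism.
    ranked = sorted(candidate_stakes, key=lambda a: (-candidate_stakes[a], a))
    return ranked[:max_candidates]


def author_schedule(authorities, first_slot, slots_per_session):
    """Round-robin authorship over the session's slots (AuRa-style)."""
    return [
        authorities[slot % len(authorities)]
        for slot in range(first_slot, first_slot + slots_per_session)
    ]
```

Stake only decides who enters the set; once inside, every selected collator gets the same share of slots.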

BTW. I'm in the process of updating the proposal, as I found a groundbreaking RFC which clarifies a lot. https://polkadot-fellows.github.io/RFCs/approved/0007-system-collator-selection.html https://polkadot.polkassembly.io/referenda/288

Basically, Collators in System Parachains are not economically incentivized to run (they always run at a loss), so the Polkadot Fellowship sponsors them.

th7nder commented 5 months ago

Super confused with this "There is a way to slash validators ... without slashing". Also, how we, parachain can slash validator?

Good catch, some remnants from my dirty notes. Disregard this.

serg-temchenko commented 5 months ago

Aura also is something we can't use. Since Aura uses a round-robin mechanism for block production, where collators produce blocks in a fixed cyclic order, which means equal participation among collators, which is not what we need. We need custom consensus, which might combine pieces of any of them, but follow our custom logic.

Actually, we can. It's the way of the parachains:

The practical example linked (collator-selection-pallet), works like that. On each session collator-selection-pallet selects top 5 most staking collators, and then puts them into AuRa, which produces blocks in round-robin fashion. It doesn't matter much whether we use AuRa or BABE for this, they work similarly.

Okay, I can see the problem with this document and the missing component in this approach. It's not talking about the pallet that will use Aura, and it gives the impression that Aura is all we need. If I remember correctly, you mentioned in our previous meeting that we can basically use Aura without needing to implement anything extra. If Aura could or should be used as a post-selection and coordination mechanism, then fine. However, please provide information about the layer that will define the set of collators to be passed to Aura, for example.

th7nder commented 5 months ago

...A decision about which collator produces a block and communicates with the relay chain boils down to block authoring algorithm. Example block authoring algorithms are: BABE, AuRa....

Again, we are coming back to the same confusion. As described in the official documentation:

BABE (Blind Assignment for Blockchain Extension) is the block production mechanism that runs between the validator nodes and determines the authors of new blocks.

Yeah, I phrased that badly. The idea is that AuRa/BABE doesn't matter, they're just a tool. We'll be passing them a set of validators (collators) from another pallet (like collator-selection-pallet, which will decide the power and so on) - and then they'll do round robin/random VRF.

th7nder commented 5 months ago

I think that mistake is thinking that the amount of storage should not affect the result of consensus. The biggest players of the systems should be the ones being constantly checked. As an addition there should be some number of small guy in the group so that they have incentive to not lie because the small guys would always validate their actions. By that I think we could say that each storage provider is also a collator. The specific collator selection could be made on some specific interval. Based on the size of the storage and quality of it provided to the network in previous iterations.

They are being checked by the relay chain and it does not matter how much power they have. The Validators are checking them, as the Parachain logic is included in the WASM runtime that they are executing.

The idea behind adding a set of smaller guys in the set is because you assume that there is a possible way for the big guys to communicate in advance to lie and cheat the system. If you add some set of random smallest guys, the big guys action is not impactful to the system, In our case that the big guys are the ones bringing largest amounts of storage.

That's totally correct. In the case of the parachain, the risk is that Storage Providers acting as Collators would censor other transactions, preventing smaller players from winning any deals. But that's the most they can do, as 'lying' will be detected by the Validators in the Relay Chain.

You don't want to incentivize bringing no storage and a lot of funds. With that approach you get a slow and unreliable storage network.

That's right, being a Collator on its own and bringing lots of funds is not incentivized, as Collators earn only a fraction from block rewards compared to what they need to pay to run the machines.

You basically need storage power because you want an incentive for storage providers to provide as much storage as possible and that storage needs to be reliable and fast.

And this incentive is earning tokens from Storage Deals with Clients by providing Proofs every time they are required to. If they do not, their tokens are slashed.

th7nder commented 5 months ago

We established that we'll use the economic model of the system parachains, based on Polkadot's RFC-0007.

Hence we'll use the collator-selection pallet, which implements this mechanism.

Additionally, we'll consider a tipping mechanism in the future (#73).