Soft opt-out

jtremback commented 1 year ago

Protocol Change Proposal

Summary

This specifies a modification to the protocol which allows the bottom x% of the validator set by power to opt out of validating consumer chains without being jailed or otherwise punished for it.

Problem Definition

This mechanism will alleviate a lot of the concerns about Replicated Security (RS) not being profitable enough for small validators to run many consumer chains, without changing any core assumptions of RS.

Some quick napkin math:

The smallest validator in the Hub active set currently has 36,459 ATOM delegated to it. At a 5% commission rate, this validator may make 414 ATOM per year in commission. At $12 per ATOM, that’s around $5,000 per year, or $416 per month. With Cosmos nodes having typical operating expenses of $200-$600 per month, running even one consumer chain could make this small validator unprofitable.

The smallest validator in the top 80% has 1,126,938 ATOM staked, for an estimated $155,517.44 in annual commissions, or $12,959.78 per month. Running extra consumer chain nodes with this budget is a very different calculation.
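
The napkin math can be reproduced with a few lines of Go. This is only a sketch: the helper names are made up for illustration, and it assumes that every node (the Hub node and each consumer chain node) costs the same to run, which the post does not state. Note that the unrounded monthly figure for the small validator is $414; the $416 above comes from rounding the annual figure up to $5,000 first.

```go
package main

import "fmt"

// monthlyUSD converts a yearly commission in ATOM into USD per month.
func monthlyUSD(atomPerYear, usdPerAtom float64) float64 {
	return atomPerYear * usdPerAtom / 12
}

// affordableNodes returns how many nodes a monthly income covers,
// assuming (hypothetically) that every node costs the same to run.
func affordableNodes(incomeUSD, costPerNodeUSD float64) int {
	return int(incomeUSD / costPerNodeUSD)
}

func main() {
	small := monthlyUSD(414, 12) // smallest active-set validator, figures from the post
	large := 12959.78            // monthly figure quoted in the post for the top-80% cutoff

	// $200 and $600 are the ends of the operating-expense range from the post.
	for _, cost := range []float64{200, 600} {
		fmt.Printf("at $%.0f per node: small covers %d node(s), large covers %d node(s)\n",
			cost, affordableNodes(small, cost), affordableNodes(large, cost))
	}
}
```

Since the Hub node itself counts as one node, a result of 2 means "the Hub plus one consumer chain", and a result of 0 means the income does not even cover the Hub node.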

Proposal

To make this work, I think that all that needs to be done is to modify the HandleSlashPacket method on the provider. After the validator struct is obtained here, a calculation needs to be run to determine the validator's position in the set. If the validator is in the bottom x% of the set by power, the handler should log a message and return early.
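
A minimal sketch of the position calculation, under one possible reading of "bottom x% by power": walk the validators from smallest to largest and treat a validator as opted out while the running sum of power stays at or below x% of the total. The `Validator` type and `inBottomPercent` function are hypothetical stand-ins, not the provider module's actual types.

```go
package main

import (
	"fmt"
	"sort"
)

// Validator is a hypothetical, simplified stand-in for the provider's validator type.
type Validator struct {
	Addr  string
	Power int64
}

// inBottomPercent reports whether addr belongs to the group of smallest
// validators whose combined power is at most pct percent of the total power.
func inBottomPercent(vals []Validator, addr string, pct int64) bool {
	// Sort a copy in ascending order of power so the caller's slice is untouched.
	sorted := append([]Validator(nil), vals...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Power < sorted[j].Power })

	var total int64
	for _, v := range sorted {
		total += v.Power
	}

	var cum int64
	for _, v := range sorted {
		cum += v.Power
		if cum*100 > total*pct {
			return false // this validator and all larger ones are above the threshold
		}
		if v.Addr == addr {
			return true
		}
	}
	return false // addr is not in the set
}

func main() {
	vals := []Validator{{"A", 50}, {"B", 30}, {"C", 15}, {"D", 5}}
	for _, v := range vals {
		fmt.Printf("%s in bottom 20%%: %v\n", v.Addr, inBottomPercent(vals, v.Addr, 20))
	}
}
```

In the handler, a `true` result would correspond to the "log a message and return early" branch. Integer arithmetic is used to avoid floating-point rounding at the boundary.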

This works because on the consumer, if the QueueSlashPacket method finds that OutstandingDowntime is set, it will not attempt to send another downtime packet. OutstandingDowntime is set the first time the consumer sends a downtime packet and is unset by SlashAcks sent by the HandleSlashPacket method on the provider. But since we have modified this method to exit early if the validator is too small to slash, the SlashAck is never sent, and OutstandingDowntime is never unset.
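
The interaction described above can be illustrated with a toy model. This is not the module's code: `Consumer`, `Provider`, `HandleSlashAck`, and the `belowThreshold` map are hypothetical, and real SlashAcks travel over IBC rather than through a direct function call. Only the names QueueSlashPacket, HandleSlashPacket, and OutstandingDowntime come from the post.

```go
package main

import "fmt"

// Consumer is a toy model of the consumer-side state described above.
type Consumer struct {
	outstandingDowntime map[string]bool // validators with an unacknowledged downtime packet
	sent                int             // number of downtime packets actually sent
}

// QueueSlashPacket sends a downtime packet unless one is already outstanding.
func (c *Consumer) QueueSlashPacket(addr string) bool {
	if c.outstandingDowntime[addr] {
		return false
	}
	c.outstandingDowntime[addr] = true
	c.sent++
	return true
}

// HandleSlashAck clears the flag once the provider acknowledges the packet.
func (c *Consumer) HandleSlashAck(addr string) {
	delete(c.outstandingDowntime, addr)
}

// Provider is a toy model of the provider with the proposed early return.
type Provider struct {
	belowThreshold map[string]bool // validators too small to slash
	jailed         map[string]bool
}

// HandleSlashPacket returns true when a SlashAck should be sent back.
func (p *Provider) HandleSlashPacket(addr string) bool {
	if p.belowThreshold[addr] {
		return false // proposed change: return early, so no jailing and no SlashAck
	}
	p.jailed[addr] = true
	return true
}

func main() {
	c := &Consumer{outstandingDowntime: map[string]bool{}}
	p := &Provider{belowThreshold: map[string]bool{"small": true}, jailed: map[string]bool{}}

	// Both validators are reported as down in three consecutive rounds.
	for round := 1; round <= 3; round++ {
		for _, addr := range []string{"small", "large"} {
			if c.QueueSlashPacket(addr) && p.HandleSlashPacket(addr) {
				c.HandleSlashAck(addr)
			}
		}
	}
	fmt.Println("packets sent:", c.sent, "jailed:", p.jailed)
}
```

In this model the small validator triggers exactly one packet, after which its flag stays set, while the large validator is jailed and can be reported again in every round.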

There is one major edge case: what if a validator avoids being slashed while they are below the threshold and then gets more power and goes above the threshold? I will edit this issue with the solution.

For Admin Use

[x] Not duplicate issue
[ ] Appropriate labels applied
[ ] Appropriate contributors tagged
[ ] Contributor assigned/self-assigned
[ ] Is a spike necessary to map out how the issue should be approached?

mpoke commented 1 year ago

IMO, this could be a good first iteration, but we should allow validators in the bottom x% to opt-out. This means they are not even sent to the consumer chains. Otherwise, the problem would be that the consumers risk not getting the 67% needed for making progress. Imagine x=20%, this means that all the voting power is needed except ~13%.
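
The ~13% figure can be reconstructed as follows, assuming the worst case in which the entire bottom x% (by power) is offline on the consumer chain, and writing d for the additional voting power that can go offline before the chain halts:

$$
(100\% - x) - d \ge 67\% \;\Longrightarrow\; d \le 33\% - x, \qquad x = 20\% \;\Rightarrow\; d \le 13\%.
$$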

Also, validators opting out should not receive rewards. Otherwise, we incentivize Sybil validators, e.g., a large validator will spin up a bunch of smaller validator nodes that get in the bottom x%.

mpoke commented 1 year ago

> This works because on the consumer, if the QueueSlashPacket method finds that OutstandingDowntime is set, it will not attempt to send another downtime packet. OutstandingDowntime is set the first time the consumer sends a downtime packet and is unset by SlashAcks sent by the HandleSlashPacket method on the provider. But since we have modified this method to exit early if the validator is too small to slash, the SlashAck is never sent, and OutstandingDowntime is never unset.
>
> There is one major edge case: what if a validator avoids being slashed while they are below the threshold and then gets more power and goes above the threshold? I will edit this issue with the solution.

For the first iteration, I would just move this logic to the consumer chain. If a validator is in the bottom x%, do not send Slash packets. If the consumer is malicious, then the packets will be dropped on the provider.

jtremback commented 1 year ago

> IMO, this could be a good first iteration, but we should allow validators in the bottom x% to opt-out. This means they are not even sent to the consumer chains. Otherwise, the problem would be that the consumers risk not getting the 67% needed for making progress. Imagine x=20%, this means that all the voting power is needed except ~13%.

I think this could be a good design. It compromises on liveness but not security. I think it's a crisp definition of the feature because it underscores that it's a different thing than true opt-in, which does change the level of security. It is also self-limiting: it is impossible to have more than 33% opt out.

> Also, validators opting out should not receive rewards. Otherwise, we incentivize Sybil validators, e.g., a large validator will spin up a bunch of smaller validator nodes that get in the bottom x%.

Again, for this I think that it's good that the small ones get rewards. We will have to overhaul it for full opt-in. I don't think the incentive to Sybil will be that great. Instead of running a bunch of consumer chain nodes, they have to run a bunch of Hub nodes. It doesn't seem like it will benefit them much.

jtremback commented 1 year ago

> For the first iteration, I would just move this logic to the consumer chain. If a validator is in the bottom x%, do not send Slash packets. If the consumer is malicious, then the packets will be dropped on the provider.

I was thinking the same thing. It also makes the feature a lot faster to ship and iterate on. I'll edit this issue later.

jtremback commented 1 year ago

@mpoke closed in favor of https://github.com/cosmos/interchain-security/issues/784

cosmos / interchain-security

Soft opt-out #765
