chainflip-io / chainflip-backend

The Chainflip backend repo, including the Chainflip Node and CFE.
50 stars 15 forks source link

[CH-1413] Signing module should report parties failing to sign a message to SC #222

Closed msgmaxim closed 2 years ago

msgmaxim commented 3 years ago
dandanlen commented 3 years ago

Adding the state chain label since I think this is currently not supported.

kylezs commented 3 years ago

@msgmaxim @j4m1ef0rd Can this be closed? Edit: NVM, this includes the SC work, and this isn't supported yet.

morelazers commented 3 years ago

@dandanlen is State Chain support for this coming in #414 ?

dandanlen commented 3 years ago

Yes, that's the plan, although it might require some extra work to integrate with reputation / liveness in order to have any effect.

dandanlen commented 3 years ago

I've created a clubhouse story and updated the issue - @andyjsbell just to confirm - this will require additional work on vaults / validators, right?

andyjsbell commented 3 years ago

Left comment in CH_1859

This is already part of the vaults request/response process flow and the bad_validators are reported back to the auction pallet in which they are skipped in the next auction.

dandanlen commented 3 years ago

This is about signing in general, not just in the context of a keygen / vault rotation.

andyjsbell commented 3 years ago

oops, ok. Could we get more detail on intentions and I can comment? :smile:

dandanlen commented 3 years ago

When a threshold signature fails, the Signer reports the guilty parties, and we need to ensure that we act accordingly.

Currently we have OfflineConditions::ParticipateSigningFailed so I guess we can report this, but it doesn't have any effect other than lowering the reputation of the offenders. We need to somehow tie this in to how we select signers for a given threshold signature ceremony.

Either we rely on the reputation system for this (i think this is the plan?), or we explicitly exclude offenders from upcoming signing sessions.

Either way, we need some way of selecting a set of signers.

I think this relates to the other issue #445 around selecting a signer node.

morelazers commented 3 years ago

Certainly the plan is to rely on the reputation / online system. Meeting any of the Offline Conditions should result in the Validator's status being set to Offline, and them being ineligible for selection for signing ceremonies and output broadcasts until they are considered "Online" again by whichever pallet is governing the Validator's Online status.

We should assume that a failure to complete your responsibility is a result of you being offline, which should result in a Reputation hit.

andyjsbell commented 3 years ago

We currently are aware of a validator's liveness and the number of reputation points they have and this results in slashing if negative. We can 'ban' a validator for a period of time, which I assume is in the intention here, by setting their reputation points to a negative amount and excluding them in actions as those described here until they have a positive balance of reputation points. This would also mean they would be slashed during this period.

Are we maybe looking for some more control here for 'banning' a validator? I think using the term here 'offline' is misleading as we really want to exclude the validator for a set period of time from certain aspects of the network and they aren't offline as such as they would be submitting heartbeats to earn reputation. Reputation(Slashing) and liveness are two other concerns which should not necessarily muddied with banning.

morelazers commented 3 years ago

Yes we want to timeout the Validator without being forced to set their Reputation to negative.

dandanlen commented 3 years ago

Just want to add that the drawback of the reputation penalty -> offline mechanism is that the validator's status doesn't change until the next block when the hook is triggered to check their reputation levels. So unless we explicitly ban a validator, they can still be selected during the same block.

andyjsbell commented 3 years ago

Good point. We need to have another measure, which we can term "ban", for a said period and is immediate in effect.

morelazers commented 3 years ago

IMO this is not a drawback of the proposed mechanism, but a drawback of its implementation via an interval, but I think I've spent enough time on that soapbox already.

dandanlen commented 3 years ago

Haha yes.

Actually I think I'm wrong in that reputation doesn't cause a validator to go offline, it causes a validator to be slashed, which is a separate thing. At the moment the online/offline status purely determined by heartbeats.

Anyhow... As I've mentioned before I think the issue stems from confounding online/offline with being banned/untrusted. We need to track the former in order to correctly handle heartbeats (and this does require a hook/interval to check for the absence of a heartbeat). But it should be possible to be 'online' (ie. submitting heartbeats and re-building your reputation) yet nonetheless excluded from participating in signing.

Currently we track the online/offline status for heartbeats but committing an offense doesn't impact this (correct, @andyjsbell ?) - and IMO it shouldn't otherwise it will interfere with heartbeats.

I agree we ought to have an explicit state banned that is triggered when a node commits an offense and reverted after some predetermined interval, or when reputation recovers above some level.

Also IMO "failing to submit a heartbeat" should be an offense and triggered explicitly like any other offense via the report method.

andyjsbell commented 3 years ago

Anyhow... As I've mentioned before I think the issue stems from confounding online/offline with being banned/untrusted. We need to track the former in order to correctly handle heartbeats (and this does require a hook/interval to check for the absence of a heartbeat). But it should be possible to be 'online' (ie. submitting heartbeats and re-building your reputation) yet nonetheless excluded from participating in signing.

Yes, I think we concur on this one :+1:

Currently we track the online/offline status for heartbeats but committing an offense doesn't impact this (correct, @andyjsbell ?) - and IMO it shouldn't otherwise it will interfere with heartbeats.

Yes. As my point above we should treat liveness and penalties separately and include a state of ban as a period of time the validator can't contribute to the protocol. They would be liked to a backup validator, they would only be able to submit heartbeats to reputation, could this mean they only receive the same rewards as a backup validator?

Also IMO "failing to submit a heartbeat" should be an offense and triggered explicitly like any other offense via the report method.

Maybe the fact they lose 15 reputation points for not doing so is enough as a penalty but there are possibilities to say report an offence for being let's say 10 intervals offline. Just an idea.

dandanlen commented 3 years ago

Yes. As my point above we should treat liveness and penalties separately and include a state of ban as a period of time the validator can't contribute to the protocol. They would be liked to a backup validator, they would only be able to submit heartbeats to reputation, could this mean they only receive the same rewards as a backup validator?

Pretty sure the answer to this is no - they don't change status, they keep earning the same rewards for being a validator, and we expect them to continue witnessing etc, but they won't be selected for signing ceremonies, and the might be slashed, depending on their reputation.

Also IMO "failing to submit a heartbeat" should be an offense and triggered explicitly like any other offense via the report method.

Maybe the fact they lose 15 reputation points for not doing so is enough as a penalty but there are possibilities to say report an offence for being let's say 10 intervals offline. Just an idea.

What i'm saying is that instead of treating missing a heartbeat as a special case (ie. applying the penalty directly), it should be one of the OfflineConditions with an associated penalty (15 points or whatever).

morelazers commented 3 years ago

Kindly reference (or comment on) the Notion documents re. this issue. That is where the mechanism specification is currently fleshed out.

Reputation Penalties[1] are the only things that should cause a Validator to lose rep directly.

Offline Conditions[2] should force the Validator into the TemporarilyBanned state (name TBD) where they are effectively considered "Offline" by the protocol in the context of their active duties (and suffer rep penalties as a result of the "Offline" condition in [1]).

[1] https://www.notion.so/chainflip/DRAFT-Reputation-8ce9f16c3eae49d0ba50e887497fb017#b7f739c9663e4ab39136b1c125294504

[2] https://www.notion.so/chainflip/DRAFT-Offline-Conditions-bac2c334234f4f74bdb0197073e31186#922af7f3c69340758e8951a851442966

dandanlen commented 3 years ago

Ok.

In the Notion we have these offline conditions:

In the implementation we have these:

/// Conditions as judged as offline
#[derive(Encode, Decode, Clone, PartialEq, Eq, RuntimeDebug)]
pub enum OfflineCondition {
    /// A broadcast of an output has failed
    BroadcastOutputFailed,
    /// There was a failure in participation during a signing
    ParticipateSigningFailed,
    /// Not Enough Performance Credits
    NotEnoughPerformanceCredits,
    /// Contradicting Self During a Signing Ceremony
    ContradictingSelfDuringSigningCeremony,
}

Offline conditions have associated penalties. We also deduct penalty points when a heartbeat is missed.

Ignoring for a moment that that the above lists don't match up...

I'm suggesting harmonising the approach to make things more consistent - adding MissedHeartbeat to the offline conditions instead of treating it as a special case.

morelazers commented 2 years ago

Closing due to the completion of all checklist items in the OP.