keorn commented 7 years ago

https://github.com/paritytech/parity/wiki/Validator-Set#contracts

snd commented 6 years ago

since ValidatorSet.getValidators() can grow/shrink dynamically a fixed requiredSignatures doesn't make sense anymore.

we could replace requiredSignatures by majorityPercentage (66 for example) and change the check for enough signatures to: signatures.length * 100 >= ValidatorSet.getValidators().length * majorityPercentage
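a minimal sketch of that check, in python rather than contract code. the function name and the idea of passing signer addresses (instead of raw signatures) are assumptions made for illustration; only signers that are in the current validator set are counted, and each is counted once.

```python
def has_enough_signatures(signers, validators, majority_percentage=66):
    """Integer-only majority check, as a contract without floats would do it.

    `signers` are the addresses recovered from the submitted signatures
    (hypothetical input shape); `validators` is the current result of
    ValidatorSet.getValidators().
    """
    validator_set = set(validators)
    # Ignore duplicates and signers that are not current validators.
    valid_signers = set(signers) & validator_set
    return len(valid_signers) * 100 >= len(validator_set) * majority_percentage
```

with 5 validators and majorityPercentage = 66, 3 signatures (60%) are not enough and 4 (80%) are.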

snd commented 6 years ago

the deposit case should be fairly straightforward:

since the validator set lives on foreign_chain and transactions on foreign_chain are free for authorities, we can have authorities retry ForeignBridge.deposit (whenever there is no ForeignBridge.Deposit event for that deposit hash after n blocks) until the deposit has been signed by at least majorityPercentage of the authorities in the current validator set.

retrying ForeignBridge.deposit until the ForeignBridge.Deposit event for that deposit shows up would also ensure the deposit gets relayed in case ForeignBridge.deposit transactions don't get mined

once Aura's finality is exposed over RPC (parity_lastFinalized) we can have authorities retry ForeignBridge.deposit every time n further blocks were finalized that don't contain the matching ForeignBridge.Deposit. n could be 1. n would be a trade-off between relay time and load on bridge processes
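the retry rule could look roughly like this. it is only a sketch: the function and parameter names are made up for illustration, and the finalized block numbers are passed in as plain integers instead of being fetched over RPC.

```python
def should_retry_deposit(deposit_event_seen, last_attempt_finalized_block,
                         last_finalized_block, n=1):
    """Decide whether an authority should send the deposit transaction again.

    deposit_event_seen: True once the matching Deposit event was observed.
    last_attempt_finalized_block: last finalized block when we last tried.
    last_finalized_block: last finalized block right now.
    n: how many further finalized blocks to wait between attempts.
    """
    if deposit_event_seen:
        # The deposit was relayed; nothing left to do.
        return False
    # Retry once n further blocks were finalized without the event.
    return last_finalized_block - last_attempt_finalized_block >= n
```

a small n relays faster but makes every bridge process send more transactions.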

snd commented 6 years ago

so far i see two approaches to solving the withdraw case:

1) involves relaying all changes to the validator set from foreign to home so HomeBridge stays up to date on what to trust (cons: complex, a bit brittle, multiple race conditions to be solved, relay costs)

2) involves assembling a zero-knowledge proof on ForeignBridge so HomeBridge can verify withdraws without having to know the current validator set on foreign (cons: relies on code generation (opaque), ceremony required to keep lambda secret)

i'll write more soonish

snd commented 6 years ago

search for "validator set chain" to skip to the proposed solution

currently the bridge uses equal fixed validator sets on both sides. MainBridge can verify/trust things that are signed by a majority of that validator set.

this doesn't work anymore if the set on side can change dynamically

MainBridge needs some other information to verify incoming messages.

static information. no relay

it would be nice and simple if we could give MainBridge some piece of static information it can use to verify messages coming from side. then we would not need to relay any changes from side to main.

one could probably achieve this with zk-proofs, but that's complex, requires a ceremony and is not feasible.

we need to update the information MainBridge uses to verify incoming messages.

any new information needs to be verifiable by the previous information

assumptions

side is a PoA network using a dynamic validator set. side.ValidatorSet is the dynamic validator set of side. side.SideBridge is the bridge contract on side and uses the dynamic side.ValidatorSet.

main is the ethereum foundation chain (mainnet). main.ValidatorSet is a somehow synced (explained below) copy of side.ValidatorSet. main.MainBridge is the bridge contract on main and uses main.ValidatorSet.

MainBridge.withdraw(signatures, message) checks that signatures contains signatures on message by a majority of addresses in main.ValidatorSet.getValidators().

naive solution

have a process that listens to all changes to side.ValidatorSet. if a change is detected the current validator set signs off on the change. the change is relayed to main.ValidatorSet similarly to how withdraws currently get relayed. main.ValidatorSet can verify that the new validator set is signed off by the previous one (which it knows).

this introduces race conditions.

imagine that the bridge processes are offline for long enough (failure, ddos, etc) such that main.ValidatorSet has diverged from side.ValidatorSet and the majority of side.ValidatorSet that can sign off on the change is no longer a majority of main.ValidatorSet. main.ValidatorSet no longer accepts any changes. the bridge is stuck forever.

we need a better solution.

"validator set chain"

let validatorSetNumber = side.ValidatorSet.validatorSetNumber (inspired by "blockNumber") be a number increased by 1 for each change to side.ValidatorSet. let side.ValidatorSet.validatorSetNumber initially be 0.

let side.ValidatorSet.validatorSet(validatorSetNumber) be the validator set at validatorSetNumber.

let hash be keccak256.

let sign(address, message) be the signature of address on message.

let handoff(validatorSetNumber) = side.ValidatorSet.validatorSet(validatorSetNumber).map(|x| sign(x, hash(side.ValidatorSet.validatorSet(validatorSetNumber + 1)))) be the set of signatures of the validator set at validatorSetNumber on the validator set at validatorSetNumber + 1. in other words: the current validator set signing off on the next validator set.

assume main.ValidatorSet knows side.ValidatorSet.validatorSet(validatorSetNumber). if some untrusted party provides main.ValidatorSet with side.ValidatorSet.validatorSet(validatorSetNumber + 1) alone, then main.ValidatorSet has no way of verifying it.

if the untrusted party also provides handoff(validatorSetNumber) then main.ValidatorSet can easily verify side.ValidatorSet.validatorSet(validatorSetNumber + 1): it checks that a majority of the set it already knows signed the hash of the new set.

main.ValidatorSet.validatorSetNumber is the number of the validator set last received via handoff on main.

obviously for any main.ValidatorSet.validatorSetNumber main.ValidatorSet only accepts updates that contain handoff(main.ValidatorSet.validatorSetNumber).

handoffs chain. each handoff builds on the previous.
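a toy model of the chain, to make the definitions above concrete. everything here is a sketch under assumptions: signatures are modelled as (signer, message) pairs instead of real ECDSA signatures, hashlib.sha3_256 stands in for keccak256 (they are not the same function), the 66 percent threshold is borrowed from the majorityPercentage idea, and the class and function names are made up.

```python
import hashlib


def set_hash(validators):
    # Stand-in for keccak256 over the encoded validator set.
    # NOTE: sha3_256 is NIST SHA-3, not Ethereum's keccak256.
    return hashlib.sha3_256(",".join(sorted(validators)).encode()).hexdigest()


def sign(address, message):
    # Toy signature: just records who signed what. A real contract would
    # recover the signer from an ECDSA signature instead.
    return (address, message)


def handoff(current_set, next_set):
    # The current validator set signing off on the next validator set.
    message = set_hash(next_set)
    return [sign(validator, message) for validator in current_set]


class MainValidatorSet:
    def __init__(self, initial_set, majority_percentage=66):
        self.validators = set(initial_set)
        self.validator_set_number = 0
        self.majority_percentage = majority_percentage

    def apply_handoff(self, next_set, signatures):
        expected = set_hash(next_set)
        # Count distinct signers from the set we currently trust.
        signers = {signer for (signer, message) in signatures
                   if message == expected and signer in self.validators}
        if len(signers) * 100 < len(self.validators) * self.majority_percentage:
            return False
        self.validators = set(next_set)
        self.validator_set_number += 1
        return True


def replay_handoffs(main, history):
    # Relay every handoff main has not seen yet, oldest first.
    # history[i] is the validator set at validatorSetNumber i; a real relay
    # would use stored handoff signatures instead of creating them here.
    while main.validator_set_number + 1 < len(history):
        n = main.validator_set_number
        if not main.apply_handoff(history[n + 1], handoff(history[n], history[n + 1])):
            break
    return main.validator_set_number
```

in this model a handoff from set 1 to set 2 is rejected while main is still at set 0 (unless enough signers overlap), but calling replay_handoffs with the full history walks main forward one set at a time.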

on each withdraw relay from side to main the validatorSetNumber is transmitted. bridge processes would wait to call withdraw until main.ValidatorSet.validatorSetNumber is high enough (validator set change has been relayed).

care must be taken that MainBridge doesn't accept messages signed under just any past validatorSetNumber. that would open up an attack where a malicious actor who gets access to the secrets of any past validator set can take control of the bridge (make up arbitrary messages that look like they got relayed by the current validator set). more work is required. a strict solution would be to accept only those messages where validatorSetNumber == main.ValidatorSet.validatorSetNumber and retry (collect signatures again) all others.
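the strict variant, combined with the "wait until main has caught up" rule, boils down to a three-way decision. a sketch only; the function name and the returned labels are made up for illustration.

```python
def withdraw_action(message_set_number, main_set_number):
    """What a bridge process does with a signed withdraw message.

    message_set_number: validatorSetNumber the signatures were collected under.
    main_set_number: main.ValidatorSet.validatorSetNumber right now.
    """
    if message_set_number == main_set_number:
        # Signed by the set main currently trusts: safe to call withdraw.
        return "relay"
    if message_set_number > main_set_number:
        # The validator set change has not been relayed to main yet.
        return "wait"
    # Signed under an outdated set: collect signatures again.
    return "collect-again"
```

accepting only the equal case means keys of past validator sets are useless to an attacker, at the price of re-signing withdraws that straddle a validator set change.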

if the relayer for the validator set updates goes down it can simply replay the entire history of handoffs once it goes back up. this can't get stuck!

if the process relaying the changes to the validator set goes down it simply pauses the bridge. once the handoffs have been relayed the bridge can resume.

pros

- would work
- robust as it entirely eliminates race conditions
- fairly elegant

cons

requires modifications to side.ValidatorSet
- adding validatorSetNumber
- adding validatorSet(validatorSetNumber)
- requires keeping a history of all validator sets
- adding handoff(validatorSetNumber)
- requires collecting the handoffs

open questions

transaction on main that relays a single handoff requires mainnet ether funds to pay for gas.

who pays for gas?

since the validator set doesn't change too often we can run the relay and top it up if needed.

rphmeier commented 6 years ago

> requires collecting the handoffs

In particular, this will require writing some kind of daemon for validators to run to create the signatures.

The double-handoff problem you touch on briefly is a little worse than you describe:

Accepting the first handoff relayed to the home chain as canonical causes a race between malicious foreign-chain validators and the rest of the world. Instead, handoffs for an epoch should be accepted for some time period after the first one, to reduce timing constraints.

There might be an incentive misalignment similar to the nothing-at-stake problem: the message can only safely be relayed once the change of validators on the foreign chain has been finalized. But once that change is finalized, those validators no longer have an obligation to behave honestly. So there is little to stop them from handing off to a bad set or not handing off at all. What we could do is:

- allow anyone to relay handoffs to the home chain within K home-chain blocks of the first. i.e. if the first handoff is relayed at block N, the contract will still be listening for handoffs until block N + K.
- allow any publicly known handoffs to be published to the foreign chain's validator contract for the next LOGOUT_PERIOD blocks, where LOGOUT_PERIOD is designed to be much longer than K (accounting for block time differences -- might be hard)
- validators from the last period cannot withdraw their stake until LOGOUT_PERIOD blocks have passed on the foreign chain. if more than one handoff is published, one must be invalid and they must have misbehaved. in this case, they will not be allowed to withdraw.
- provide some kind of reward to the relayer of a good handoff to the home chain.
- provide some kind of bounty to the reporter of a bad handoff to the foreign chain.
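The first three rules can be sketched as follows. This is a rough model, not contract code: the class, method and parameter names are hypothetical, handoffs are identified by the hash of the set they hand off to, and treating block N + K as still inside the window is an assumption.

```python
class HandoffWindow:
    """Tracks handoffs relayed to the home chain for one validator set change."""

    def __init__(self, k):
        self.k = k                 # window length in home-chain blocks
        self.first_block = None    # block N of the first relayed handoff
        self.seen = set()          # distinct handoffs (by next-set hash)

    def submit(self, block_number, next_set_hash):
        if self.first_block is None:
            # The first handoff opens the window at block N.
            self.first_block = block_number
        elif block_number > self.first_block + self.k:
            # Window closed: too late to relay a competing handoff.
            return False
        self.seen.add(next_set_hash)
        return True

    def conflicting(self):
        # More than one distinct handoff means the old validators misbehaved.
        return len(self.seen) > 1


def may_withdraw_stake(blocks_since_change, logout_period, distinct_handoffs_published):
    # Stake stays locked for LOGOUT_PERIOD foreign-chain blocks and is
    # forfeited if more than one handoff was published in that time.
    return blocks_since_change >= logout_period and distinct_handoffs_published <= 1
```

The rewards and bounties from the last two rules are left out, since their size and funding are not specified.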

I think this establishes an equilibrium where the validators are incentivized to create only the single valid handoff. Unfortunately, it also creates a delay of K blocks before the foreign validator set is updated on the home chain.

I haven't thought of a recovery strategy for what can happen if two competing (but both "valid") handoffs are relayed to the home chain. Followers of the foreign chain can reconcile the fork under a weakly-subjective security model, as most of the network will already have finalized the first handoff and will refuse to reorganize to the second. But the main chain is incapable of checking finality proofs on the foreign chain, so it can't reconcile the fork. Even if we had a smart contract light client of the foreign chain, I don't think the issue would be solved -- both forks are "at-a-glance" valid and it becomes a race between the malicious validators and the rest of the world again.

Given that this scenario can only occur if the current validators are irrational (at least according to this equilibrium), I think we can ignore it for now. But a mitigation strategy would still be really useful in case the equilibrium doesn't hold against second-order concerns.

paritytech / parity-bridge

Authorities should depend on a ValidatorSet contract #33
