lightningdevkit / rust-lightning

A highly modular Bitcoin Lightning library written in Rust. It's rust-lightning, not Rusty's Lightning!

Document consistency and availability assumption of watchtower infrastructure #604

Open ariard opened 4 years ago

ariard commented 4 years ago

Getting a fail-safe, highly robust watchtower model is harder than expected. As we encounter issues while moving forward, we should keep track of them and build a consistent model step by step.

See https://github.com/rust-bitcoin/rust-lightning/pull/597#discussion_r411795106

TODO: link #watchtower slack discussion on quorum-vs-consensus alternatives.

ariard commented 4 years ago

One of your monitor instances may crash silently for a while, and you might not learn of it if you don't call update_monitor() because there are no channel updates. In reaction, your implementation of ManyChannelMonitor might preemptively force-close the channels at risk.

See also https://github.com/rust-bitcoin/rust-lightning/pull/667#pullrequestreview-472057299
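
One way to notice a silently crashed monitor without relying on update_monitor() traffic is an out-of-band heartbeat. The sketch below is only an illustration under that assumption: `MonitorLiveness`, `record_heartbeat`, and `stale_monitors` are hypothetical names, not part of rust-lightning, and what to do with a stale monitor (for example, force-closing channels at risk) is left to the caller.

```rust
use std::collections::HashMap;

/// Hypothetical liveness tracker (not a rust-lightning type).
/// Records when each monitor instance was last heard from.
pub struct MonitorLiveness {
    last_seen_secs: HashMap<u32, u64>,
    max_silence_secs: u64,
}

impl MonitorLiveness {
    pub fn new(max_silence_secs: u64) -> Self {
        MonitorLiveness { last_seen_secs: HashMap::new(), max_silence_secs }
    }

    /// Called whenever a monitor reports in, whether or not a channel update happened.
    pub fn record_heartbeat(&mut self, monitor_id: u32, now_secs: u64) {
        self.last_seen_secs.insert(monitor_id, now_secs);
    }

    /// Returns the ids of monitors that have been silent for longer than the limit,
    /// sorted so the result is deterministic.
    pub fn stale_monitors(&self, now_secs: u64) -> Vec<u32> {
        let mut stale = Vec::new();
        for (id, seen) in &self.last_seen_secs {
            if now_secs.saturating_sub(*seen) > self.max_silence_secs {
                stale.push(*id);
            }
        }
        stale.sort_unstable();
        stale
    }
}
```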

ariard commented 4 years ago

Watchtower Alice receives block 100, broadcasts state X, and rejects state Y. Watchtower Bob accepts state Y, receives block 100, and broadcasts state Y. State Y confirms onchain. Alice must be able to claim outputs. State Y is rejected by watchtower coordinator Caroll; the secret for state X isn't released.

See also https://github.com/rust-bitcoin/rust-lightning/pull/667#discussion_r477570428

TheBlueMatt commented 4 years ago

Right, I think the only way to solve that pattern is to have some kind of consensus on when to broadcast a transaction - if you can't get a majority of watchtowers to agree to halt updates, then you shouldn't be able to broadcast a transaction as otherwise a majority of watchtowers could revoke the now-broadcast transaction.

devrandom commented 4 years ago

I think there are two separate animals here.

I agree with @TheBlueMatt - the HSMs embedded in each channel-monitor must achieve consensus in order to move the state forward (e.g. revoke an old state). This can be achieved with a majority voting scheme.

ariard commented 4 years ago

I like the distinction, but note that usually LN folks have used private/public watchtower for the trusted-vs-untrusted deployment. Though it hasn't been done with that much rigor. You may delegate running one instance of your distributed channel-monitor to a third-party, you may have out-of-band remedies against them, that's up to you.

That said, I think that less-than-unanimity to move state forward (i.e. accept ChannelMonitorUpdateStep::LatestLocalCommitmentTxInfo) is unsafe, as otherwise a subset of your monitors may broadcast previous, now-revoked states. For accepting remote commitment transaction updates, a majority is enough, as there is no toxicity involved.
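
The rule above can be sketched as a small acceptance check. This is a minimal illustration, not rust-lightning code: `UpdateKind` and `update_accepted` are hypothetical names, and the sketch assumes the coordinator already knows how many monitors acknowledged an update.

```rust
/// Hypothetical classification of an update (not the rust-lightning enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UpdateKind {
    /// A new local commitment: old local states become revoked ("toxic").
    LocalCommitment,
    /// A new remote commitment: nothing a monitor holds becomes toxic.
    RemoteCommitment,
}

/// Decides whether an update may be considered accepted, given how many of
/// `total` monitors acknowledged it.
pub fn update_accepted(kind: UpdateKind, acks: usize, total: usize) -> bool {
    if total == 0 || acks > total {
        // No monitors, or an inconsistent count: never accept.
        return false;
    }
    match kind {
        // Unanimity: every monitor must know the old state is revoked.
        UpdateKind::LocalCommitment => acks == total,
        // Strict majority is enough.
        UpdateKind::RemoteCommitment => acks * 2 > total,
    }
}
```

With three monitors, two acknowledgements would accept a remote commitment update but not a local one.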

TheBlueMatt commented 4 years ago

I think after a bunch of back and forth @ariard and I are on a similar page - there are really two supported modes here (or should be) - either you get majority consensus of your monitors for each action (including broadcasting) or you get 100% consensus of your monitors for updates (but any one monitor can broadcast on its own). Still, #679 makes it easier to build the second since it avoids the need to do complicated pre-consensus, allowing you to simply apply new updates to monitors and wait until all monitors have verified they've applied the update before moving the channel forward.
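
A rough sketch of the second mode (apply the update everywhere, then wait for every monitor to confirm before moving the channel forward) might look like the following. `PendingUpdate` and its methods are hypothetical names for illustration only; how confirmations reach the coordinator is assumed and not shown.

```rust
use std::collections::HashSet;

/// Hypothetical tracker for one in-flight update (not a rust-lightning type).
pub struct PendingUpdate {
    update_id: u64,
    /// Monitors that have not yet confirmed they applied the update.
    awaiting: HashSet<u32>,
}

impl PendingUpdate {
    pub fn new(update_id: u64, monitor_ids: &[u32]) -> Self {
        PendingUpdate { update_id, awaiting: monitor_ids.iter().copied().collect() }
    }

    pub fn update_id(&self) -> u64 {
        self.update_id
    }

    /// Records that `monitor_id` applied the update. Unknown or repeated ids are
    /// ignored. Returns true once every monitor has confirmed.
    pub fn record_applied(&mut self, monitor_id: u32) -> bool {
        self.awaiting.remove(&monitor_id);
        self.can_advance()
    }

    /// The channel may only move forward when no monitor is still outstanding.
    pub fn can_advance(&self) -> bool {
        self.awaiting.is_empty()
    }
}
```

Note that in this sketch a single unresponsive monitor blocks the channel indefinitely, which is the availability cost of requiring 100% consensus on updates.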

ariard commented 4 years ago

Describe responsibilities between a) off-chain manager b) monitor coordinator c) per-channel monitor.

ariard commented 4 years ago

For remote watchers, we may have race conditions between learning of a revocation secret and of a counterparty commitment transaction, as the former can happen after the latter. Verify or do something.

ariard commented 4 years ago

Document different internal monitor backup strategies.

See https://github.com/rust-bitcoin/rust-lightning/pull/681#discussion_r498453411

ariard commented 4 years ago

We may have hints that the coordinator is buggy or compromised based on our local state. If this happens, we may go onchain to avoid further risk.

See https://github.com/rust-bitcoin/rust-lightning/pull/681#discussion_r499101574