ACINQ / eclair

A scala implementation of the Lightning Network.
Apache License 2.0
1.24k stars 266 forks source link

eclair can go below local reserve when updating fees #2899

Closed DerEwige closed 2 months ago

DerEwige commented 2 months ago

eclair: version=0.10.0 commit=a63d2c2

I suffered a force close yesterday. You find the logs and the channel export in the attached.

tuganode_fc.log tuganode.json

It looks like this is what happens: I received an inbound HTLC, this pushed my local funds down to 10’068 sats due to increased commitment This is fine as the reserve is 10’000. Then eclair decides to increase the fees. Which pushes local funds below local reserve. My peer (LND 0.18.x) responded with “invalid commitment” which led to a force closure

This seems to be related to this topic: https://github.com/lightningnetwork/lnd/issues/8249 But still I find it unwise of eclair to push below fee reserve with a fee update.

t-bast commented 2 months ago

But still I find it unwise of eclair to push below fee reserve with a fee update.

Eclair does not send an update_fee that would make it go below the reserve. The issue is rather the lnd issue that you linked: when eclair generates the update_fee, it is perfectly valid and doesn't make it go below reserve. But lnd is concurrently adding an HTLC, and it's the combination of the outgoing update_fee and the incoming update_add_htlc that makes eclair go below its reserve.

This is exactly the scenario why lnd must allow eclair to go below the reserve: since the protocol is asynchronous, each peer cannot know that the other peer is preparing a concurrent update, so there is no way to avoid entering that state. Thus we should allow being in that state instead of force-closing like lnd does, which is wasteful.

DerEwige commented 2 months ago

If the inbound HTLC and fee update would have happened in parallel I would agree with you.

But in this instance we received the inbound UpdateAddHtlc 0.2s before eclair sent out the UpdateFee. It is not the case that those 2 messages where on the wire simultaneous.

Or am I missing something?

t-bast commented 2 months ago

But in this instance we received the inbound UpdateAddHtlc 0.2s before eclair sent out the UpdateFee. It is not the case that those 2 messages where on the wire simultaneous.

That doesn't mean that eclair hasn't already gone through the process of creating the update_fee. This has to be what happens, because otherwise eclair wouldn't create the update_fee because of this check:

https://github.com/ACINQ/eclair/blob/c440007b527442921c82d170b73d8a3de08f897a/eclair-core/src/main/scala/fr/acinq/eclair/channel/Commitments.scala#L566

DerEwige commented 2 months ago

So eclair only registers the inbound htlc once the flow for it finished. So eventhough the messages did not go out in parallel the work flows went in parallel

I believe I've marked the messages belonging to the added HTLC in the logs

image

So it took 0.3s to process all the messages related to the added HTLC and in this time window eclair decided to update the fee. Got it

ziggie1984 commented 2 months ago

We will have this check in LND 19, however I want to point out that eclair could also look ahead in their msg queue for non locked in update_htlc_msges and account for them as well when trying to send a fee_update, or would this not have prevented the problem ?

ziggie1984 commented 2 months ago

moreover regarding the incoming htlc which might dip the remote below the reserve, does eclair check that the incoming htlc does pay more that the additional fee-costs, or is there no attack vector in it ? I mean at the end both peers what the channel to work, adding a bunch of incoming htlcs just to drain the remote reserve seems to be not in line with the incentives I guess ...

t-bast commented 2 months ago

We will have this check in LND 19

Great, thanks!

eclair could also look ahead in their msg queue for non locked in update_htlc_msges and account for them

This is already partially done, and doesn't entirely get rid of the issue anyway: we may send update_fee and receive the concurrent update_add_htlc right after that. Adding more "hacks" to mitigate the issue will risk creating more issues (e.g. on update replay during reestablish) so I'd rather wait for lnd to fix this issue than to add dangerous temporary band-aids.

does eclair check that the incoming htlc does pay more that the additional fee-costs, or is there no attack vector in it ?

I'm not sure what you mean here. Anyway, our peer chose to send us update_add_htlc, so we only have two choices: wait for the HTLC to be signed before deciding whether to relay it or fail it, or force-close. If the HTLC passes the usual requirements, I don't see why we'd want to force-close?

ziggie1984 commented 2 months ago

An additional question I have in regards of the eclair behaviour:

We in LND keep always room for an additional htlc when determining the fee_update value, does eclair not do it, otherwise this htlc would not have happened. In my understanding its not really necessary to let the peer dip us below the reserve only because we wanted to make a feeupdate.

Implementing this feature right now, and I need to decide whether a fee_update from a peer would be accepted when deciding whether a peer is allowed to dip into its reserve, and I am undecided:

Following scenario: PeerA(channel opener) <=> PeerB

From the perspective of PeerA:

  1. Adds a local fee_update to the state signs the commitment for the peer (Balance of Peer A above Chan-Reserve)
  2. Receives an HTLC from PeerB concurrently which when taking the above new fee-msg into account dips PeerA below reserve.

=> this case seems ok, because we should only allow a peer dipping into the reserve when it's an incoming HTLC

From the perspective of PeerB however:

Peer B receives the Update_msg concurrently while adding a new outgoing HLTC, and for PeerB this looks like PeerA is dipping onto its reserve because of updating fees and that should be avoided?

ziggie1984 commented 2 months ago

I think the requirement for dipping below the remote reserve should be as follows:

The receiver of an HTLC should be allowed to dip into its reserve if the new state has only 1 new incoming htlc.

The Sender of an HTLC should be allowed to dip the remote peer into its reserve if the new state has only 1 new outgoing htlc.

Essentially we want to capture the Splicing case here or ?

Maybe we can require that only 1 htlc is allowed on this state when we would allow the dip into the reserve ?

However this would not include the case of unstucking channels ?

t-bast commented 2 months ago

We in LND keep always room for an additional htlc when determining the fee_update value, does eclair not do it, otherwise this htlc would not have happened.

No, we don't keep room for HTLCs when creating an update_fee: since the channel feerate is (unfortunately) a very critical security parameter, we use as much as possible.

From the perspective of PeerB however: Peer B receives the Update_msg concurrently while adding a new outgoing HLTC, and for PeerB this looks like PeerA is dipping > onto its reserve because of updating fees and that should be avoided?

But PeerB can detect that we're in the case where the HTLC was added concurrently to the fee update, by looking at the signing/revocation state. So it can only allow dipping into the reserve when it happens because of concurrent HTLCs.

The receiver of an HTLC should be allowed to dip into its reserve if the new state has only 1 new incoming htlc.

You can't restrict this to 1 HTLC, otherwise that means you need to always restrict the channel to one unsigned outgoing HTLC, which isn't reasonable. When sending an outgoing HTLC, you don't know that your peer may be sending a concurrent update_fee, so why would you restrict yourself to a single HTLC? What would be the heuristic for when you do this compared to when you don't?

ziggie1984 commented 2 months ago

You can't restrict this to 1 HTLC, otherwise that means you need to always restrict the channel to one unsigned outgoing HTLC, which isn't reasonable. When sending an outgoing HTLC, you don't know that your peer may be sending a concurrent update_fee, so why would you restrict yourself to a single HTLC? What would be the heuristic for when you do this compared to when you don't?

So basically the current requirement for eclair is the following:

Allow the dip into the reserve of the remote peer if the "NEW" state has at least 1 new outgoing HTLC Allow the dip into the reserve of the LOCAL reserve if the "NEW" state has at least 1 new incoming HTLC ?

Do you make sure that at least an output is remained for the peer which is dipped into its reserve ?

ziggie1984 commented 2 months ago

TBH I think we should specify this in the specification as a "Should" ?

t-bast commented 2 months ago

Eclair's behavior is currently:

With that behavior, we're able to avoid force-closing during this kind of race condition.

Do you make sure that at least an output is remained for the peer which is dipped into its reserve ?

No, we don't check that. We could, in theory it would be better, but it adds complexity and actually exploiting this would be hard, so I'm not sure it's worth the effort.

TBH I think we should specify this in the specification as a "Should" ?

Agreed, any clarification to the spec would be useful! Unfortunately this is actually quite hard to specify in a friendly way...I kind of gave up trying to specify some of the edge cases created by concurrent update_fee messages (even though they're the main source of bugs and force-close) and I'm rather impatiently waiting for TRUC transactions to be able to get rid of update_fee entirely instead (and meanwhile have implementations do the work to accommodate these edge cases).

ziggie1984 commented 1 month ago

What do you think about this specification @t-bast:

  1. Peer paying the Channel-Fees:

    • If the new state is a concurrent case and the peer is also receiving an incoming hltlc (or more than 1), the peer paying the fees is allowed unconditionally to be dipped into his reserve (concurrent case also includes the fee_update)
    • Even if it’s not a concurrent case the peer paying the fees is allowed to be dipped into its reserve if there is at least 1 incoming htlc from his perspective (maybe we should limit this to 1) => Rationale could help unstucking some channels which are currently stuck and also in case of a splice out.
  2. Peer NOT paying the fees

    • If it's an concurrent case, we allow the peer to dip into its reserve.
    • If it's not a concurrent case we only allow 1 outgoing HTLC added to the new state, this can dip the peer below its reserve

Rule after the peer paying the fees is already below its reserve: Once the Peer paying the fees is below its reserve we do NOT allow it to be dipped even further in the next state.

I think this covers all the case of yours and is more universal.

t-bast commented 1 month ago

If the new state is a concurrent case

I'm not sure what you mean by concurrent case, and I'm afraid this is hard to properly define. I don't see why adding this distinction would help anyway? Better keep it simple?

maybe we should limit this to 1

This is entirely a policy of the HTLC sender: they control how many HTLCs they send in a batch, and can decide to limit that whenever the other node is paying the fees and is close to their reserve. This should be a recommendation (SHOULD) in the spec IMO.

Once the Peer paying the fees is below its reserve we do NOT allow it to be dipped even further in the next state.

That doesn't work with splicing. That's why I prefer the more general rule: the peer paying the on-chain fees is allowed to dip into its reserve for incoming HTLCs. How many HTLCs are added is entirely decided by the other peer, so they can rate-limit it to limit their exposure.

ziggie1984 commented 1 month ago

That doesn't work with splicing. That's why I prefer the more general rule: the peer paying the on-chain fees is allowed to dip into its reserve for incoming HTLCs. How many HTLCs are added is entirely decided by the other peer, so they can rate-limit it to limit their exposure.

could you explain the case when we would even go further down the reserve ?

t-bast commented 1 month ago

If we create a splice transaction where the node who's not paying the commit fees adds a lot of funds, the node who's paying the commit fees may end up below the new reserve (since the channel capacity is increased, the reserve requirements increase as well).

In that case, we must allow sending HTLCs to the node who is paying the commit fees, which temporarily makes them dip further down their reserve (for the added weight of the HTLC output). If the HTLC is then failed, we just go back to the previous state and nothing has changed (in terms of reserve requirements). If the HTLC is fulfilled, then the node paying the commit fees increased their balance. This way we slowly make our way towards meeting the reserve requirements, exactly like what happens for a newly created channel where the reserve isn't initially met.

ziggie1984 commented 1 month ago

@t-bast thank you for the explanation:

Could you maybe take a look at https://github.com/lightningnetwork/lnd/pull/9100#issue-2524353879, and see whether LND and Eclair would be on the same page ?