ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

routing: cln picking channel that cannot route the htlc #7160

Closed by JssDWt 7 months ago

JssDWt commented 7 months ago

Issue and Steps to Reproduce

getinfo output

{
   "id": "02442d4249f9a93464aaf8cd8d522faa869356707b5f1537a8d6def2af50058c5b",
   "alias": "BreezR",
   "color": "ae34f5",
   "num_peers": 23,
   "num_pending_channels": 0,
   "num_active_channels": 31,
   "num_inactive_channels": 0,
   "version": "v24.02.1",
   "blockheight": 835345,
   "network": "bitcoin",
   "our_features": {
      "init": "08a0000a8a5961",
      "node": "88a0000a8a5961",
      "channel": "",
      "invoice": "02000002024100"
   }
}

logs

The logs below show an HTLC (htlc_id=514) coming in over channel A with peer 03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f. It is then forwarded to peer 0264a62a4307d701c04a46994ce5f5323b1ca28c80c66b73c631dbcb0990d6e835. CLN picks a 'better' channel for the forward than the one requested, but this better channel is unable to route the HTLC, so it should have picked yet another one.

2024-03-19T10:15:38.874Z DEBUG   03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f-chan#26: Looking up channel by scid=828831x1096x0 to forward htlc_id=514
2024-03-19T10:15:38.874Z DEBUG   03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f-chan#26: Chose a better channel than 828831x1096x0: 828159x1405x0
2024-03-19T10:15:38.874Z DEBUG   03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f-chan#26: Decided to forward htlc_id=514 over channel with scid=828159x1405x0 with peer 0264a62a4307d701c04a46994ce5f5323b1ca28c80c66b73c631dbcb0990d6e835
2024-03-19T10:15:38.874Z DEBUG   lightningd: Calling htlc_accepted hook of plugin keysend
2024-03-19T10:15:38.879Z DEBUG   03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f-channeld-chan#26: ... , awaiting 1122
2024-03-19T10:15:38.880Z DEBUG   lightningd: Plugin keysend returned from htlc_accepted hook call
2024-03-19T10:15:38.880Z DEBUG   03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f-channeld-chan#26: Got it!
2024-03-19T10:15:38.880Z DEBUG   03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f-channeld-chan#26: revoke_and_ack LOCAL: remote_per_commit = 02b8d5a93046e7bc62b60edd237f85be4bb593e8536dfd556d449732b314f48443, old_remote_per_commit = 03760796eb26ee33faf1354be83795f28d661db994f72f4e379f4f8783bf6ea15c
2024-03-19T10:15:38.880Z DEBUG   0264a62a4307d701c04a46994ce5f5323b1ca28c80c66b73c631dbcb0990d6e835-channeld-chan#10: LOCAL cannot afford htlc: would make balance 79891856msat below reserve 167772sat
2024-03-19T10:15:38.880Z DEBUG   0264a62a4307d701c04a46994ce5f5323b1ca28c80c66b73c631dbcb0990d6e835-channeld-chan#10: Adding HTLC 261 amount=88105279msat cltv=835624 gave CHANNEL_ERR_CHANNEL_CAPACITY_EXCEEDED

kingonly commented 7 months ago

From @cdecker: "Here's the selection code in v24.02: https://github.com/ElementsProject/lightning/blob/9b83b8b9674712bdb8af2d3c4e158eee1bf47b8f/lightningd/pay.c#L804-L855

This is the place where we can determine if a channel has sufficient capacity.

You could blacklist the one scid in that code if it's a one off, or you can add your selection criteria yourself."

cdecker commented 7 months ago

Ok, found an easy way to disable the switch of the channel: have different feerates on them, or different base-fees.

The reason is we only switch if they both match:

https://github.com/ElementsProject/lightning/blob/9b83b8b9674712bdb8af2d3c4e158eee1bf47b8f/lightningd/peer_htlcs.c#L665-L698

Notice in particular these lines:

https://github.com/ElementsProject/lightning/blob/9b83b8b9674712bdb8af2d3c4e158eee1bf47b8f/lightningd/peer_htlcs.c#L683-L687

As a result, best is still hint at the end if no other channel has the same fees.
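The rule described above can be sketched roughly as follows. This is a simplified Python paraphrase of the behaviour discussed in this thread, not the actual C code in peer_htlcs.c: the `Channel` class and its field names are hypothetical, the spendable values other than 225135 are made up for illustration, and other checks the real code may perform (channel state, HTLC minimum) are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    # Hypothetical, simplified view of a channel with the next peer.
    scid: str
    spendable_msat: int
    fee_base_msat: int
    fee_ppm: int


def best_channel(channels, hint):
    """Return the channel with the largest spendable balance among those
    that share the hint's fee policy; fall back to the hint itself."""
    best = hint
    for ch in channels:
        # Only a strictly larger spendable balance can beat the current best.
        if ch.spendable_msat <= best.spendable_msat:
            continue
        # Never switch to a channel whose fees differ from the requested one.
        if (ch.fee_base_msat, ch.fee_ppm) != (hint.fee_base_msat, hint.fee_ppm):
            continue
        best = ch
    return best


# Illustrative values: only 225135 msat comes from this thread.
hint = Channel("828831x1096x0", 1_000, 1000, 10)
same_fee = Channel("828159x1405x0", 225_135, 1000, 10)
other_fee = Channel("hypothetical-scid", 5_000_000_000, 1000, 50)

chosen = best_channel([hint, same_fee, other_fee], hint)
print(chosen.scid)
```

In this sketch the well-funded channel is never considered because its fee policy differs from the hint's, so the nearly depleted channel with the matching policy wins.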

cdecker commented 7 months ago

The error suggests that channel_amount_spendable() does not account correctly for reserve amounts, causing

https://github.com/ElementsProject/lightning/blob/9b83b8b9674712bdb8af2d3c4e158eee1bf47b8f/lightningd/peer_htlcs.c#L679-L680

to not discard a channel that is too small when it is tried. By the way, I assume this is a regular occurrence: because of concurrency, the balance can change between a channel being picked and it being used, which can always cause this.

JssDWt commented 7 months ago

> Ok, found an easy way to disable the switch of the channel: have different feerates on them, or different base-fees.

I don't think that would help with the issue, because quite a few failed forwards were actually using the channel in question initially.

> I assume by the way that this is a regular occurrence since due to the concurrency and possibility for balance changes between a channel being picked and it being used, which may always cause this.

It's definitely a regular occurrence. In the last 10 hours it affected approximately 93% of forwards.

JssDWt commented 7 months ago

Note that spendable_msat is 225135, so it's very close to being depleted.
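As a cross-check, this figure can be reproduced from the two channeld log lines above. The sketch assumes that the logged "would make balance" value is the local balance after subtracting the HTLC amount and that nothing else (such as commitment fees) is deducted; that reading is an assumption, not something confirmed in this thread.

```python
# Values taken from the channeld log lines for chan#10.
balance_after_htlc_msat = 79_891_856   # "would make balance 79891856msat"
htlc_amount_msat = 88_105_279          # "Adding HTLC 261 amount=88105279msat"
reserve_msat = 167_772 * 1000          # "below reserve 167772sat", in msat

# Assumption: balance before the HTLC is the logged balance plus the HTLC amount.
balance_before_msat = balance_after_htlc_msat + htlc_amount_msat

# Spendable is what remains above the reserve.
spendable_msat = balance_before_msat - reserve_msat
print(spendable_msat)  # 225135
```

Under that assumption the result equals the reported spendable_msat, which would mean the reserve is already reflected in the spendable amount for this channel.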

JssDWt commented 7 months ago

This issue is solved.

The issue was a difference in fee policies between two groups of channels with the same peer. Three depleted channels had a slightly different fee policy from the other channels, which did have enough liquidity. So if the sender chose channel A, which was depleted, the best_channel function would only consider other channels with the same fee policy, but those were depleted too. It would then pick the 'best' depleted one, which was consistently the same channel: the one with 225 sat to spend.

Aligning the fees among all channels with the same peer solved the problem. If the sender now selects a random channel, the htlc is always forwarded over the channel with the largest spendable balance, as is expected.
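For anyone wanting to check their own node for the same situation, a minimal sketch could look like the following. It works on plain dictionaries and assumes each entry carries `peer_id`, `fee_base_msat` and `fee_proportional_millionths` keys, as I believe the `listpeerchannels` output does; verify the field names against your CLN version. The sample data is entirely hypothetical.

```python
from collections import defaultdict


def peers_with_mixed_fees(channels):
    """Map each peer that has more than one distinct fee policy across
    its channels to the sorted list of those policies."""
    policies = defaultdict(set)
    for ch in channels:
        policy = (ch["fee_base_msat"], ch["fee_proportional_millionths"])
        policies[ch["peer_id"]].add(policy)
    return {peer: sorted(p) for peer, p in policies.items() if len(p) > 1}


# Hypothetical sample: peer "02aa" has two fee policies, peer "03bb" has one.
sample_channels = [
    {"peer_id": "02aa", "fee_base_msat": 1000, "fee_proportional_millionths": 10},
    {"peer_id": "02aa", "fee_base_msat": 1000, "fee_proportional_millionths": 50},
    {"peer_id": "03bb", "fee_base_msat": 0, "fee_proportional_millionths": 100},
    {"peer_id": "03bb", "fee_base_msat": 0, "fee_proportional_millionths": 100},
]

mixed = peers_with_mixed_fees(sample_channels)
print(mixed)
```

Any peer that shows up in the result has channels that will not be substituted for one another when forwarding, so a depleted channel in one fee group cannot fall back to a funded channel in the other.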