ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

Large payments with unexpected status/amount #4753

Open shesek opened 3 years ago

shesek commented 3 years ago

When I try to send a payment that's too big for all of my channels, pay fails immediately with 'Ran out of routes to try', yet the payment continues to show up as 'pending' in listpays.

In addition, it appears that the amount returned from listpays in amount_msat is incorrect. Instead of the invoice amount, it returns a lower amount that matches the amount_sent_msat field.

Here's an example:

$ ln1 invoice 0.1btc z z | jq -r .bolt11
lnbcrt100m1psj7q2mpp5x9drvhcytz8xcd4yr65wvzcxdl8d440jv2ln5v23jkhv9atdls3sdqz0gxqyjw5qcqp2sp5s72h2gzlcq7a87fjll2vefa2x57l2nmvwfwhudl9sjdurqy55qqq9qgsqqqyssq9szlvp9y96ylzm47xwxf3mg5l4v4d0nq942pre2u74udj088ysvq5gmqmgqjl29jqh7ljrjca05vggpvjklylzxc7grat0sczl7pqngqrj5psz

$ ln2 pay lnbcrt100m<...>
# Fails immediately with 'Ran out of routes to try after 26 attempts: see `paystatus`'

$ ln2 listpays bolt11=lnbcrt100m<...>
{  
   "pays": [
      {  
         "bolt11": "lnbcrt100m1psj7q2mpp5x9drvhcytz8xcd4yr65wvzcxdl8d440jv2ln5v23jkhv9atdls3sdqz0gxqyjw5qcqp2sp5s72h2gzlcq7a87fjll2vefa2x57l2nmvwfwhudl9sjdurqy55qqq9qgsqqqyssq9szlvp9y96ylzm47xwxf3mg5l4v4d0nq942pre2u74udj088ysvq5gmqmgqjl29jqh7ljrjca05vggpvjklylzxc7grat0sczl7pqngqrj5psz",
         "destination": "027f2f3027a18e93502032f69596be33ebde2c975eb3c95dac86a0322b084b80d8",
         "payment_hash": "315a365f04588e6c36a41ea8e60b066fcedad5f262bf3a315195aec2f56dfc23",
         "status": "pending",
         "created_at": 1630470540,
         "amount_msat": "4245510750msat",
         "amount_sent_msat": "4245510750msat",
         "number_of_parts": 3
      }
   ]
}

cdecker commented 3 years ago

This is likely caused by the adaptive splitter splitting the payments into smaller parts if we can't find a route for the full amount. A quick test of this shows exactly this:

lightningd-1: 2021-09-06T12:21:19.785Z DEBUG   plugin-pay: cmd 56 partid 0: Added a channel hint for 103x1x1/1: enabled true, estimated capacity 76320000msat
lightningd-1: 2021-09-06T12:21:19.787Z DEBUG   plugin-pay: cmd 56 partid 0: After filtering routehints we're left with 0 usable hints
lightningd-1: 2021-09-06T12:21:19.787Z DEBUG   plugin-pay: cmd 56 partid 0: Not using a routehint
lightningd-1: 2021-09-06T12:21:19.787Z DEBUG   plugin-pay: cmd 56 partid 0: The destination is directly reachable including attempts without routehints
lightningd-1: 2021-09-06T12:21:19.790Z INFO    plugin-pay: cmd 56 partid 0: Initial limit on max HTLCs: 15, Destination 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59 has 1 channels, assuming 15 HTLCs per channel
lightningd-1: 2021-09-06T12:21:19.790Z INFO    plugin-pay: cmd 56 partid 1: Number of pre-split HTLCs (10) exceeds our HTLC budget (5), skipping pre-splitter
lightningd-1: 2021-09-06T12:21:19.790Z INFO    plugin-pay: cmd 56 partid 1: No path found
lightningd-1: 2021-09-06T12:21:19.790Z DEBUG   plugin-pay: cmd 56 partid 1: Adaptively split into 2 sub-payments: new partid 2 (46578927msat), new partid 3 (53421074msat)
lightningd-1: 2021-09-06T12:21:19.792Z DEBUG   plugin-pay: cmd 56 partid 2: Found a direct channel (103x1x1/1) with sufficient capacity, skipping route computation.
lightningd-1: 2021-09-06T12:21:19.792Z DEBUG   plugin-pay: cmd 56 partid 2: Created outgoing onion for route: 103x1x1
lightningd-2: 2021-09-06T12:21:19.796Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: peer_in WIRE_UPDATE_ADD_HTLC
lightningd-2: 2021-09-06T12:21:19.797Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: NEW:: HTLC REMOTE 0 = RCVD_ADD_HTLC/SENT_ADD_HTLC
lightningd-1: 2021-09-06T12:21:19.797Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: NEW:: HTLC LOCAL 0 = SENT_ADD_HTLC/RCVD_ADD_HTLC
lightningd-1: 2021-09-06T12:21:19.797Z DEBUG   plugin-pay: cmd 56 partid 3: Adding shadow_route hop over channel 103x1x1: adding 535msat in fees and 6 CLTV delta
lightningd-1: 2021-09-06T12:21:19.797Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Adding HTLC 0 amount=46578927msat cltv=114 gave CHANNEL_ERR_ADD_OK
lightningd-1: 2021-09-06T12:21:19.800Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: peer_out WIRE_UPDATE_ADD_HTLC
lightningd-1: 2021-09-06T12:21:19.800Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: REPLY WIRE_CHANNELD_OFFER_HTLC_REPLY with 0 fds
lightningd-1: 2021-09-06T12:21:19.802Z DEBUG   plugin-pay: cmd 56 partid 3: Adding shadow_route hop over channel 103x1x1: adding 535msat in fees and 6 CLTV delta
lightningd-1: 2021-09-06T12:21:19.802Z DEBUG   plugin-pay: cmd 56 partid 3: Not using a routehint
lightningd-1: 2021-09-06T12:21:19.802Z INFO    plugin-pay: cmd 56 partid 3: No path found
lightningd-1: 2021-09-06T12:21:19.802Z DEBUG   plugin-pay: cmd 56 partid 3: Adaptively split into 2 sub-payments: new partid 4 (28995782msat), new partid 5 (24425292msat)
lightningd-1: 2021-09-06T12:21:19.805Z DEBUG   plugin-pay: cmd 56 partid 4: Found a direct channel (103x1x1/1) with sufficient capacity, skipping route computation.
lightningd-1: 2021-09-06T12:21:19.805Z DEBUG   plugin-pay: cmd 56 partid 4: Created outgoing onion for route: 103x1x1
lightningd-1: 2021-09-06T12:21:19.807Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: We need 15916sat at feerate 11000 for 1 untrimmed htlcs: we have 100000000msat/100000000msat

As you can see, it correctly identifies that there is no route for the full amount and splits the payment in two, at which point it can get at least one part out. The second part hits the local limit, which causes further splits and slightly delays the final decision. Ultimately, in this case the first HTLC and a few tiny ones get through and fill our local channels, but the payment cannot complete.
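To make the effect concrete, here is a toy simulation of splitting against a single channel. It is only a sketch: it halves parts deterministically, whereas the real pay plugin picks its own split sizes, and the function name and the `min_part_msat` cutoff are made up for illustration.

```python
def simulate_adaptive_split(amount_msat, capacity_msat, min_part_msat):
    """Toy model (NOT the pay plugin's algorithm): halve any part that
    does not fit the remaining capacity until parts get too small.

    Returns (in_flight_parts, failed_msat).
    """
    in_flight = []           # parts that were sent and now occupy the channel
    failed_msat = 0          # amount that could not be sent at all
    pending = [amount_msat]  # parts still waiting for an attempt
    remaining = capacity_msat

    while pending:
        part = pending.pop()
        if part <= remaining:
            # The part fits, so it goes out and stays in flight.
            in_flight.append(part)
            remaining -= part
        elif part // 2 >= min_part_msat:
            # No room: split the part in two and retry both halves.
            half = part // 2
            pending.extend([part - half, half])
        else:
            # Too small to split further: give up on this part.
            failed_msat += part

    return in_flight, failed_msat


parts, failed = simulate_adaptive_split(100_000_000, 76_320_000, 10_000_000)
print(parts, failed)
```

With these numbers some parts stay in flight while the rest can never be sent, which is the situation that leaves the payment incomplete until the recipient times out.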

The recipient will eventually fail the payment with WIRE_MPP_TIMEOUT. The status is still 'pending' because not all parts have been returned yet, which requires the final MPP timeout from the recipient.

I'm not sure how to address this, since the same scenario can also happen further away from the node, and it is handled correctly as is (just a display issue here, imho). Of course we could special-case this with a preflight check that sums the spendable amounts across all our channels and errors out early, but that would add special handling for something that can occur naturally in the wider network as well.
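Such a preflight check could look roughly like the sketch below. It assumes the caller already has a list of channel dicts with `state` and `spendable_msat` fields (similar to what `listpeers` reports); the helper names are hypothetical, and fees and per-channel HTLC limits are ignored.

```python
def parse_msat(value):
    """Accept an int or a string such as '76320000msat'."""
    if isinstance(value, int):
        return value
    if value.endswith("msat"):
        value = value[:-len("msat")]
    return int(value)


def total_spendable_msat(channels):
    """Sum the spendable amount over channels that are in normal operation.

    Assumption: each channel dict has 'state' and 'spendable_msat' keys.
    """
    return sum(
        parse_msat(chan["spendable_msat"])
        for chan in channels
        if chan.get("state") == "CHANNELD_NORMAL"
    )


def preflight_ok(invoice_msat, channels):
    """Return False if the invoice cannot possibly fit in our local channels."""
    return invoice_msat <= total_spendable_msat(channels)


channels = [
    {"state": "CHANNELD_NORMAL", "spendable_msat": "76320000msat"},
    {"state": "ONCHAIN", "spendable_msat": "5000000msat"},
]
print(preflight_ok(100_000_000, channels))
```

This only catches the local case. A payment can still run out of capacity further along the route, which is why it would remain a special case rather than a general fix.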

shesek commented 3 years ago

I see. I'll try to find a way to work around this on my side. Maybe mark the payment as failed in the UI when pay returns with an error, and then ignore what listpays says?
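A minimal sketch of that UI-side workaround, assuming the frontend remembers whether its own `pay` call returned an error and has the matching `listpays` entry as a dict (the function name and the `pay_returned_error` flag are hypothetical):

```python
def display_status(pay_returned_error, listpays_entry):
    """Pick the status to show in the UI.

    If our own `pay` call already failed, show 'failed' even while
    listpays still reports 'pending' for the leftover in-flight parts.
    """
    status = listpays_entry.get("status")
    if pay_returned_error and status == "pending":
        return "failed"
    return status


entry = {"status": "pending", "number_of_parts": 3}
print(display_status(True, entry))
```

The override applies only to the 'pending' state, so a 'complete' entry is never downgraded by a stale error flag.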

What about the amount?

cdecker commented 3 years ago

The payment should definitely be marked as failed, because it has a fatal failure in the form of an MPP_TIMEOUT, and the rest are just HTLCs we haven't gotten back yet, but they'll fail as well.

The amounts of the in-flight parts will always sum to at most the amount we attempted to send. We could associate the invoice with the pay plugin in the datastore, which would allow us to make better decisions in the pay plugin (similar to what you'd be doing on the frontend).
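On the frontend side, that could be sketched as follows: keep the invoice amount yourself and treat the `amount_msat` from `listpays` as the sum of the parts actually sent. The function name is hypothetical, and the invoice amount is assumed to be known to the caller (for example, from decoding the invoice beforehand).

```python
def parse_msat(value):
    """Accept an int or a string such as '4245510750msat'."""
    if isinstance(value, int):
        return value
    if value.endswith("msat"):
        value = value[:-len("msat")]
    return int(value)


def summarize_amounts(invoice_msat, listpays_entry):
    """Show the invoice amount and how much of it the parts cover so far."""
    parts_msat = parse_msat(listpays_entry["amount_msat"])
    return {
        "invoice_msat": invoice_msat,
        "parts_msat": parts_msat,
        # Never negative, even if listpays reports the full amount.
        "shortfall_msat": max(invoice_msat - parts_msat, 0),
    }


# Values from the example above: a 0.1 BTC invoice, three parts in flight.
entry = {"amount_msat": "4245510750msat", "amount_sent_msat": "4245510750msat"}
summary = summarize_amounts(10_000_000_000, entry)
print(summary)
```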