fix prepayment probing by sequencing subsequent payments

johncantrell97 commented 9 months ago

This replaces the htlcs: Vec<InterceptedHTLC> that is stored in the OutboundJITChannelState with a map that collects htlcs grouped by PaymentHash payments: HashMap<PaymentHash, Vec<InterceptedHTLC>>. It also introduces a queue of paymentspayment_queue: VecDeque`. We only store the PaymentHash in the queue and can lookup the htlcs that make up the payments in the map.

When we first transition from AwaitingPayment to PendingChannelOpen we create the queue and make sure the payment hash of the htlc that triggered the transition is in the front of the queue.

When we reach ChannelReady we can pop the first payment off the queue and proceed with our normal logic of forwarding all of the HTLCs that make up that payment.

At any point in the flow if extra HTLCs come in, they will get added to the map and queue in the order they arrived.

We introduce a new public function htlc_handling_failed where the user can pass through the failed_next_destination from HTLCHandlingFailed events. When we receive an HTLCDestination::InvalidForward we can lookup the OutboundJITChannel via the requested_forward_scid property.

We pop the next payment off the queue and attempt to forward it the same way we forward during ChannelReady.

It feels like there's still some issues here around subsequent payments that are different sizes. This assumes the initial payment and subsequent ones are all the same size and therefore the same opening fee is attempted to be extracted. This will work in the prepayment probe case we are solving for but not the general case.

To solve the general case I think we will need to store and/or recalculate both the opening fee and total amount to be forwarded based on the collected htlcs for the next payment.

I guess we also need to handle PaymentForwarded events and maintain a map from ChannelId to scid so that we can forward the rest of the payment queue once we have forwarded a payment successfully.

johncantrell97 commented 8 months ago

Note: Just realized there's a timing issue if we are in ChannelReady state and an HTLC fails (or succeeds for that matter) before the next payment/htlc arrives then there is nothing to trigger the subsequent forward.

We probably need some concept of or flag for "is_pending_payment" and immediately forward if there is no currently pending payment.

tnull commented 8 months ago

Note: Just realized there's a timing issue if we are in ChannelReady state and an HTLC fails (or succeeds for that matter) before the next payment/htlc arrives then there is nothing to trigger the subsequent forward.

We probably need some concept of or flag for "is_pending_payment" and immediately forward if there is no currently pending payment.

Not sure I understand the issue exactly. IIUC we should be fine if we default to forwarding the HTLC in htlc_intercepted, except when we're in AwaitingPayment state, in which case we'd follow the current "initial forward" path, collecting MPPs, etc?

wvanlint commented 8 months ago

Will be picking this up while John is away, I added a revert to maintain history and will squash/rebase the changes at a later point. I separated the PaymentQueue refactoring as a first commit maintaining the original behavior, and will add handling around htlc_handling_failed and payment_forwarded in following commits.

Not sure I understand the issue exactly. IIUC we should be fine if we default to forwarding the HTLC in htlc_intercepted, except when we're in AwaitingPayment state, in which case we'd follow the current "initial forward" path, collecting MPPs, etc?

I believe the issue is that we need to collect the JIT channel fee exactly one time with a successful payment. If the first collected payment fails (e.g. due to being a probe), we will need to deduct fees from a second payment which might arrive later in parts.

tnull commented 8 months ago

Will be picking this up while John is away, I added a revert to maintain history and will squash/rebase the changes at a later point. I separated the PaymentQueue refactoring as a first commit maintaining the original behavior, and will add handling around htlc_handling_failed and payment_forwarded in following commits.

Alright, sounds good. Will probably only get around next week to take a closer look/do a full pass. However, in the meantime this will need a rebase after #97 landed, i.e., {min,max}_payment_size_msat are now part of OpeningFeeParams and hence are per-payment limits. FWIW, we could just hold the full opening fee params rather than passing the individual fields around.

I believe the issue is that we need to collect the JIT channel fee exactly one time with a successful payment. If the first collected payment fails (e.g. due to being a probe), we will need to deduct fees from a second payment which might arrive later in parts.

Yes, that's the general issue here, I'm however still not fully getting why @johncantrell97 suggested we need a is_pending_payment flag and where/how it would be used.

johncantrell97 commented 8 months ago

I think the issue I'm getting at comes down to how the sequencer works. The idea is that there are only the following triggers that will attempt to forward the next payment from the queue:

1) transition to ChannelReady state when we first open the channel.

2) htlc handling failed, when a forward fails.

3) payment forwarded, when a forward succeeds.

If we are already in the ChannelReady state and have already successfully (or unsuccessfully) forwarded the initial payment and a subsequent payment comes in.

When it is intercepted it will be added to the queue and we will wait for one of the three triggers above to pop a payment off the queue but none of the three will happen.

There needs to be a way to know that there's no pending payment and therefore we don't need to queue the payment but we can forward it immediately.

wvanlint commented 8 months ago

There needs to be a way to know that there's no pending payment and therefore we don't need to queue the payment but we can forward it immediately.

In addition, I believe we also need to verify that the queued payment is at that point sufficient to pay the fee.

I made the following changes:

Fixed proportional fees for MPP in https://github.com/lightningdevkit/lightning-liquidity/pull/105.
Extracted HTLC queueing logic into PaymentQueue in a separate commit with the original behavior.
Generalized HTLC queueing logic:
- The HTLCIntercepted event can trigger the following actions:
- Opening of a channel if sufficient HTLCs are accumulated.
- Forwarding of all HTLCs belonging to the corresponding payment if sufficient HTLCs are accumulated (and a previous payment failed).
- Forwarding of the HTLC if the fee is already paid.
- The HTLCHandlingFailed event will fail a pending payment, and forward a different queued payment sufficient to pay the fee if possible.
- The PaymentForwarded event will forward any remaining queued payments, and disable future queueing. I was initially planning to use skimmed_fee_msat to verify that the skimmed fee is sufficient but that will only be available in LDK v0.0.112.

Adding additional unit test coverage to PaymentQueue and OutboundJITChannelState now.

tnull commented 8 months ago

Needs a rebase now that #105 landed.

wvanlint commented 8 months ago

Also, we previously removed additional tracking state where possible, and should consider ways to avoid re-introducing it if possible. e.g., the intercept-id-by-channel-id tracking seems only necessary since LDK doesn't provide the user_channel_id in the PaymentForwarded event? I think we should just add that upstream rather than adding additional HashMaps which always also introduce additional risk of getting stale.

I agree that this can probably be upstreamed in a subsequent version of LDK. In the meantime, I think it would be worthwhile to add this and revert it after the upgrade.

wvanlint commented 8 months ago

I'm still not convinced we need a separate PendingPayment state. Isn't it identical to PendingPaymentForward and any difference could be expressed via the returned ForwardAction being None/Some? Could you otherwise explain how they differ?

In both states, the channel is open and we have not received a successful confirmation of a payment forward yet. The main difference is that for PendingPaymentForward, we have a payment forward in progress, while for PendingPayment, any previous payments have failed.

If a different, complete payment comes in, we can't forward it while being in PendingPaymentForward. We don't know whether the pending forward will succeed or fail, so we don't know whether to skim a fee or not.

If such a payment comes in while being in PendingPayment, we know no other payments are pending and can skim the fee and forward the payment.

tnull commented 8 months ago

I'm still not convinced we need a separate PendingPayment state. Isn't it identical to PendingPaymentForward and any difference could be expressed via the returned ForwardAction being None/Some? Could you otherwise explain how they differ?

In both states, the channel is open and we have not received a successful confirmation of a payment forward yet. The main difference is that for PendingPaymentForward, we have a payment forward in progress, while for PendingPayment, any previous payments have failed.

If a different, complete payment comes in, we can't forward it while being in PendingPaymentForward. We don't know whether the pending forward will succeed or fail, so we don't know whether to skim a fee or not.

If such a payment comes in while being in PendingPayment, we know no other payments are pending and can skim the fee and forward the payment.

Aah, okay that makes sense. Do you mind adding comments to the variants in OutboundJITChannelState to describe their differences and any assumptions we make around them? That would help us going forward to be aware of the rationale of the chosen design without coming back here to investigate.

wvanlint commented 8 months ago

Added documentation around the different states on OutboundJITChannelState.

lightningdevkit / lightning-liquidity

fix prepayment probing by sequencing subsequent payments #102