Open zoedberg opened 4 months ago
FWIW, we saw similar flaky behavior in LDK Node's integration tests.
I might be wrong, but I suspect this could be a race with ChannelMonitor
persistence that might have been introduced as a byproduct of #2957, as we no longer wait for persistence to finish before issuing the payment events? (cc @G8XSU)
This is strange because before closing the channel, during the keysend, we wait for the payment to become successful (which happens during the PaymentSent and PaymentClaimed events), therefore we expect the channel to have no pending HTLCs and the node to broadcast the last commiment TX.
This is not a guarantee - PaymentSent
indicates that we received the preimage back for our payment (i.e. might happen before we even see a new commitment transaction) and PaymentClaimed
guarantee sthat the preimage is durably on disk (which might happen before we have a new commitment transaction). We try to get users these events once we're sure things are irrevocable, but that might happen on-chain rather than in-channel.
I see a call to provide_latest_holder_tx...
Hmm, this event order does somewhat surprise me, however. Are you sure you're seeing this on the same node's ChannelMonitor
? Can you share the logs (and, to confirm, this is a fork based on LDK 0.0.123)?
I might be wrong, but I suspect this could be a race with ChannelMonitor persistence that might have been introduced as a byproduct of https://github.com/lightningdevkit/rust-lightning/pull/2957, as we no longer wait for persistence to finish before issuing the payment events? (cc @G8XSU)
That shouldn't impact this. That change only impacts persists based on block-connection, but here we're (I think?) just talking about persists due to channel state changes, and specifically when the ChannelManager
generates events wrt those persists.
This is not a guarantee - PaymentSent indicates that we received the preimage back for our payment (i.e. might happen before we even see a new commitment transaction) and PaymentClaimed guarantee sthat the preimage is durably on disk (which might happen before we have a new commitment transaction). We try to get users these events once we're sure things are irrevocable, but that might happen on-chain rather than in-channel.
Got it, thanks for the explanation. So I guess in our tests we should keep a small sleep to be sure the force close operation will always broadcast the latest commitment TX.
Hmm, this event order does somewhat surprise me, however. Are you sure you're seeing this on the same node's ChannelMonitor? Can you share the logs (and, to confirm, this is a fork based on LDK 0.0.123)?
I confirm the fork is based on LDK 0.0.123, but nice catch, I wasn't checking which node was printing the change to self.holder_commitment
and after rerunning the test with the node information I can now say there is no strange behavior, the node correctly broadcasts the last TX assigned to self.holder_commitment
. Sorry for my incorrect reporting. I guess this issue can be closed.
I'm using RLN, which uses a forked version of rust-lightning, where I've added support for RGB. I'm running an integration test (https://github.com/RGB-Tools/rgb-lightning-node/blob/master/src/test/close_force_standard.rs) where I open a channel between 2 nodes, make 2 payments via keysend and force close the channel.
After the channel closure I sometimes see a behavior (non-deterministic, but quite frequent) where the
BumpTransactionEventHandler
tries to bump the fees for an HTLC TX (we are using anchor outputs). This is strange because before closing the channel, during the keysend, we wait for the payment to become successful (which happens during thePaymentSent
andPaymentClaimed
events), therefore we expect the channel to have no pending HTLCs and the node to broadcast the last commiment TX.I'm writing here first of all to understand if you think this is expected behavior or not. That is, waiting for the mentioned events should guarantee that the
OnchainTxHandler
will broadcast the last valid commitment TX?I've also collected some logs, hoping this could help better understand the issue:
provide_latest_holder_tx
that sets theholder_commitment
field to the latest commitment TX (the one we expect to be broadcasted, with no HTLCs)channel_manager.force_close_broadcasting_latest_txn
happens after theholder_commitment
is set to the expected commitment TXself.holder_commitment
inget_maybe_signed_holder_tx
I see the TX is not the expected one but it's the previous commitment TX (the one with 1 HTLC) insteadI don't think that our changes to support RGB could cause this issue. If you think this part is well-tested on your side I'll investigate more and maybe try to reproduce it on a vanilla node.