Closed wtogami closed 1 year ago
NOPE my interpretation was wrong. It happened again immediately in the next swap attempt. Maybe we never tested it with 192 on both sides before? Reverting 192 for now. Needs further investigation.
Interesting. This time I waited for 2+ confirmations then restarted peerswap on the local node prior to the 50% invoice expiration time. It recognized the onchain confirmed transaction and completed the swap immediately.
2023-06-18T02:44:35.942Z INFO plugin-peerswap: [Swap:SWAPID]: Recovering from State_SwapOutSender_AwaitTxConfirmation
2023-06-18T02:44:35.968Z DEBUG plugin-peerswap: Await confirmation for tx with id: ecb3b0b9cf97bfe591015ffc237dc842761242cd5c8b33428d38f72a54bf6c7f on swap SWAPID
2023-06-18T02:44:35.969Z DEBUG plugin-peerswap: [TxWatcher] SWAPID wait for confirmation, current state: confirmed_below
2023-06-18T02:44:36.040Z INFO plugin-peerswap: peerswap initialized
2023-06-18T02:44:36.099Z DEBUG plugin-peerswap: [FSM] event:id: SWAPID, Event_OnTxConfirmed on State_SwapOutSender_AwaitTxConfirmation
2023-06-18T02:44:42.370Z DEBUG plugin-peerswap: [Messenger] From: PEERID got msgtype: a45d payload: {\"swap_id\":\"SWAPID\",\"payreq\":\"BOLT11BALANCE\",\"tx_id\":\"ecb3b0b9cf97bfe591015ffc237dc842761242cd5c8b33428d38f72a54bf6c7f\",\"script_out\":1,\"blinding_key\":\"BLINDING_KEY\"}
2023-06-18T02:44:46.167Z INFO lightningd: Sending 1500000000msat over 1 hops to deliver 1500000000msat
2023-06-18T02:44:49.592Z DEBUG plugin-peerswap: [FSM] event:id: SWAPID, Event_ActionSucceeded on State_SwapOutSender_ValidateTxAndPayClaimInvoice
2023-06-18T02:44:50.122Z DEBUG plugin-peerswap: [FSM] event:id: SWAPID, Event_ActionSucceeded on State_SwapOutSender_ClaimSwap
I guess I got what is happening here.
We use gettxout
to check if the tx is already confirmed or if the tx passed the csv limit. We handled gettxout
as if the tx output was already spent which is true but the return is also empty if we don't have the tx in our mempool nor on the blockchain.
I have a patch ready that is less strict on empty gettxout
returns and only returns an ErrIsSpent
if we try to recover from a swap. On "normal" operations we ignore the empty return, just log it and keep watching for the tx .
Maybe we want to have a timestamp in the db when the swap started and be forgiving if the time t
is t < timestamp + x
and be strict if the time is t >= timestamp + x
This way we would not even have to tell the tx watcher if we are recovering and could also recover from a restart shortly after we broadcasted the opening tx.
We figured out what happened here. #192 intended to add more safety checks but we ended up adding a bug. This isn't a high priority so we will return to this a bit later.
TODO: testmempoolaccept
augmented gettxout
would make the txwatcher more precise.
tl;dr: This is a low priority.
We thought we had a problem in the way we currently check for spent UTXO. #192 attempted to make our check stricter but in the process broke it because the tester uses the same mempool for sender and receiver (does not match reality). Potential improvement for later would be to add a watch-only wallet to bitcoind so we can more precisely track the quantity of confirmations after the onchain tx disappears from the mempool. But ultimately we found the current method we use actually works well enough.
Closing for now
fsm config invalid
These logs below are from a node requesting a lbtc swap of 1 million sats from remote node. Everything but the Liquid txid is redacted.
Both nodes are running CLN v23.05.1 and PeerSwap #192.
Not sure what happened. The state machine is confused.
It looks like the requesting local node paid the swap-out BOLT11 invoice then the onchain transaction confirmed so fast that PeerSwap was unable to see it was unpaid prior to confirmation. This might be a rare race condition that confused the state machine?Local state 40 minutes later. It's stuck forever like this even beyond the expiration time.
The other node is stuck in state
State_SwapOutReceiver_AwaitClaimInvoicePayment
.Later I thought maybe if i restart the local node's peerswap it might recover and recognize the tx was confirmed and complete the swap.
I was too late. At least it successfully recovered.
Questions
If the onchain transaction confirms too soon it confuses the state machine, it gets stuck unable to see the confirmation transaction.