Closed rustyrussell closed 1 year ago
** mutual_splice_locked **
WIRE_ANNOUNCEMENT_SIGNATURES <- l2
l1 -> WIRE_ANNOUNCEMENT_SIGNATURES
l1 <- WIRE_ANNOUNCEMENT_SIGNATURES [A]
WIRE_ANNOUNCEMENT_SIGNATURES -> l2
l1 -> WIRE_ANNOUNCEMENT_SIGNATURES
l1 <- WIRE_ANNOUNCEMENT_SIGNATURES [B]
WIRE_ANNOUNCEMENT_SIGNATURES <- l2
WIRE_ANNOUNCEMENT_SIGNATURES -> l2
l1 is getting an extra announcement signatures that was never sent by l2 in these logs. I'm guessing [B]
is actually the valid one and [A]
is the stale one. It almost looks like it's a queued up announcement_sigs
that got sent earlier on, maybe the bug is announcement sigs doing something weird with STFU
mode 🤔.
Can you reference the full logs? Would love to see when this extra WIRE_ANNOUNCEMENT_SIGNATURES
is sent 👀.
Ah I found something interesting! There is a random send_channel_update 0
and Trying commit
in the middle of the splice where there shouldn't be anything like that.
lightningd-2 2023-08-12T01:00:07.464Z INFO 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: Splice signing tx: 02000000028701da09f88447e4bad1ac3fae3cde9a8520a75c3a735e4b6edecf4dad79713e0000000000000000008701da09f88447e4bad1ac3fae3cde9a8520a75c3a735e4b6edecf4dad79713e0100000000fdffffff02e0c81000000000002200205b8cd3b914cf67cdd8fa6273c930353dd36476734fbd962102c2df53b90880cd4e7c0d00000000002251207836355fdc8a82dc4cb00a772c5554151d06384a4dd65e8d3f68ac08566b84be6c000000 [2]
lightningd-2 2023-08-12T01:00:07.465Z DEBUG hsmd: Client: Received message 29 from client [2]
lightningd-2 2023-08-12T01:00:07.466Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: Splice: we sign first [2]
lightningd-1 2023-08-12T01:00:07.475Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Received revoke_and_ack [1]
lightningd-1 2023-08-12T01:00:07.475Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: No commits outstanding after recv revoke_and_ack [1]
lightningd-1 2023-08-12T01:00:07.492Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Sending master 1022 [1]
lightningd-1 2023-08-12T01:00:07.496Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: got revoke 0: 0 changed [1]
lightningd-2 2023-08-12T01:00:07.507Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: peer_out WIRE_TX_SIGNATURES [2]
lightningd-1 2023-08-12T01:00:07.581Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: ... , awaiting 1122 [1]
lightningd-1 2023-08-12T01:00:07.586Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Got it! [1]
lightningd-1 2023-08-12T01:00:07.587Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: revoke_and_ack LOCAL: remote_per_commit = 02d2a7828d7be845ad8e5d1cdce493d3932f509528c309b59e8414077d16c26462, old_remote_per_commit = 020b2bb4220acc4114c47375e9b71e072f7cac127e1bb453d49595699958cfa9e4 [1]
lightningd-1 2023-08-12T01:00:07.701Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: user_finalized peer->stfu_wait_single_msg: 0 [1]
lightningd-1 2023-08-12T01:00:07.723Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: send_channel_update 0 [1] <---- What's this doing here?
lightningd-1 2023-08-12T01:00:07.730Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Trying commit [1] <---- What's this doing here?
lightningd-1 2023-08-12T01:00:07.731Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Can't send commit: nothing to send, feechange not wanted ({ SENT_ADD_ACK_REVOCATION:7500 }) blockheight not wanted ({ SENT_ADD_ACK_REVOCATION:0 }) [1]
lightningd-1 2023-08-12T01:00:08.106Z DEBUG hsmd: Client: Received message 7 from client [1]
lightningd-1 2023-08-12T01:00:08.113Z DEBUG lightningd: splice_signed input PSBT version 2 [1]
lightningd-1 2023-08-12T01:00:08.201Z INFO 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Splice signing tx: 02000000028701da09f88447e4bad1ac3fae3cde9a8520a75c3a735e4b6edecf4dad79713e0000000000000000008701da09f88447e4bad1ac3fae3cde9a8520a75c3a735e4b6edecf4dad79713e0100000000fdffffff02e0c81000000000002200205b8cd3b914cf67cdd8fa6273c930353dd36476734fbd962102c2df53b90880cd4e7c0d00000000002251207836355fdc8a82dc4cb00a772c5554151d06384a4dd65e8d3f68ac08566b84be6c000000 [1]
lightningd-1 2023-08-12T01:00:08.262Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-hsmd: Got WIRE_HSMD_SIGN_SPLICE_TX [1]
lightningd-1 2023-08-12T01:00:08.270Z DEBUG hsmd: Client: Received message 29 from client [1]
lightningd-1 2023-08-12T01:00:08.273Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: peer_in WIRE_TX_SIGNATURES [1]
lightningd-1 2023-08-12T01:00:08.285Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Left STFU mode. [1]
This is likely the reltimers randomly firing during the splice.
They still aren't triggering the peer_out WIRE_ANNOUNCEMENT_SIGNATURES
log message I'd expect to see, though this timing issue seems great to fix in any case.
This PR https://github.com/ElementsProject/lightning/pull/6554 disables the announcement and commitment timers during STFU makes a lot of sense to me in general. Not 100% convinced it will fix this issue without a little more details but I suspect it probably will.
It kind of fits this could happen when we're doing the events super fast in CI and have dev fast gossip enabled, more likely to hit timing issues like this.
Okay I solved the mystery of the phantom WIRE_ANNOUNCEMENT_SIGNATURES
message. It's just an artifact of the way the logs are created. Here are the relevant log lines with excess info trimmed and some notes added.
l2: DEBUG 026...518-channeld-chan#1: peer_out WIRE_ANNOUNCEMENT_SIGNATURES l2 -> 1 pending [A]
l1: DEBUG 022...d59-channeld-chan#1: peer_out WIRE_ANNOUNCEMENT_SIGNATURES l1 -> 1 pending [B]
l1: DEBUG 022...d59-channeld-chan#1: peer_in WIRE_ANNOUNCEMENT_SIGNATURES l1 <- 0 pending [C]
l2: DEBUG 026...518-channeld-chan#1: peer_in WIRE_ANNOUNCEMENT_SIGNATURES l2 <- 0 pending [D]
l1: DEBUG 022...d59-channeld-chan#1: peer_out WIRE_ANNOUNCEMENT_SIGNATURES l1 -> 1 pending [E]
l1: DEBUG 022...d59-channeld-chan#1: peer_in WIRE_ANNOUNCEMENT_SIGNATURES l1 <- -1 pending [F]
l2: DEBUG 026...518-channeld-chan#1: peer_out WIRE_ANNOUNCEMENT_SIGNATURES l2 -> 0 pending [G]
l2: DEBUG 026...518-channeld-chan#1: peer_in WIRE_ANNOUNCEMENT_SIGNATURES l2 <- 0 pending [H]
The G line actually happened before the F line, however l2's logger was running slightly behind l1, so it ended up on the line below.
lightningd-1 2023-08-12T01:00:10.641Z** DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: peer_in WIRE_ANNOUNCEMENT_SIGNATURES
lightningd-1 2023-08-12T01:00:10.642Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: billboard: Channel ready for use. Channel announced.
lightningd-1 2023-08-12T01:00:10.650Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: local_channel_update for unknown 109x1x0
lightningd-2 2023-08-12T01:00:10.654Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: Exchanging announcement signatures.
lightningd-2 2023-08-12T01:00:10.655Z** DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: peer_out WIRE_ANNOUNCEMENT_SIGNATURES
Interestingly even the timestamps are incorrect
lightningd-1 2023-08-12T01:00:10.641Z .. peer_in WIRE_ANNOUNCEMENT_SIGNATURES
...
lightningd-2 2023-08-12T01:00:10.655Z .. peer_out WIRE_ANNOUNCEMENT_SIGNATURES
With that mystery solved and the error being invalid local_channel_announcement
, I am convinced PR #6554 fixes this.
Closing this as PR https://github.com/ElementsProject/lightning/pull/6554 was merged
This happened under CI. gossipd complained that the channel_announcement we fed it bad a bad signature, and indeed, I checked l2 signed it with the old scid
103x1x0
and l1 signed it with109x1x0
. l1 is complaining, so it obviously handed the old signature to gossipd.I can see l1 receiving 3 ANNOUNCEMENT_SIGNATURES from l2 after the splice: is l2 sending the old txid somehow?
This may be a weird corner case caused by all 6 blocks coming in at once: we go from mined to announceable immediately.