Open whitslack opened 3 weeks ago
Might be a red herring. The first peer was connected inbound to me via a Tor exit node. (I have no Tor service.) I disconnected the peer, and it has yet to reconnect, so maybe it went offline but the Tor exit node didn't. Still, I suspect something might not be plumbed up quite right if fundchannel_cancel
never notices that the peer isn't responding.
Nope. There is definitely a problem here. The peer cannot maintain a stable connection with me because I keep sending WIRE_ERROR
for the second channel and dropping the whole connection.
Here's what I see in the log repeatedly (every minute or so):
2024-11-02T15:54:31.710Z DEBUG 03…55-connectd: Connect IN
2024-11-02T15:54:31.711Z DEBUG 03…55-connectd: peer_out WIRE_INIT
2024-11-02T15:54:31.712Z DEBUG 03…55-connectd: peer_in WIRE_INIT
2024-11-02T15:54:31.712Z DEBUG 03…55-connectd: Handed peer, entering loop
2024-11-02T15:54:31.712Z DEBUG 03…55-chan#255536: Peer has reconnected, state CHANNELD_NORMAL: connecting subd
2024-11-02T15:54:31.734Z DEBUG 03…55-channeld-chan#255536: pid 14729, msgfd 188
2024-11-02T15:54:31.735Z DEBUG 03…55-chan#255536: Already have funding locked in
2024-11-02T15:54:31.735Z DEBUG 03…55-chan#255537: Peer has reconnected, state CHANNELD_AWAITING_LOCKIN: connecting subd
2024-11-02T15:54:31.735Z DEBUG 03…55-chan#255537: Telling connectd to send error 00…2e
2024-11-02T15:54:31.736Z DEBUG 03…55-connectd: peer_out WIRE_GOSSIP_TIMESTAMP_FILTER
2024-11-02T15:54:31.736Z DEBUG 03…55-gossipd: seeker: disabling gossip
2024-11-02T15:54:31.740Z DEBUG 03…55-connectd: peer_out WIRE_ERROR
2024-11-02T15:54:31.740Z DEBUG 03…55-connectd: discard_peer
2024-11-02T15:54:31.740Z DEBUG 03…55-lightningd: peer_disconnect_done
2024-11-02T15:54:31.740Z DEBUG 03…55-lightningd: Already reconnecting
2024-11-02T15:54:31.741Z DEBUG plugin-funder: Cleaning up inflights for peer id 03…55
2024-11-02T15:54:31.741Z DEBUG 03…55-channeld-chan#255536: option_static_remotekey = 1, option_anchor_outputs = 0 option_anchors_zero_fee_htlc_tx = 1
2024-11-02T15:54:31.741Z DEBUG 03…55-channeld-chan#255536: init LOCAL: remote_per_commit = 03…7e, old_remote_per_commit = 03…73 next_idx_local = 687 next_idx_remote = 688 revocations_received = 687 feerates { SENT_ADD_ACK_REVOCATION:1255 } range 253-12700 blockheights { SENT_ADD_ACK_REVOCATION:0 }, our current 868565
2024-11-02T15:54:31.742Z DEBUG 03…55-hsmd: Got WIRE_HSMD_GET_PER_COMMITMENT_POINT
2024-11-02T15:54:31.742Z DEBUG 03…55-hsmd: Got WIRE_HSMD_GET_PER_COMMITMENT_POINT
2024-11-02T15:54:31.742Z DEBUG 03…55-channeld-chan#255536: peer_out WIRE_CHANNEL_REESTABLISH
2024-11-02T15:54:31.743Z DEBUG 03…55-channeld-chan#255536: billboard: Sent reestablish, waiting for theirs
2024-11-02T15:54:36.737Z DEBUG 03…55-connectd: Subd did not close, forcing close
2024-11-02T15:54:36.737Z **BROKEN** 03…55-connectd: Peer did not close, forcing close
2024-11-02T15:54:36.737Z INFO 03…55-channeld-chan#255536: Peer connection lost
2024-11-02T15:54:36.737Z DEBUG 03…55-channeld-chan#255536: Status closed, but not exited. Killing
2024-11-02T15:54:36.737Z INFO 03…55-chan#255536: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (62208)
Note that there are two different channel IDs here. 255536 is the first channel, which is in CHANNELD_NORMAL
state. 255537 is the second channel, which is in CHANNELD_AWAITING_LOCKIN
state and would be indefinitely since its funding transaction got replaced (intentionally).
Why does connectd
disconnect the peer after sending WIRE_ERROR
when there is still a perfectly good channel running on the connection?
Epilogue: Interestingly, my long-hung fundchannel_cancel
finally returned when I gave up and called dev-forget-channel
. Unfortunately, I suspect this means that nobody ended up telling the peer to abandon the channel too.
{
"code": -1,
"message": "Channel structure was freed!"
}
Issue and Steps to Reproduce
CHANNELD_NORMAL
state with a peer.fundchannel_start
.fundchannel_complete
.fundchannel_start
.fundchannel_complete
.fundchannel_cancel
.fundchannel_cancel
never returns.getinfo
outputThis is with CLN 24.08.2.