ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

Connectivity issues between CLN and LND #5197

Open kristapsk opened 2 years ago

kristapsk commented 2 years ago

Issue and Steps to Reproduce

I have two Lightning mainnet nodes: one runs c-lightning v0.10.2, the other LND v0.14.2. A channel was opened from the CLN side, via Tor, to the LND node. It worked fine for a week or so, and I was able to make a successful payment from the CLN node to the LND node. Today I noticed it is in a strange state: lightning-cli getinfo shows it as active, while LND shows it as inactive. Restarting both nodes doesn't help much. lightning-cli ping [node-id] returns "code": -1, "message": "Peer bad state". The log has messages like this:

$ grep [node-id] /var/log/lightningd.log
...
2022-04-18T23:21:49.713Z DEBUG   plugin-funder: Cleaning up inflights for peer id [node-id]
2022-04-19T00:00:50.624Z INFO    [node-id]-channeld-chan#426: Peer connection lost
2022-04-19T00:00:50.624Z UNUSUAL [node-id]-channeld-chan#426: Status closed, but waitpid 17586 says No child processes
2022-04-19T00:00:50.624Z INFO    [node-id]-chan#426: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (-1)
2022-04-19T00:00:50.624Z DEBUG   plugin-funder: Cleaning up inflights for peer id [node-id]
2022-04-19T00:37:12.560Z UNUSUAL [node-id]-channeld-chan#426: Status closed, but waitpid 5235 says No child processes
2022-04-19T00:37:12.560Z INFO    [node-id]-chan#426: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (-1)
2022-04-19T00:37:12.561Z DEBUG   plugin-funder: Cleaning up inflights for peer id [node-id]
2022-04-19T00:47:18.329Z INFO    [node-id]-channeld-chan#426: Peer connection lost
2022-04-19T00:47:18.329Z UNUSUAL [node-id]-channeld-chan#426: Status closed, but waitpid 7331 says No child processes
2022-04-19T00:47:18.329Z INFO    [node-id]-chan#426: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (-1)
2022-04-19T00:47:18.330Z DEBUG   plugin-funder: Cleaning up inflights for peer id [node-id]
2022-04-19T02:21:19.033Z UNUSUAL [node-id]-channeld-chan#426: Status closed, but waitpid 10901 says No child processes
2022-04-19T02:21:19.033Z INFO    [node-id]-chan#426: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (-1)
2022-04-19T02:21:19.033Z DEBUG   plugin-funder: Cleaning up inflights for peer id [node-id]
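
To get a feel for how often the channel drops, the timestamps of the "Peer transient failure" lines can be pulled out and the gaps between them computed. The following is only a rough sketch: the helper names (failure_times, gaps_in_minutes) are made up for illustration, and the sample lines are copied from the log excerpt above rather than read from /var/log/lightningd.log.

```python
from datetime import datetime

# Sample lines copied from the log excerpt above (node id left redacted).
LOG_LINES = [
    "2022-04-19T00:00:50.624Z INFO    [node-id]-channeld-chan#426: Peer connection lost",
    "2022-04-19T00:00:50.624Z INFO    [node-id]-chan#426: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (-1)",
    "2022-04-19T00:37:12.560Z INFO    [node-id]-chan#426: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (-1)",
    "2022-04-19T00:47:18.329Z INFO    [node-id]-chan#426: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (-1)",
    "2022-04-19T02:21:19.033Z INFO    [node-id]-chan#426: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (-1)",
]


def failure_times(lines):
    """Return the timestamps of all 'Peer transient failure' log lines."""
    times = []
    for line in lines:
        if "Peer transient failure" not in line:
            continue
        # The timestamp is the first whitespace-delimited field of each line.
        stamp = line.split(maxsplit=1)[0]
        times.append(datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ"))
    return times


def gaps_in_minutes(times):
    """Return the gap between consecutive timestamps, in minutes (1 decimal)."""
    return [round((b - a).total_seconds() / 60, 1) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print(gaps_in_minutes(failure_times(LOG_LINES)))  # [36.4, 10.1, 94.0]
```

For this excerpt the gaps come out irregular (roughly 36, 10 and 94 minutes), so the drops do not look like a fixed timer expiring.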

Any hints how to debug this?

getinfo output

{
   "id": "***",
   "alias": "***",
   "color": "***",
   "num_peers": 7,
   "num_pending_channels": 0,
   "num_active_channels": 7,
   "num_inactive_channels": 0,
   "address": [
      {
         "type": "ipv4",
         "address": "nn.nn.nn.nn",
         "port": 9735
      },
      {
         "type": "torv3",
         "address": "blablabla.onion",
         "port": 9735
      }
   ],
   "binding": [],
   "version": "0.10.2-gentoo-r0",
   "blockheight": 732541,
   "network": "bitcoin",
   "msatoshi_fees_collected": NNN,
   "fees_collected_msat": "NNNmsat",
   "lightning-dir": "/var/lib/lightning/bitcoin"
}

rustyrussell commented 2 years ago

Seems like it's disconnecting, which happens with Tor. Not all the time though, judging from those timestamps. It will try to reconnect automatically when this happens. You can use 'listpeers' to see whether it's connected at any point in time.
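
If this needs checking repeatedly, the JSON printed by lightning-cli listpeers can be inspected with a few lines of Python. This is a minimal sketch, not an official tool: it assumes the output has a top-level "peers" array whose entries carry "id", "connected" and a nested "channels" list with a "state" field, which is how I recall the v0.10.x output looking (worth verifying against your own node). The SAMPLE data and the peer_status helper are made up for illustration.

```python
import json

# Hypothetical listpeers output, trimmed to the fields used below.
# Real output contains many more fields and full node ids.
SAMPLE = """
{
  "peers": [
    {"id": "peer-a", "connected": false,
     "channels": [{"state": "CHANNELD_NORMAL"}]},
    {"id": "peer-b", "connected": true,
     "channels": [{"state": "CHANNELD_NORMAL"}]}
  ]
}
"""


def peer_status(listpeers_json, node_id):
    """Return connection flag and channel states for node_id, or None if absent."""
    peers = json.loads(listpeers_json).get("peers", [])
    for peer in peers:
        if peer.get("id") == node_id:
            states = [chan.get("state") for chan in peer.get("channels", [])]
            return {
                "connected": peer.get("connected", False),
                "channel_states": states,
            }
    return None


if __name__ == "__main__":
    # A channel can sit in CHANNELD_NORMAL while the peer is disconnected.
    print(peer_status(SAMPLE, "peer-a"))
```

In practice the string would come from running lightning-cli listpeers (for example via subprocess) instead of SAMPLE. A peer that shows "connected": false with a channel still in CHANNELD_NORMAL would match the mismatch described above, where one side considers the channel active and the other does not.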

kristapsk commented 2 years ago

Right now it has connected and the channel is active. But now other channels have the same issue on the LND node side; all of them were opened from the other side, via Tor. This now really looks like a Tor connectivity issue to me. The LND log has entries like this from some hours ago:

2022-04-19 05:22:22.315 [INF] PEER: unable to read message from ***@127.0.0.1:37048: EOF
2022-04-19 05:22:22.315 [INF] PEER: disconnecting ***@127.0.0.1:37048, reason: read handler closed

That node runs an older Tor version (0.4.2.7); I will try upgrading and see whether that solves the problem.