lightningnetwork / lnd

Lightning Network Daemon ⚡️
MIT License
7.69k stars 2.08k forks source link

Lnd crashes after start #5141

Closed patrikmojzis closed 3 years ago

patrikmojzis commented 3 years ago

Background

Lnd v0.12.1 was working normally since the release but now I noticed that my channels were offline. I checked what's happening and I found out that something went wrong and lnd just keeps restarting. Tried to update to the latest commit but did not help.

I don't know what could cause it since I did not do any changes. I regularly check up on my node and everything was always fine.

Environment

{
    "version": "0.12.99-beta commit=v0.12.0-beta-257-g901c192e",
    "commit_hash": "901c192e9f7fff86c132d501123dccb4a308afca",
    "identity_pubkey": "02ce0da01c1e86dd80ef1888febda8905e0ac03290a48a77d5aa9ad574f072040a",
    "alias": "Tvoja Mama",
    "color": "#568ea3",
    "num_pending_channels": 0,
    "num_active_channels": 0,
    "num_inactive_channels": 16,
    "num_peers": 0,
    "block_height": 676121,
    "block_hash": "0000000000000000000b0c067d800c7a6a5265494570d36206e1909fd5dbdada",
    "best_header_timestamp": "1616603905",
    "synced_to_chain": true,
    "synced_to_graph": false,
    "testnet": false,
    "chains": [
        {
            "chain": "bitcoin",
            "network": "mainnet"
        }
    ],
    "uris": [
        "02ce0da01c1e86dd80ef1888febda8905e0ac03290a48a77d5aa9ad574f072040a@84.238.56.45:9735",
        "02ce0da01c1e86dd80ef1888febda8905e0ac03290a48a77d5aa9ad574f072040a@vddxcjf54o6lvzfogjvzcvkitxarkdfa33auh6dfjesjmfjvk35jcwqd.onion:9735"
    ],
    "features": {
        "0": {
            "name": "data-loss-protect",
            "is_required": true,
            "is_known": true
        },
        "5": {
            "name": "upfront-shutdown-script",
            "is_required": false,
            "is_known": true
        },
        "7": {
            "name": "gossip-queries",
            "is_required": false,
            "is_known": true
        },
        "9": {
            "name": "tlv-onion",
            "is_required": false,
            "is_known": true
        },
        "12": {
            "name": "static-remote-key",
            "is_required": true,
            "is_known": true
        },
        "14": {
            "name": "payment-addr",
            "is_required": true,
            "is_known": true
        },
        "17": {
            "name": "multi-path-payments",
            "is_required": false,
            "is_known": true
        }
    }
}

Steps to reproduce

Not clear.

Log

2021-03-24 16:35:22.569 [INF] BTCN: Server listening on [::]:9735
2021-03-24 16:35:22.619 [WRN] SRVR: No channel policy found for ChannelPoint(0000000000000000000000000000000000000000000000000000000000000000:0): 
2021-03-24 16:35:22.620 [WRN] SRVR: No channel policy found for ChannelPoint(0000000000000000000000000000000000000000000000000000000000000000:0): 
2021-03-24 16:35:22.625 [INF] SRVR: Initializing peer network bootstrappers!
2021-03-24 16:35:22.626 [INF] SRVR: Creating DNS peer bootstrapper with seeds: [[nodes.lightning.directory soa.nodes.lightning.directory] [lseed.bitcoinstats.com ]]
2021-03-24 16:35:22.642 [INF] WTWR: Starting watchtower
2021-03-24 16:35:22.642 [INF] WTWR: Creating watchtower hidden service
2021-03-24 16:35:22.643 [INF] DISC: Attempting to bootstrap with: BOLT-0010 DNS Seed: [[nodes.lightning.directory soa.nodes.lightning.directory] [lseed.bitcoinstats.com ]]
2021-03-24 16:35:28.379 [INF] WTWR: Starting lookout
2021-03-24 16:35:28.379 [INF] WTWR: Starting lookout from chain tip
2021-03-24 16:35:28.379 [INF] WTWR: Lookout started successfully
2021-03-24 16:35:28.379 [INF] WTWR: Starting watchtower server
2021-03-24 16:35:28.379 [INF] WTWR: Watchtower server started successfully
2021-03-24 16:35:28.379 [INF] WTWR: Watchtower started successfully
2021-03-24 16:35:28.380 [INF] NTFN: New block epoch subscription
2021-03-24 16:35:28.380 [INF] BTCN: Server listening on [::]:9911
2021-03-24 16:35:58.380 [ERR] WTWR: Unable to fetch block for (height=a5118, hash=000000000000000000069959f003d1d9385ef8af8e297c832a795050945c3f85): Couldn't retrieve block 000000000000000000069959f003d1d9385ef8af8e297c832a795050945c3f85 from network
2021-03-24 16:36:13.442 [INF] BTCN: Lost peer 35.163.36.246:8333 (outbound)
2021-03-24 16:36:13.443 [INF] BTCN: Syncing to block height 676120 from peer 143.178.111.120:8333
2021-03-24 16:36:13.443 [INF] BTCN: Fetching set of headers from tip (height=676120) from peer 143.178.111.120:8333
2021-03-24 16:36:14.024 [ERR] BTCN: Can't accept connection: unable to accept connection from 38.87.54.163:29812: EOF
2021-03-24 16:36:16.016 [INF] BTCN: Lost peer 85.163.53.75:8333 (outbound)
2021-03-24 16:36:19.240 [INF] WTCL: (legacy) Client stats: tasks(received=0 accepted=0 ineligible=0) sessions(acquired=0 exhausted=0)
2021-03-24 16:36:19.255 [INF] WTCL: (anchor) Client stats: tasks(received=0 accepted=0 ineligible=0) sessions(acquired=0 exhausted=0)
2021-03-24 16:36:19.878 [INF] WTWR: Accepted incoming peer 03990192089dde49f9b54da22118c8381c74939fe17ff674553e96b9bbd995e9a1@185.220.101.209:15820
2021-03-24 16:37:21.468 [INF] LTND: Version: 0.12.99-beta commit=v0.12.0-beta-257-g901c192e, build=production, logging=default, debuglevel=info
2021-03-24 16:37:21.469 [INF] LTND: Active chain: Bitcoin (network=mainnet)
2021-03-24 16:37:21.470 [INF] LTND: Opening the main database, this might take a few minutes...
2021-03-24 16:37:21.470 [INF] LTND: Opening bbolt database, sync_freelist=false, auto_compact=false
2021-03-24 16:37:22.795 [INF] CHDB: Checking for schema update: latest_version=22, db_version=22
2021-03-24 16:37:22.796 [INF] LTND: Database now open (time_to_open=1.326097237s)!
2021-03-24 16:37:24.996 [INF] BTCN: Loaded 1904 addresses from file '/home/pi/.lnd/data/chain/bitcoin/mainnet/peers.json'
2021-03-24 16:37:24.997 [INF] BTCN: Registering block subscription: id=1
2021-03-24 16:37:24.998 [INF] BTCN: Broadcaster now active
2021-03-24 16:37:24.998 [INF] RPCS: RPC server listening on 0.0.0.0:10009
2021-03-24 16:37:25.000 [INF] RPCS: gRPC proxy started at 127.0.0.1:8080
2021-03-24 16:37:25.000 [INF] LTND: Waiting for wallet encryption password. Use `lncli create` to create a wallet, `lncli unlock` to unlock an existing wallet, or `lncli changepassword` to change the password of an existing wallet and unlock it.
2021-03-24 16:37:25.004 [ERR] BTCN: Unable to parse IP network for peer t7jlaj6ggyx7s5vy.onion:8333: unsupported IP type
2021-03-24 16:37:25.080 [INF] BTCN: 1 addresses found from DNS seed dnsseed.bitcoin.dashjr.org
2021-03-24 16:37:25.081 [INF] BTCN: 1 addresses found from DNS seed x49.dnsseed.bluematt.me
2021-03-24 16:37:25.105 [INF] BTCN: 1 addresses found from DNS seed x49.seed.bitcoinstats.com
2021-03-24 16:37:25.105 [INF] BTCN: 1 addresses found from DNS seed seed.bitnodes.io
2021-03-24 16:37:25.105 [INF] BTCN: 1 addresses found from DNS seed x49.seed.bitcoin.jonasschnelli.ch
2021-03-24 16:37:25.105 [INF] BTCN: 1 addresses found from DNS seed x49.seed.bitcoin.sipa.be
2021-03-24 16:37:25.284 [INF] BTCN: Waiting for more block headers, then will start cfheaders sync from height 676120...
2021-03-24 16:37:25.284 [INF] BTCN: New valid peer 108.129.28.187:8333 (outbound) (/btcwire:0.5.0/btcd:0.20.1/)
2021-03-24 16:37:25.285 [INF] BTCN: Starting cfheaders sync from (block_height=676120, block_hash=000000000000000000069959f003d1d9385ef8af8e297c832a795050945c3f85) to (block_height=676120, block_hash=000000000000000000069959f003d1d9385ef8af8e297c832a795050945c3f85)
2021-03-24 16:37:25.285 [INF] BTCN: Starting cfheaders sync for filter_type=0
2021-03-24 16:37:25.287 [INF] BTCN: Syncing to block height 676120 from peer 108.129.28.187:8333
2021-03-24 16:37:25.287 [INF] BTCN: Fetching set of headers from tip (height=676120) from peer 108.129.28.187:8333
2021-03-24 16:37:25.308 [INF] BTCN: New valid peer 104.248.83.20:8333 (outbound) (/btcwire:0.5.0/btcd:0.20.1/)
2021-03-24 16:37:25.399 [INF] BTCN: Lost peer 104.248.83.20:8333 (outbound)
2021-03-24 16:37:25.399 [ERR] BTCN: Unable to parse IP network for peer 667rupueapafcnsr.onion:8333: unsupported IP type
...
...
...
2021-03-24 16:37:47.858 [ERR] DISC: Unable to rebroadcast stale announcements: unable to retrieve outgoing channels: channel from self node has no policy
Roasbeef commented 3 years ago

Seems related to https://github.com/lightningnetwork/lnd/issues/3629

patrikmojzis commented 3 years ago

Okay here's more info. I think if I could find a way to abandon the channel causing this issue then it could possibly work. (I did not edit the chan_point to zeros.)

lncli feereport
[lncli] rpc error: code = Unknown desc = no policy for outgoing channel 718214189422280704 
    {
            "channel_id": "718214189422280704",
            "chan_point": "0000000000000000000000000000000000000000000000000000000000000000:0",
            "last_update": 1603001982,
            "node1_pub": "02ce0da01c1e86dd80ef1888febda8905e0ac03290a48a77d5aa9ad574f072040a",
            "node2_pub": "03beddc8adbf7d56a7da15cdaf95d97b24d07088c3571b421c0e6f9d551a210342",
            "capacity": "0",
            "node1_policy": null,
            "node2_policy": {
                "time_lock_delta": 40,
                "min_htlc": "1",
                "fee_base_msat": "1000",
                "fee_rate_milli_msat": "2400",
                "disabled": false,
                "max_htlc_msat": "2107308000",
                "last_update": 1603001982
            }
        },
lncli updatechanpolicy --base_fee_msat 100 --fee_rate 0.000001 --time_lock_delta 144 --chan_point 0000000000000000000000000000000000000000000000000000000000000000:0
[lncli] rpc error: code = Unknown desc = channel from self node has no policy
lncli abandonchannel 0000000000000000000000000000000000000000000000000000000000000000 0
[lncli] rpc error: code = Unknown desc = unable to find closed channel summary

note added: abandonchannel is a dangerous command and can lead to loss of funds if used incorrectly - ask in the LND slack channel if unsure

Roasbeef commented 3 years ago

@parakeety just to confirm, you weren't the one that funded the channel, right? Also you're running a neutrino node right? Finally does the channel how as active or even prevent under listchannels?

If the answer to the first question is no, then you can abandon the channel by recompiling lnd with the dev build tag: make install tags=dev. From there you'll have installed a version of lnd that makes this command useable over the RPC or command line.

patrikmojzis commented 3 years ago
  1. I am not sure. I was restoring from backup though when I set up the node maybe that could've created it. 2. Yes, neutrino. 3. No, not present under listchannels.

I already tried building lnd with dev tag but did not succeed in abandoning the channel, got this error. [lncli] rpc error: code = Unknown desc = unable to find closed channel summary

Roasbeef commented 3 years ago

Interesting, if the channel isn't found under listchanenls, then it doesn't exist from the PoV of lnd at all. Since it errors out before removing the channel from the graph, that portion isn't being executed. I suspect this channel is actually closed as well, and will be properly removed from the graph once #5138 lands.

patrikmojzis commented 3 years ago

I saw that #5138 just landed but after I updated I still see that the issue persists.

wpaulino commented 3 years ago

Your node needs to be up for 30 seconds when running with neutrino to prune the graph. It'll be done again every hour after that.

patrikmojzis commented 3 years ago

I let it restart multiple times then waited few hours and tried again but still no success.

It seems to start pruning.

2021-04-22 22:34:26.265 [INF] CRTR: Initial zombie prune starting
2021-04-22 22:34:26.266 [INF] CRTR: Examining channel graph for zombie channels
2021-04-22 22:34:26.269 [INF] CRTR: Pruning 13 zombie channels

Sometimes it gets to "synced_to_chain": true, "synced_to_graph": true and makes some channels active and then it crashes but most of the time it is only "synced_to_chain": true.

These are the errors I am getting.

2021-04-22 22:49:18.726 [ERR] DISC: Unable to rebroadcast stale announcements: unable to retrieve outgoing channels: channel from self node has no policy
2021-04-22 22:49:48.315 [ERR] SRVR: Unable to connect to 02c38022478fc46244e229be5bfa3d3de94a009c64cd380bad5bfad8d8e2120917@192.168.1.108:9735: socks connect tcp 127.0.0.1:9050->192.168.1.108:9735: unknown error general SOCKS server failure
2021-04-22 22:49:48.668 [ERR] BTCN: utxo scan failed: Couldn't retrieve block 00000000000000000009e1854d60d0ce3a456c1dd59d6e9ad497ca762fc7d26a from network
2021-04-22 22:49:49.216 [ERR] WTWR: Unable to fetch block for (height=a6108, hash=00000000000000000000a2599df86fa2e5349fc9c52402041bd250f5bfba80ee): Couldn't retrieve block 00000000000000000000a2599df86fa2e5349fc9c52402041bd250f5bfba80ee from network
2021-04-22 22:49:57.637 [ERR] BTCN: Unable to parse IP network for peer t7jlaj6ggyx7s5vy.onion:8333: unsupported IP type
wpaulino commented 3 years ago

Ah I see, it's because we don't allow pruning your own channels from the graph. The following patch should get rid of the already spent channel from the graph, but it could also affect any of your other channels within the graph if they are considered prune-able (either edge hasn't been updated in 2 weeks, or both edges are disabled).

diff --git a/routing/router.go b/routing/router.go
index 243bde49..ccc49b1f 100644
--- a/routing/router.go
+++ b/routing/router.go
@@ -817,13 +817,6 @@ func (r *ChannelRouter) pruneZombieChans() error {
            return nil
        }

-       // We'll ensure that we don't attempt to prune our *own*
-       // channels from the graph, as in any case this should be
-       // re-advertised by the sub-system above us.
-       if isSelfChannelEdge(info) {
-           return nil
-       }
-
        // If *both* edges haven't been updated for a period of
        // chanExpiry, then we'll mark the channel itself as eligible
        // for graph pruning.
patrikmojzis commented 3 years ago

Thanks that worked.

It did not fix the crashing though. I checked the log once again and I noticed every time it crashes the log ends with something like: [INF] WTWR: Accepted incoming peer 02d28e82e3c94bea9e26fbebd68a8f4ef1a5fc1c244bfea68bdd4c136730c958d3@83.250.140.254:39242.

Tried to disable WT and it does not crash anymore.

wpaulino commented 3 years ago

Sounds like a panic. What does stdout/stderr show?

wpaulino commented 3 years ago

Closing due to inactivity.