lightningnetwork / lnd

Lightning Network Daemon ⚡️

Sats stuck in Limbo - "maturity_height": 0 and "blocks_til_maturity": 0 #8328

Open jairunet opened 10 months ago

jairunet commented 10 months ago

Background

It's been 4 months since this channel was force-closed, and I have no clue what to do next.

Your environment

Steps to reproduce

I am sorry, but I am not sure; I am seeing this after what I believe was a force-closed channel.

Expected behaviour

The sats corresponding to my side of the lightning channel should have been returned to the lightning node's on-chain wallet.

Actual behaviour

The channel details show:

{
            "channel": {
                "remote_node_pub": "0364913d18a19c671bb36dd04d6ad5be0fe8f2894314c36a9db3f03c2d414907e1",
                "channel_point": "a33b675ecfa23cfe02ed1522e63c89959ad4df509c61aea8f886a8bb6b5f3d8c:14",
                "capacity": "2000000",
                "local_balance": "79278",
                "remote_balance": "1712079",
                "local_chan_reserve_sat": "0",
                "remote_chan_reserve_sat": "0",
                "initiator": "INITIATOR_REMOTE",
                "commitment_type": "ANCHORS",
                "num_forwarding_packages": "0",
                "chan_status_flags": "",
                "private": false,
                "memo": ""
            },
            "closing_txid": "e541bb7b18e6170f694f050506a9f9a54b705e167142cdddb3bafd60d10dc947",
            "limbo_balance": "3388",
            "maturity_height": 0,
            "blocks_til_maturity": 0,
            "recovered_balance": "330",
            "pending_htlcs": [
                {
                    "incoming": false,
                    "amount": "3388",
                    "outpoint": "8b2149d31c7a014f09eedb55e1b28bd2c291a241c22c77ecd65b947b21b6f3f5:0",
                    "maturity_height": 820740,
                    "blocks_til_maturity": -3182,
                    "stage": 2
                }
            ],
            "anchor": "RECOVERED"
        },
babikpatient6 commented 10 months ago

Are you sure you are sharing the correct channel? This one force-closed only 24 days ago, and it's not a bug. lnd doesn't sweep when it isn't economical, which your 3388 sats in limbo currently are not. You can force a sweep, but until fees go much lower, you'll pay more than you recover.

I think you might try to hold the sweeper long enough to make the sweep economical, i.e. by waiting for another force closure (or several) to pile up in your pendingsweeps before sweeping them all in one batch tx. The longer you wait, the more inputs your sweeper gets, making some of these dust UTXOs economically recoverable again. Someone smarter please confirm my logic is correct. The 14d/336h below is just an example; you can use more or less. As long as my node was online during the force closure, I have never had any adverse effects (i.e. an HTLC not being failed back to my side in time) from increasing this value from the default 30s. Feel free to add to lnd.conf: sweeper.batchwindowduration=336h
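
In lnd.conf that would look roughly like this (just a sketch; the 336h value is illustrative, not a recommendation):

    ; hold non-urgent sweeps for up to 14 days so several force-close
    ; outputs can be batched into one sweep transaction (default: 30s)
    sweeper.batchwindowduration=336h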

jairunet commented 10 months ago

Hello @babikpatient6, the information I was able to obtain from the command lncli pendingchannels was what I pasted above, plus this other one:

{
            "channel": {
                "remote_node_pub": "039243cc1d359b3b93d5c7ff70c933cf14ec2b940c7c0842cfa5616f817a571b24",
                "channel_point": "2915a363f7d55ee003a0e99c68e6a406c4eb7ce179c831152762036c68904c99:1",
                "capacity": "5000000",
                "local_balance": "3986366",
                "remote_balance": "876644",
                "local_chan_reserve_sat": "0",
                "remote_chan_reserve_sat": "0",
                "initiator": "INITIATOR_LOCAL",
                "commitment_type": "ANCHORS",
                "num_forwarding_packages": "0",
                "chan_status_flags": "",
                "private": false,
                "memo": ""
            },
            "closing_txid": "0e4f20929fc686ce0041fdcce9cefe369284b27bc38e67e72220ca219573b42f",
            "limbo_balance": "4106702",
            "maturity_height": 824400,
            "blocks_til_maturity": 475,
            "recovered_balance": "0",
            "pending_htlcs": [
                {
                    "incoming": false,
                    "amount": "120006",
                    "outpoint": "9780db7e58128998053a6c550ce7128d24e8c271b815e9cf27d5f0d1d6e602cb:0",
                    "maturity_height": 824400,
                    "blocks_til_maturity": 475,
                    "stage": 2
                }
            ],
            "anchor": "LIMBO"
        }

This above one, I believe, will be cleared once "blocks_til_maturity" reaches zero?

I appreciate the information you shared
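
In case it helps anyone else, this is roughly how I pull just the force-close entries out of that output (a sketch, assuming jq is installed and the field is named pending_force_closing_channels as in my lncli output):

    lncli pendingchannels | jq '.pending_force_closing_channels[]'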

jairunet commented 10 months ago

You can force a sweep, but until fees go much lower, you'll pay more than you recover. [...] Feel free to add to lnd.conf: sweeper.batchwindowduration=336h

I will look into the sweeper setting you shared, but it seems like waiting is the safest choice due to the high fees in the mempool.

babikpatient6 commented 10 months ago

This above one, I believe, will be cleared once "blocks_til_maturity" reaches zero?

Yes

I will look into the sweeper setting you shared, but it seems like waiting is the safest choice due to the high fees in the mempool.

I'm not sure if lnd will ever attempt to sweep a dust UTXO, and even if it does, the 6-conf fee target would have to be around 15 sat/vB to break even. Since you already have another force closure pending, I wouldn't hesitate to add the sweeper line to lnd.conf and reboot before its block maturity goes to 0 on the second FC. Although two extra inputs are probably still not enough to make this dust sweep economical at the moment, in either case it will save you some sats. You can revert it anytime (remove the mentioned line and reboot lnd) to trigger the sweeper with the matured inputs whenever you want (i.e. when fees are at a local low).
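
Rough math behind the 15 sat/vB figure (back-of-the-envelope only; the ~225 vB size is my assumption for a one-input, one-output second-stage sweep): 3388 sat / ~225 vB ≈ 15 sat/vB, so above that fee rate the sweep costs more than it recovers. You can check what the sweeper is currently holding with:

    lncli wallet pendingsweeps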

ziggie1984 commented 10 months ago

lnd will keep the channel in pending until all outputs are resolved. As @babikpatient6 mentioned, only positively yielding UTXOs are swept; I would not bother about it and just leave it there. Don't increase sweeper.batchwindowduration too much, because this can mess with the security assumptions of lightning. Don't set it to more than the block time (10 min).
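
If you do adjust it, I mean something on the order of a single block interval at most, e.g. (sketch):

    sweeper.batchwindowduration=10m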

lukeroberts commented 10 months ago

I have a similar situation here which I think applies to what you said, but I don't quite understand why the pending HTLC outpoints seem to be incorrect? https://lightningcommunity.slack.com/archives/C6BKD3RKR/p1704150462629329

Here are my channels in question:

{
    "channel": {
        "remote_node_pub": "039a0e96ef4f19bf1dfc438941597f32ab82fb8414297e7d42ff8e14d106de08b3",
        "channel_point": "f82fb1c54b7d604bcdf5467b19e505c21c9cc020fa58a61d378c070aaba52391:1",
        "capacity": "1150000",
        "local_balance": "526906",
        "remote_balance": "589991",
        "local_chan_reserve_sat": "0",
        "remote_chan_reserve_sat": "0",
        "initiator": "INITIATOR_LOCAL",
        "commitment_type": "ANCHORS",
        "num_forwarding_packages": "0",
        "chan_status_flags": "",
        "private": false,
        "memo": ""
    },
    "closing_txid": "7b65c242a6700dbd5a9aff3ea18cb7a248c0284751cfda4e834e743ea30e8be3",
    "limbo_balance": "658",
    "maturity_height": 0,
    "blocks_til_maturity": 0,
    "recovered_balance": "330",
    "pending_htlcs": [
        {
            "incoming": false,
            "amount": "658",
            "outpoint": "b95f5ee91545df3f8cfcc45141fc6c4614f2f68c3ff65afbdb236834c3495a74:0",
            "maturity_height": 815108,
            "blocks_til_maturity": -8812,
            "stage": 2
        }
    ],
    "anchor": "RECOVERED"
},
{
    "channel": {
        "remote_node_pub": "037172d2110c4148d6ed0c2790ec8be948458022425dced6794fede834be92af36",
        "channel_point": "2fe0795bca45f0da0693df17804701a5430f48880b9f8482e50edecf0386f6b1:0",
        "capacity": "4000000",
        "local_balance": "1119287",
        "remote_balance": "2112847",
        "local_chan_reserve_sat": "0",
        "remote_chan_reserve_sat": "0",
        "initiator": "INITIATOR_REMOTE",
        "commitment_type": "ANCHORS",
        "num_forwarding_packages": "0",
        "chan_status_flags": "",
        "private": false,
        "memo": ""
    },
    "closing_txid": "520f896bd619bb6cef67ff30d480b3ad2e1c58905731aa35f963d014f1eefa4d",
    "limbo_balance": "991",
    "maturity_height": 0,
    "blocks_til_maturity": 0,
    "recovered_balance": "330",
    "pending_htlcs": [
        {
            "incoming": false,
            "amount": "991",
            "outpoint": "f7040e669e1bf85c61a9fde1cf0c80aa98345ddc2d6aa50701ad2ff18734992a:0",
            "maturity_height": 816949,
            "blocks_til_maturity": -6971,
            "stage": 2
        }
    ],
    "anchor": "RECOVERED"
}

Am I not correct in assuming that the 'outpoint' should point to a valid tx at that maturity height? @jairunet's first post has a similar pending htlc situation

ziggie1984 commented 10 months ago

Am I not correct in assuming that the 'outpoint' should point to a valid tx at that maturity height? @jairunet's first post has a similar pending htlc situation

Ahh correct, I just saw that we do not update this outpoint for anchor channels during restarts. For anchor channels the outpoint changes during operation because the sweep of the outpoint has no fixed txid; it can be changed during sweeping (zero-fee HTLCs). That's why it's not constant, and it needs to be persisted across restarts as well. It's a small fix, so probably lnd 0.18.

But apart from the wrong outpoint, lnd behaves normally, so it's just a representation issue.

babikpatient6 commented 10 months ago

Don't increase sweeper.batchwindowduration too much, because this can mess with the security assumptions of lightning. Don't set it to more than the block time (10 min).

I have been using a large sweeper.batchwindowduration value and timing sweeps with the mempool for 1.5 years with no adverse effects, except for one instance when my node was offline during a remote force closure with an outgoing HTLC. After booting up, my node wouldn't fail that HTLC to stage 2 until I temporarily lowered the value back, but I have never observed this when my node was online during the force closure. In 2023 this setting saved me hundreds of dollars on sweeping.

I would be interested to hear what risks other than those below I am missing about setting sweeper.batchwindowduration long:

1) More difficult funds recovery if the node crashes and goes to SCB recovery (recovery relies on chantools sweeptimelockmanual, but a crash can be prevented with a UPS, RAID, etc.)
2) The necessity to intervene and lower the value temporarily in edge cases where an outgoing HTLC isn't progressing from stage 1 to stage 2

jairunet commented 10 months ago

All, I appreciate your suggestions and the discussion. For now I will just wait until the maturity blocks are reached for one of the force-closed channels, and for the other one, which has already reached maturity but is still showing as pending, I guess I will need to wait patiently and see if one day mempool fees drop low enough for that force close to clear.

ziggie1984 commented 10 months ago

1) More difficult funds recovery if the node crashes and goes to SCB recovery (recovery relies on chantools sweeptimelockmanual, but a crash can be prevented with a UPS, RAID, etc.) 2) The necessity to intervene and lower the value temporarily in edge cases where an outgoing HTLC isn't progressing from stage 1 to stage 2

So these are all tradeoffs, but having a large batchwindowduration might lead to a situation where a peer could attack you if they find out that your sweep window is very high. Just one example would be:

A => YOUR NODE => C

A and C are controlled by the attacker. Now A sends an HTLC to C, C resolves the HTLC with your node, but A does not cooperate to resolve the HTLC on your node. Now lnd will go to chain 13 blocks before the expiry of the HTLC, claiming it via the preimage path. If you do not sweep this output with the htlc-success transaction within ~13 * 10 min ≈ 130 minutes, your peer can sweep it as well and therefore attack you, because you have already sent the funds to C and A is now claiming them back because your batch window was too high. That's why I do not recommend this kind of setting to users who do not understand the risk; if you are an advanced user monitoring your node professionally, you can definitely tweak these settings to save on-chain fees ;)
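
To put rough numbers on it (illustrative arithmetic only): ~13 blocks of headroom at ~10 min per block is about 130 minutes to get the htlc-success sweep confirmed, whereas a 336h batch window is 20,160 minutes, i.e. roughly 155 times longer than that deadline.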

babikpatient6 commented 9 months ago

Now lnd will go to chain 13 blocks before the expiry of the HTLC, claiming it via the preimage path. If you do not sweep this output with the htlc-success transaction within ~13 * 10 min ≈ 130 minutes, your peer can sweep it as well and therefore attack you, because you have already sent the funds to C and A is now claiming them back because your batch window was too high.

Sounds like an extremely unlikely scenario IRL, although worthy of attention. Am I understanding this right, that the "htlc-success tx" currently doesn't bypass the batch sweeper in the same way the "htlc-fail tx" normally seems to do? Because as long as my node was online while the force closure happened, having a high batch window has never prevented on-chain settlement of failed/expired HTLCs so far. Why would it be different for htlc-success? If that's indeed the case, we should definitely have such a bypass mechanism as suggested here, because batching can practically save up to ~1/3 of the blockspace taken by force closures (or slightly more in a favorable scenario with a lot of inputs), and even more can be saved in sats terms with patient mempool timing.

With ongoing spam attacks, force closures have become the main focal point of recent "lightning doomerism" in the LN user community, and I think making a high batchwindowduration setting "safe for all"* could provide easy alleviation. (Although my favorite would be lnd having the ability to spend matured UTXOs in pendingsweeps directly to open new channels etc., making sweeps in that context potentially redundant altogether.)

*Safe at least regarding HTLC settlement, while acknowledging chantools as part of the recovery process.