lightningnetwork / lnd

Lightning Network Daemon ⚡️
MIT License
7.63k stars 2.07k forks source link

[bug]: fees default to 253sat/kw if fees cannot be downloaded from feeurl #8688

Closed JaviLib closed 1 month ago

JaviLib commented 4 months ago

Background

Continuing from this bug https://github.com/lightningnetwork/lnd/issues/8675 which is also commented here https://github.com/ElementsProject/lightning/issues/7254 it seems that LND defaults to 253sat/kw when the feeurl cannot be downloaded because of some reason, like bad internet connection or timeout from the feeurl server.

From that moment on, all channels get disabled because of extremely low fees, and LND nodes are able to reconnect and renegotiate when feeurl works again, but for some reason, CLN nodes connected aren't able to renegotiate the new fees.

Your environment

Steps to reproduce

Make feeurl not available temporarily or timeout it, either at startup or during operation, and check how negotiated fees go down to 253sat/kw. Second, if there is any channel open with a CLN node, watch how the renegotiation becomes imposible, and channels got disabled. If there is a pending HTLC then the channel gets force closed at timeout.

Expected behaviour

At least use the previous sane retrived fees, or do not try to renegotiate fees with any channel, returning a warning and waiting until they can be retrived.

Actual behaviour

Tries to renegotiate fees at ridiculous values, and cannot renegotiate fees with CLN when they come to normal.

yyforyongyu commented 4 months ago

I think we should return the error here to be handled by the callers instead of returning the fee floor, https://github.com/lightningnetwork/lnd/blob/2089a88f4b5e97a19a9d0a3940222cc90946701a/lnwallet/chainfee/estimator.go#L778-L790

Looking at other implementations of EstimateFeePerKW, think we should do the same - upper systems need to know there's an error in fee estimation instead of defaulting to the default fee rate floor.

Roasbeef commented 4 months ago

We should also smooth/clamp the updates as well. This way we avoid adjusting too sharply in either direction.

ziggie1984 commented 4 months ago

We should also smooth/clamp the updates as well. This way we avoid adjusting too sharply in either direction.

I wonder what's the best strategy to clamp the fee-rate, percentage of the current feerate seems to be not the best design when values are small hmm ?

Moreover maybe we should unify the behaviour for example for the bitcoind estimator we will return the fallback fee in case there is an error while fetching the data.

JaviLib commented 4 months ago

We should also smooth/clamp the updates as well. This way we avoid adjusting too sharply in either direction.

I wonder what's the best strategy to clamp the fee-rate, percentage of the current feerate seems to be not the best design when values are small hmm ?

Moreover maybe we should unify the behaviour for example for the bitcoind estimator we will return the fallback fee in case there is an error while fetching the data.

As @yyforyongyu said, best thing in case of error is to accept what the other party suggests. And in case the other party doesn't provide any value, just use the latest ones retrived.

In case of sudden change but no error, yes, it is probably better to clamp it over the course of, for example, 6 blocks.

For any of the two cases, you would need to store the information of the latest feeurl retrived, probably in a small separate file.

You would also need to return an error if too much time has passed since the last retrival happened. Again, I would suggest returning an error after 6 blocks without retriving anything. Also an error should be provided just at the beginning, exiting the app, if the first retrival fails and the file with the stored values does not exist or is too old.

rolznz commented 3 months ago

This caused all our LDK nodes who had channels with our LSP (which runs LND) to be force closed. This is a catastrophic bug, I hope a fix is prioritized.

2024-05-10 18:38:55 ERROR [lightning::ln::channelmanager:7150] Closing channel 04259a5524219241061608caa6216811cfcfcf713291fc3c839dde1d47a90a72 due to close-required error: Peer's feerate much too low. Actual: 253. Our expected lower limit: 2993
2024-05-10 18:38:55 ERROR [lightning::ln::channelmanager:8971] Force-closing channel: Peer's feerate much too low. Actual: 253. Our expected lower limit: 2993
2024-05-10 18:38:55 DEBUG [lightning::ln::channelmanager:2893] Finishing closure of channel due to Channel closed because of an exception: Peer's feerate much too low. Actual: 253. Our expected lower limit: 2993 with 0 HTLCs to fail
2024-05-10 18:38:55 INFO  [lightning::chain::channelmonitor:2796] Applying force close update to monitor 04259a5524219241061608caa6216811cfcfcf713291fc3c839dde1d47a90a72 with 1 change(s).
Roasbeef commented 2 months ago

One other thing to mention: the JSON schema of the feeurl is relatively simple. If you need a highly dependable endpoint, then I recommend you run your own instead of pointing to some that may be publicly available that have no uptime guarantees/promises.

mrfelton commented 2 months ago

We ran into this issue too. It’s one of the reasons why I created the fee estimator project which I discuss in https://strike.me/blog/blended-bitcoin-fee-estimations/ and built it out to have multiple layers of fallback and redundancy.

You can run and host it yourself and connect it up to multiple different fee sources that you run, or a combination of ones that you run plus publicly available ones.

We run it with multiple replicas and fallbacks connected our own memoool instance, the public mempool.space instance, our own explora instance, blockstream.info, and our own bitcoind node. It also has sanity checks to lookup the current block height from multiple sources and prioritise sources with the most current block height or ignore estimates that are older than some configured threshold, as well as a cache to ensure it’s efficient and robust.

We often experience times where at least one of two of those sources are not available, but for all to be unavailable at once is extremely unlikely and has never happened to us.

see https://github.com/LN-Zap/bitcoin-blended-fee-estimator

That said, this issue really will be really good one to resolve as it can be a disaster if you don’t have those multiple layers of redundancy setup.

Roasbeef commented 1 month ago

Fixed by https://github.com/lightningnetwork/lnd/pull/8891

This comment also still stands: https://github.com/lightningnetwork/lnd/issues/8688#issuecomment-2204856172