lightningnetwork / lnd

Lightning Network Daemon ⚡️
MIT License

[feature]: allow small htlc to expire without FC #7673

Open RocketNodeLN opened 1 year ago

RocketNodeLN commented 1 year ago

Small-value htlcs timing out and causing a FC that costs orders of magnitude more than the expired htlc is absurd. It would be nice to have a configuration setting that allows htlcs to expire without triggering a FC. The threshold could be either an absolute amount or a percentage of the channel size.

Example: ef2d8919a5e2e07bf6091b198f932afe56c6096b38f31e49e02aec6f17f55841. This FC was caused by a 980 or 1009 sat htlc expiring. Node runners should have the option to let the htlc expire and potentially lose the sats to save the channel, rather than going on chain via FC.
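To make the request concrete, here is a minimal sketch of how such a threshold could be evaluated. Everything in it is hypothetical: `ExpiryPolicy`, its fields, and `AllowsExpiry` are illustration names and not lnd options or APIs, and the 5,000,000 sat channel size is an assumed value that does not come from the issue.

```go
package main

import "fmt"

// ExpiryPolicy is a hypothetical setting, not an existing lnd option.
// A zero value in either field disables that limit.
type ExpiryPolicy struct {
	MaxAbandonSat uint64 // absolute limit, in satoshis
	MaxAbandonPPM uint64 // limit as parts per million of channel capacity
}

// AllowsExpiry reports whether an HTLC of htlcSat in a channel of
// capacitySat is small enough to be left to expire without a force close.
func (p ExpiryPolicy) AllowsExpiry(htlcSat, capacitySat uint64) bool {
	if p.MaxAbandonSat > 0 && htlcSat <= p.MaxAbandonSat {
		return true
	}
	if p.MaxAbandonPPM > 0 {
		// Split the multiplication so it cannot overflow for ppm <= 1e6.
		limit := capacitySat/1_000_000*p.MaxAbandonPPM +
			capacitySat%1_000_000*p.MaxAbandonPPM/1_000_000
		return htlcSat <= limit
	}
	return false
}

func main() {
	const capacitySat = 5_000_000 // assumed channel size, not from the issue

	absolute := ExpiryPolicy{MaxAbandonSat: 1000}
	relative := ExpiryPolicy{MaxAbandonPPM: 250} // 0.025% of capacity

	// 980 and 1009 sat are the two candidate amounts named above.
	for _, htlcSat := range []uint64{980, 1009} {
		fmt.Printf("htlc=%d sat absolute=%v relative=%v\n",
			htlcSat,
			absolute.AllowsExpiry(htlcSat, capacitySat),
			relative.AllowsExpiry(htlcSat, capacitySat))
	}
}
```

With these assumed numbers, the absolute limit of 1000 sat covers the 980 sat htlc but not the 1009 sat one, while the relative limit (1250 sat for this channel size) covers both. This illustrates why the two forms of threshold are not interchangeable.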

legalizemath commented 1 year ago

I want to bump this suggestion

I had a force close today while the fee estimate for the next few blocks was roughly 500 sats/vbyte.

So

It seems I might have to create my own feeurl for flatter fee estimates to minimize extreme values for short block targets, but more importantly I would've been infinitely better off not having the channel closed when it's unprofitable to do so.

Link to transaction: https://mempool.space/tx/bc3327248abf2acd160991b085d6d470bab2b2ed20ad9fe21d2ec631b0f44e15
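The economics behind this complaint can be sketched with simple arithmetic: on-chain cost is roughly transaction size times fee rate. In the example below only the 500 sat/vbyte rate comes from the comment above. The 300 vbyte size and the 1000 sat htlc are assumed round numbers for illustration, not measurements of the linked transaction, and the helper names are hypothetical.

```go
package main

import "fmt"

// onChainCostSat estimates the fee for transactions totalling vbytes
// virtual bytes at feeRate sat/vbyte.
func onChainCostSat(vbytes, feeRate uint64) uint64 {
	return vbytes * feeRate
}

// closeIsUneconomical reports whether the estimated on-chain fee exceeds
// the value of the HTLC the force close is meant to protect.
func closeIsUneconomical(htlcSat, vbytes, feeRate uint64) bool {
	return onChainCostSat(vbytes, feeRate) > htlcSat
}

func main() {
	const (
		feeRate = 500  // sat/vbyte, the estimate quoted in the comment
		vbytes  = 300  // assumed total size of the closing transactions
		htlcSat = 1000 // assumed HTLC value at stake
	)

	cost := onChainCostSat(vbytes, feeRate)
	fmt.Printf("estimated fee: %d sat, htlc at stake: %d sat (%.0fx)\n",
		cost, htlcSat, float64(cost)/float64(htlcSat))
	fmt.Println("uneconomical:", closeIsUneconomical(htlcSat, vbytes, feeRate))
}
```

Under these assumptions the fee is 150,000 sat against 1000 sat at stake. The sketch ignores later sweep transactions and the cost of reopening the channel, both of which would make the comparison worse.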

babikpatient6 commented 1 year ago

Probably a silly question, but it's beyond my pay grade to know.

Since I started collecting logs of spurious LFCs months ago, I haven't found a single case that didn't start like this:

ChannelArbitrator(chanid): immediately failing htlc="htlcid" ..from remote commitment.. etc

Doesn't matter if there actually is an htlc output in the closing TX or not (usually there is not, b/c I pay attention and manually try to reconnect peers with soon-expiring htlcs). It also doesn't matter if the channel link was active or not prior to the "spurious htlc time out". The logs always said "failing htlc".

So the natural question: if the feature suggested in this issue is implemented and the next such spurious "htlc time out" without a TX output or any sat value is about to happen, would it prevent such an FC?

Could this effectively fix the majority of spurious FCs? That would be a very big deal. These FCs, especially in a high tx fee environment, remain by far the worst pain of running a node.

yyforyongyu commented 1 year ago

Dup of #7683

> Doesn't matter if there actually is an htlc output in the closing TX or not (usually there is not, b/c I pay attention and manually try to reconnect peers with soon-expiring htlcs).

This doesn't sound right. Could you share some logs or the closing txids to show that there isn't an htlc output in the closing tx?

babikpatient6 commented 1 year ago

@yyforyongyu - Maybe I'm just misunderstanding something due to my noobery. Asking for forgiveness in advance.

This is how a local spurious FC without an htlc output always starts in my logs:

[INF] CNCT: ChannelArbitrator(chanid:0): immediately failing htlc=<..htlcid> from remote commitment

And when a local FC with an onchain htlc output happens, the logs start with:

[INF] CNCT: ChannelArbitrator(chanid:1): go to chain for outgoing htlc <..htlcid>: timeout=, blocks_until_expiry=<..>, broadcast_delta=<..>

..2 seconds and 2 pages of logs later..

[INF] CNCT: ChannelArbitrator(chanid:1): immediately failing htlc=<..htlcid2> from remote commitment

So the logs are not the same. I could ALWAYS find the "immediately failing htlc=<..htlcid> from remote commitment" part in both cases, but I could only find "go to chain for outgoing htlc <..htlcid>" when there was an actual onchain htlc output.

Does this clear up any confusion? Lmk if you now recognize what I got wrong here.

Or if you still need an example of logs?
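The observation above can be checked mechanically by tallying both messages per channel. The following is a rough sketch, not a vetted tool: the patterns are built only from the redacted excerpts quoted in this comment, so real lnd log lines may need adjusted expressions, and `SummarizeLog` and `ChanSummary` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// Patterns built from the redacted excerpts in this thread; real lnd log
// lines may differ in detail.
var (
	goToChainRe = regexp.MustCompile(`ChannelArbitrator\(([^)]+)\): go to chain for outgoing htlc`)
	failingRe   = regexp.MustCompile(`ChannelArbitrator\(([^)]+)\): immediately failing htlc=`)
)

// ChanSummary counts how often each message appeared for one channel.
type ChanSummary struct {
	GoToChain          int
	ImmediatelyFailing int
}

// SummarizeLog scans log text line by line and tallies both messages per
// channel identifier.
func SummarizeLog(log string) (map[string]*ChanSummary, error) {
	out := make(map[string]*ChanSummary)
	entry := func(id string) *ChanSummary {
		if out[id] == nil {
			out[id] = &ChanSummary{}
		}
		return out[id]
	}
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		if m := goToChainRe.FindStringSubmatch(line); m != nil {
			entry(m[1]).GoToChain++
		}
		if m := failingRe.FindStringSubmatch(line); m != nil {
			entry(m[1]).ImmediatelyFailing++
		}
	}
	return out, sc.Err()
}

// sampleLog reuses the placeholder lines quoted in the thread.
const sampleLog = `[INF] CNCT: ChannelArbitrator(chanid:0): immediately failing htlc=<..htlcid> from remote commitment
[INF] CNCT: ChannelArbitrator(chanid:1): go to chain for outgoing htlc <..htlcid>: timeout=, blocks_until_expiry=<..>, broadcast_delta=<..>
[INF] CNCT: ChannelArbitrator(chanid:1): immediately failing htlc=<..htlcid2> from remote commitment`

func main() {
	summary, err := SummarizeLog(sampleLog)
	if err != nil {
		panic(err)
	}
	ids := make([]string, 0, len(summary))
	for id := range summary {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	for _, id := range ids {
		s := summary[id]
		fmt.Printf("%s: go_to_chain=%d immediately_failing=%d\n",
			id, s.GoToChain, s.ImmediatelyFailing)
	}
}
```

A channel that shows "immediately failing" entries but no "go to chain" entry matches the first pattern described above. The tally only reports which messages appeared, it does not explain why the channel closed.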

yyforyongyu commented 1 year ago

> [INF] CNCT: ChannelArbitrator(chanid:0): immediately failing htlc=<..htlcid> from remote commitment

In simple terms, this line means chanid:0 has been force closed and the htlc htlcid is not present in the force close tx, so we need to cancel it to prevent the upstream channel from force closing.

> [INF] CNCT: ChannelArbitrator(chanid:1): go to chain for outgoing htlc <..htlcid>: timeout=, blocks_until_expiry=<..>, broadcast_delta=<..>

This line means chanid:1 considers the htlc htlcid timed out and decides to force close.

And I assume chanid:0 and chanid:1 are both managed by the same node?

> Or if you still need an example of logs?

Yeah it'd be nice if you could DM me the full logs.
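For readers following along, the three fields in the "go to chain" log line above (timeout, blocks_until_expiry, broadcast_delta) suggest a simple height comparison. The sketch below is an assumed, simplified reading of those field names and is not copied from lnd. The function names and the block heights are made up for illustration.

```go
package main

import "fmt"

// blocksUntilExpiry returns how many blocks remain before the HTLC timeout
// height; it is negative once the timeout has passed.
func blocksUntilExpiry(timeoutHeight, currentHeight uint32) int64 {
	return int64(timeoutHeight) - int64(currentHeight)
}

// shouldGoToChain is a simplified, assumed rule: act once the remaining
// blocks are within the broadcast delta.
func shouldGoToChain(timeoutHeight, currentHeight, broadcastDelta uint32) bool {
	return blocksUntilExpiry(timeoutHeight, currentHeight) <= int64(broadcastDelta)
}

func main() {
	// Illustrative values only, not taken from any real channel.
	const (
		timeoutHeight  = 800_100
		broadcastDelta = 10
	)
	for _, height := range []uint32{800_080, 800_090, 800_101} {
		fmt.Printf("height=%d blocks_until_expiry=%d go_to_chain=%v\n",
			height,
			blocksUntilExpiry(timeoutHeight, height),
			shouldGoToChain(timeoutHeight, height, broadcastDelta))
	}
}
```

Under this reading, the feature requested in this issue would amount to an extra condition in front of that comparison, for example skipping the force close when the htlc amount is below a configured threshold.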

babikpatient6 commented 1 year ago

@yyforyongyu

> In simple terms, this line means chanid:0 has been force closed and the htlc htlcid is not present in the force close tx, so we need to cancel it to prevent the upstream channel from force closing.

I appreciate the clarification. I also saw this happen from the other end a few days ago: I watched an htlc expire locally in real time, but it was then not included in the remotely FC-ed TX. However, it doesn't tell me to what degree enabling the discussed feature would help with this (very common) scenario.

> And I assume chanid:0 and chanid:1 are both managed by the same node?

Yes

I will DM you examples of logs in a few days.

yyforyongyu commented 1 year ago

@babikpatient6 cool feel free to reach out on keybase or slack, same handle.