ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

AWAITING_UNILATERAL:Attempting to reconnect #2637

Closed ScottTre closed 2 years ago

ScottTre commented 5 years ago

After a little downtime of my node, I have had channels in the AWAITING_UNILATERAL state for days.

Some channels are connected to the remote node:

"connected" : true

and the state is still:

"status" : [ "AWAITING_UNILATERAL:Attempting to reconnect" ],

What happens here? Is there any way to close the channel?

darosior commented 5 years ago

It might be similar to #2462. Can you provide more information (logs, etc.) or check whether what happens to you is the same as in #2462?

ScottTre commented 5 years ago

It seems it's the same problem: my node was down for some time. I lost around half of my channels ("normally" closed), and half of the current channels have been in the AWAITING_UNILATERAL state for days.

I have no log files.

darosior commented 5 years ago

Yeah, it seems to be the same. Can you check whether the nodes which unilaterally closed the channels are lnd nodes?

ScottTre commented 5 years ago

How can I do that?

darosior commented 5 years ago

Some node owners fill in information about their nodes on explorers (like https://1ml.com/node/02809e936f0e82dfce13bcc47c77112db068f569e1db29e7bf98bcdd68b838ee84), so if you are lucky, some of the misbehaving nodes you were connected to did too.

ScottTre commented 5 years ago

Yes, most of the nodes are lnd nodes. Direction is also 0 on every node.

darosior commented 5 years ago

Ok, your issue seems to be much the same as the one I mentioned above (#2462).

ScottTre commented 5 years ago

As a test, I set the channel state from AWAITING_UNILATERAL to CLOSINGD_SIGEXCHANGE and got these messages:

chan #1392: Peer internal error CLOSINGD_SIGEXCHANGE: Can't start closing: no remote info
chan #1392: Peer permanent failure in CLOSINGD_SIGEXCHANGE: Internal error: Can't start closing: no remote info
chan #1392: Cannot broadcast our commitment tx: they have a future one
chan #1392: State changed from CLOSINGD_SIGEXCHANGE to AWAITING_UNILATERAL

What does "Cannot broadcast our commitment tx: they have a future one" mean?

ZmnSCPxj commented 5 years ago

> What does "Cannot broadcast our commitment tx: they have a future one" mean?

The Lightning BOLT spec has the option_data_loss_protect feature, where we can ask the peer "I have state update # X, do you have # X + 1 or greater?"

If the peer can prove that it has X+1 or greater, then we know "they have a future one".

Now, there seems to be some bug in the interoperation of this option_data_loss_protect feature between lnd and c-lightning. https://github.com/ElementsProject/lightning/issues/2462#issuecomment-471742623
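The "they have a future one" decision described above can be sketched roughly as follows. This is a simplified illustration, not C-Lightning's code: the `PeerClaim` type, its field names, and the boolean `proof_valid` are hypothetical stand-ins for the actual reestablish message and its cryptographic check.

```python
from dataclasses import dataclass


@dataclass
class PeerClaim:
    """Hypothetical, simplified view of what the peer tells us on reconnect."""
    state_number: int   # the state update number the peer says it holds
    proof_valid: bool   # whether the peer's proof for that state checked out


def peer_has_future_state(our_state: int, claim: PeerClaim) -> bool:
    """True if the peer has proven it holds state X+1 or greater (X = our_state)."""
    return claim.proof_valid and claim.state_number > our_state


def may_broadcast_commitment(our_state: int, claim: PeerClaim) -> bool:
    """In this sketch, only a proven future state blocks our broadcast.

    Broadcasting a commitment older than the peer's would publish a revoked
    state, which is why the node refuses to do it.
    """
    return not peer_has_future_state(our_state, claim)
```

With `our_state = 5`, a proven claim of state 6 blocks the broadcast, while a proven claim of state 5 does not.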

ScottTre commented 5 years ago

Yes, it seems there is a bug, but how do I work around it if the remote nodes don't update their software? I'm currently in a deadlock and can't close the channels.

ZmnSCPxj commented 5 years ago

AWAITING_UNILATERAL means we are already closing the channel, so the root cause seems to be that the blockchain is congested.

The "attempting to reconnect" is something we always do on peers we have channels with, including those in half-closed states where we are awaiting transactions to hit the blockchain. Probably we simply neglected to update the billboard after connecting to the peer, to something like "still awaiting unilateral close transaction to hit blockchain" or some such.

CMDRZOD commented 5 years ago

I have this very same issue. Exact symptoms and values as OP. Been looking for a fix, also wondering if blockchain congestion is the reason for the channel sitting in AWAITING_UNILATERAL forever.

darosior commented 5 years ago

@CMDRZOD see #2462

> also wondering if blockchain congestion is the reason for the channel sitting in AWAITING_UNILATERAL forever.

As stated above by ZmnSCPxj it might be, though fees are high for unilateral close transactions. Another reason (also stated above) can be that you disconnected from the peer or shut down lightningd while waiting for the unilateral close: AWAITING_UNILATERAL:Attempting to reconnect

CMDRZOD commented 5 years ago

I never actually checked the status of my channels before I shut down my lightningd node. Now I know better, and will look at channel status before updating my lightning installation.

Edit: If I did shutdown my node while waiting for unilateral close, is there any way to recover the channel and/or funds?

ZmnSCPxj commented 5 years ago

It should recover automatically. It just takes time. First the unilateral close has to get confirmed, then you have to wait for the delay imposed by the channel counterparty.
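The two waiting periods mentioned above can be put into rough numbers. The sketch below is only arithmetic under stated assumptions: the function names are made up for illustration, the delay is counted in blocks from the block that confirms the unilateral close, and 10 minutes per block is an average, not a guarantee.

```python
def earliest_sweep_height(close_confirm_height: int, to_self_delay: int) -> int:
    """Earliest block height at which the delayed output can be spent.

    Assumes a relative delay counted from the block that confirmed the
    unilateral close transaction.
    """
    return close_confirm_height + to_self_delay


def approx_wait_hours(to_self_delay: int, minutes_per_block: float = 10.0) -> float:
    """Rough wall-clock wait after confirmation, assuming an average block time."""
    return to_self_delay * minutes_per_block / 60.0


# Illustrative values only: a close confirmed at height 600000 with a
# 144-block delay becomes sweepable at height 600144, about a day later.
print(earliest_sweep_height(600_000, 144))  # 600144
print(approx_wait_hours(144))               # 24.0
```

None of this starts counting until the unilateral close transaction itself is confirmed, which is the step that never happens for the channels in this thread.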

ZmnSCPxj commented 5 years ago

Has this been satisfied already? Can we now close this?

ScottTre commented 5 years ago

Something is completely broken. The remote node has been offline for some weeks and I can't close the channel. There is an error message "Cannot broadcast our commitment tx: they have a future one" and the channel goes into the "AWAITING_UNILATERAL" state. And now the channel will stay in "AWAITING_UNILATERAL" forever??? Is this normal?

2019-07-05T09:40:22.743Z UNUSUAL lightningd(27657): xxx chan #13575: Peer permanent failure in CHANNELD_SHUTTING_DOWN: Forcibly closed by 'close' command timeout
2019-07-05T09:40:22.743Z BROKEN lightningd(27657): xxx chan #13575: Cannot broadcast our commitment tx: they have a future one
2019-07-05T09:40:22.743Z INFO lightningd(27657): xxx chan #13575: State changed from CHANNELD_SHUTTING_DOWN to AWAITING_UNILATERAL

CMDRZOD commented 5 years ago

Yeah I agree this is a big problem. I've left my node on for a month in this state with no change. The one channel my node has open has been in AWAITING_UNILATERAL for at least a month.

ZmnSCPxj commented 5 years ago

The BROKEN message is important: the root cause is some problem with the data_loss_protect feature, which interacts badly with how the same feature is implemented in lnd. Is the channel with an lnd node?

ScottTre commented 5 years ago

Yes, it is a lnd node.

ZmnSCPxj commented 5 years ago

Okay, unfortunately I am ignorant of how data_loss_protect is incompatible between lnd and C-Lightning. @niftynei knows better, I believe. However, it may be necessary to do some manual database editing now (the damage has been done, and though I think @niftynei has fixed the incompatibility, any such incompatibility that occurred in older versions will carry over today), or I might try implementing a command that tries to do that database fixing (but in the worst case will lose any funds in the channel), though that will take some days or weeks, unfortunately. Is the amount on your side of the channel significant?

ScottTre commented 5 years ago

There are around 15 channels with a negligible amount. I think I can set the channels back to normal in the database because the broadcast failed and the remote node is offline, maybe forever. One question: is this "future commitment tx" something my node missed on the blockchain, or is it only a node-to-node communication message?

ZmnSCPxj commented 5 years ago

> One question: is this "future commitment tx" something my node missed on the blockchain, or is it only a node-to-node communication message?

You could try looking in a blockchain explorer?

In general C-Lightning is continuously monitoring the UTXO of every channel and will trigger onchaind if spent. So if the channel exists as far as C-Lightning is concerned, then it means, there is no tx onchain that spends the funding UTXO.

So it is likely a node-to-node communication failure at some point, likely due to a previous (?) incompatibility with lnd regarding data_loss_protect, which has indelibly marked the database (at least until you force the database). I can try to figure out how to reverse this so you can close unilaterally without the "future commitment tx" problem, please wait.
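For anyone who wants to check the funding UTXO by hand rather than in a block explorer, here is a minimal sketch. It assumes a local `bitcoin-cli` connected to a synced bitcoind and uses its `gettxout` RPC, which prints nothing when the output is spent (by default this includes spends sitting in the mempool) or when the transaction is unknown. The function names and the injectable `run` parameter are illustrative, not part of any existing tool.

```python
import subprocess
from typing import Callable, List


def _run_bitcoin_cli(args: List[str]) -> str:
    """Run bitcoin-cli with the given arguments and return its stdout."""
    result = subprocess.run(
        ["bitcoin-cli", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def funding_output_unspent(
    txid: str,
    vout: int,
    run: Callable[[List[str]], str] = _run_bitcoin_cli,
) -> bool:
    """True if `gettxout` reports the funding output as still unspent.

    An empty (or "null") reply means the output is spent or was never
    confirmed; this sketch cannot tell those two cases apart.
    """
    reply = run(["gettxout", txid, str(vout)])
    return reply.strip() not in ("", "null")
```

The `run` parameter only exists so the function can be exercised without a node; in normal use it is left at its default and the funding txid and output index come from the channel information.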

darosior commented 5 years ago

FWIW I ran into this too after a power outage (state AWAITING_UNILATERAL for more than a month, and the lnd nodes won't close the channels). I tried to send buggy protocol messages to force the unilateral close, without success. I finally contacted the node owners who had filled in their contact details on 1ML, hoping that for the rest, the owners would close channels that had not been active for months... Hopefully (?) more and more people will fill in their info on 1ML, and I could open channels with spec-compliant nodes (effectively making them interfaces to the dark side of the network for me).

jsarenik commented 3 years ago

Hello there! It seems to be the same problem here. This is the channel info (from listpeers) for the blocktrainer.de node:

{
  "id": "0229bf5bfd4f29c6cda971a4979605be9a9553e456b5ef3feb795e8dba232e7005",
  "connected": false,
  "channels": [
    {
      "state": "AWAITING_UNILATERAL",
      "scratch_txid": "REMOVED",
      "last_tx_fee": "183000msat",
      "feerate": {
        "perkw": 253,
        "perkb": 1012
      },
      "channel_id": "REMOVED",
      "funding_txid": "REMOVED",
      "close_to_addr": "REMOVED",
      "close_to": "REMOVED",
      "private": false,
      "opener": "local",
      "closer": "local",
      "features": [
        "option_static_remotekey"
      ],
      "funding_allocation_msat": {
        "0229bf5bfd4f29c6cda971a4979605be9a9553e456b5ef3feb795e8dba232e7005": 0,
        "032de5c0f28f9d7d10c0c0b5ec92e83f9bf40def2bf40181c0f4330c57e58a8605": 241820000
      },
      "funding_msat": {
        "0229bf5bfd4f29c6cda971a4979605be9a9553e456b5ef3feb795e8dba232e7005": "0msat",
        "032de5c0f28f9d7d10c0c0b5ec92e83f9bf40def2bf40181c0f4330c57e58a8605": "241820000msat"
      },
      "msatoshi_to_us": 241820000,
      "to_us_msat": "241820000msat",
      "msatoshi_to_us_min": 241820000,
      "min_to_us_msat": "241820000msat",
      "msatoshi_to_us_max": 241820000,
      "max_to_us_msat": "241820000msat",
      "msatoshi_total": 241820000,
      "total_msat": "241820000msat",
      "fee_base_msat": "0msat",
      "fee_proportional_millionths": 0,
      "dust_limit_satoshis": 546,
      "dust_limit_msat": "546000msat",
      "max_htlc_value_in_flight_msat": 18446744073709552000,
      "max_total_htlc_in_msat": "18446744073709551615msat",
      "their_channel_reserve_satoshis": 2418,
      "their_reserve_msat": "2418000msat",
      "our_channel_reserve_satoshis": 2418,
      "our_reserve_msat": "2418000msat",
      "spendable_msatoshi": 238862000,
      "spendable_msat": "238862000msat",
      "receivable_msatoshi": 0,
      "receivable_msat": "0msat",
      "htlc_minimum_msat": 0,
      "minimum_htlc_in_msat": "0msat",
      "their_to_self_delay": 144,
      "our_to_self_delay": 144,
      "max_accepted_htlcs": 30,
      "state_changes": [
        {
          "timestamp": "2021-01-25T16:18:37.895Z",
          "old_state": "CHANNELD_AWAITING_LOCKIN",
          "new_state": "CHANNELD_SHUTTING_DOWN",
          "cause": "user",
          "message": "User or plugin invoked close command"
        },
        {
          "timestamp": "2021-01-29T14:49:47.374Z",
          "old_state": "CHANNELD_SHUTTING_DOWN",
          "new_state": "AWAITING_UNILATERAL",
          "cause": "user",
          "message": "Forcibly closed by `close` command timeout"
        }
      ],
      "status": [
        "AWAITING_UNILATERAL:Attempting to reconnect"
      ],
      "in_payments_offered": 0,
      "in_msatoshi_offered": 0,
      "in_offered_msat": "0msat",
      "in_payments_fulfilled": 0,
      "in_msatoshi_fulfilled": 0,
      "in_fulfilled_msat": "0msat",
      "out_payments_offered": 0,
      "out_msatoshi_offered": 0,
      "out_offered_msat": "0msat",
      "out_payments_fulfilled": 0,
      "out_msatoshi_fulfilled": 0,
      "out_fulfilled_msat": "0msat",
      "htlcs": []
    }
  ]
}

None of the txids are on-chain, and they were replaced with REMOVED above.
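To pick such channels out of a longer listpeers result, something like the following could be used. It is a sketch that assumes peer objects shaped like the one above (an `id`, a `connected` flag, and a `channels` list with `state`, `status`, and `to_us_msat`); the function name is made up, and other versions may lay the output out differently.

```python
from typing import Any, Dict, List


def stuck_channels(peers: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a summary of every channel sitting in AWAITING_UNILATERAL."""
    stuck = []
    for peer in peers:
        for chan in peer.get("channels", []):
            if chan.get("state") == "AWAITING_UNILATERAL":
                stuck.append(
                    {
                        "peer_id": peer.get("id"),
                        "connected": peer.get("connected"),
                        "to_us_msat": chan.get("to_us_msat"),
                        "status": chan.get("status", []),
                    }
                )
    return stuck


# Usage sketch (assumes the JSON has a top-level "peers" list):
#   import json
#   peers = json.loads(raw_listpeers_output)["peers"]
#   for entry in stuck_channels(peers):
#       print(entry)
```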

Here is what did not help (I guess my node does not know about the closing transaction of that channel):

# Gets all raw transactions the lightning node knows about and broadcasts them
sqlite3 lightningd.sqlite3 "SELECT HEX(rawtx) FROM transactions;" |
  while read -r line
  do bitcoin-cli sendrawtransaction "$line"
  done

Technical question: Is there any chance of recovering those sats?

None of the txids mentioned in the listpeers output contains anything on-chain. Was the channel really opened from my side? How can I find the initial transaction input?

jsarenik commented 3 years ago

@darosior is there a chance that the other side would have the raw transaction in their logs/database? Why would they if the channel close is initiated from my side?

jsarenik commented 3 years ago

Having realized that the transaction never made it on-chain, I just ran dev-forget-channel to get rid of the channel.

mb300sd commented 3 years ago

I've noticed I have a few channels stuck in this state as well. All from over 10 months ago, when I had some disk issues.

Is there any way to manually broadcast a unilateral close with whatever latest state my node has? 3/4 of the channels are with permanently dead peers with all the funds on my side. I'm fine risking a punishment transaction from the other side (very unlikely, considering they appear to be gone for over a year).

These are very old channels without option_static_remotekey.

jsarenik commented 3 years ago

@mb300sd have a look at https://github.com/ElementsProject/lightning/issues/2637#issuecomment-801459495 and see this down there:

# Run it on a free-standing copy of `lightningd.sqlite3` file
# not the one currently used by `lightningd`

# Gets all raw transactions the lightning node knows about and broadcasts them
sqlite3 lightningd.sqlite3 "SELECT HEX(rawtx) FROM transactions;" |
  while read -r line
  do bitcoin-cli sendrawtransaction "$line"
  done

Would that help?

mb300sd commented 3 years ago

I'm running postgres, but I did browse through the db and didn't find anything related to those channels.

I did find a dev-sign-last-tx, which did appear to do something - it signed a tx spending the funding output from the channel, seemingly to myself. But now I have these errors, and the output does not appear in listunspent. Does anyone know where the btc went? It was my own node that broadcast the "unknown spend".

2021-07-27T09:13:31.715Z INFO xxxxxxxxxx-chan#xxxxxx: State changed from FUNDING_SPEND_SEEN to ONCHAIN
2021-07-27T09:13:31.745Z BROKEN xxxxxxxxx-onchaind-chan#xxxxxx: Unknown spend of OUR_UNILATERAL/DELAYED_OUTPUT_TO_US by xxxxxxxxxxxxxxxxxxxxxx

mb300sd commented 3 years ago

So the dev-sign-last-tx led to lost funds; thankfully, I tested it on a very small channel. I still can't figure out where they went, since that peer was dead for months, so it's very unlikely they could claim a cheating tx.

I solved my stuck large channels by going into the db and changing channel state. That caused lightningd to try to reconnect to the peer while they were online and fail the channel properly.

sudalofsatoshi commented 1 year ago

For those who are in the same trouble as me.

Like @mb300sd said, I changed the state myself:

  1. Open the lightningd.sqlite3 db
  2. Select the channel that has the problem
  3. Change the state from 7 (AWAITING_UNILATERAL) to 2 (may be the initial state)
  4. Restart the core-lightning app
  5. The state will then be normal (connected), and you can issue the close-channel request.
  6. Everything was fine again within 10~15 mins

:)
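The database steps above could be scripted roughly like this. It is a sketch under loud assumptions: that the table is called `channels` and has integer `id` and `state` columns (not confirmed anywhere in this thread), that 7 and 2 are the state numbers quoted in the steps, and that lightningd is stopped and a backup of the file exists. The function name and `dry_run` flag are made up for illustration.

```python
import sqlite3
from typing import List

AWAITING_UNILATERAL = 7  # state number quoted in the steps above
TARGET_STATE = 2         # state number the steps above change it to


def reset_stuck_channels(
    db_path: str, channel_ids: List[int], dry_run: bool = True
) -> List[int]:
    """Reset the listed channels from state 7 to state 2.

    Only rows that are currently in state 7 are touched. With dry_run=True
    (the default) nothing is written; the ids that would change are returned.
    Assumes a `channels` table with `id` and `state` columns.
    """
    if not channel_ids:
        return []
    conn = sqlite3.connect(db_path)
    try:
        placeholders = ",".join("?" for _ in channel_ids)
        rows = conn.execute(
            "SELECT id FROM channels WHERE state = ? "
            f"AND id IN ({placeholders}) ORDER BY id",
            (AWAITING_UNILATERAL, *channel_ids),
        ).fetchall()
        ids = [row[0] for row in rows]
        if ids and not dry_run:
            conn.executemany(
                "UPDATE channels SET state = ? WHERE id = ?",
                [(TARGET_STATE, chan_id) for chan_id in ids],
            )
            conn.commit()
        return ids
    finally:
        conn.close()
```

Running it first with the default `dry_run=True` shows which rows would change; passing `dry_run=False` performs the update, after which the node is restarted as in step 4.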