ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance
Other
2.87k stars 907 forks source link

Node in weird state, fails to open channels #4871

Open ulidtko opened 3 years ago

ulidtko commented 3 years ago

Hi LN Devs!

I'm running a v0.10.0 instance on testnet.

The central issue right now is that it doesn't open channels anymore:

$ lightning-cli fundchannel 02312627fdf07fbdd7e5ddb136611bdde9b00d26821d14d94891395452f67af248 3255604 
{
   "code": -1,
   "message": "Error broadcasting transaction: error code: -26\\nerror message:\\nnon-mandatory-script-verify-flag (Script failed an OP_EQUALVERIFY operation). Unsent tx discarded 0200000000010338662e72e82bf68133d71809437b3fe57302a59a7dd65c747bb7b5c5d6024e660100000000fdffffff3860a2691bb3ff2dc6c344e8af3e87669aac48f61d181909345545b6ab3d9eab0000000000fdffffff71ad93cdb72490af814722a32c2459798a3ca9b51b93c73dbecfc66e7611660d0000000000fdffffff02a3080a0000000000160014c62e592be729336b82220df522791a46497b1e6434ad3100000000002200202438b69b2dcf3d40250e5e1c94819ef0ca6828e29210098341fb3806e1e678a00247304402204b7befb4f4c63b21e172ed03a1aca697acff183437f37f9f2819cc6b67b2a0370220465b31710ae2cc9290cecdf1cb827b07f7a2f7d49b161cb98ff136b662c302550121025fa4d7187714ca781e69c556f7b2811a54c1ea81a82679680e796ed0134b22d802463043021f0ec088bf18dcd7a4e24e72402b30c76f11d0ba5141cf93a34c4e4c400dcaf302205b4131d5dfaeab8c839fc4aa76f1c728558e20dc1e79458b0536b8662b60e5e801210267bcdaf5ef9c16bef8974d383bd3e008d10aa989fd48bf45aeed0b30b1269eb20247304402203f142205fa803911ad3c6eedc2c5a7e1a32daa567fd6497ce85e58a1b0f1a1f6022041ab98c67bc322f96da6d5fba5b9edfa8377239a5bb30177c58cdd38922273e101210399594eebce15409f86224dc4f37607adb42436db68503cb3b0fc868f743b4d9eaa092000"
}

The only hit I got searching for non-mandatory-script-verify-flag is #1680 — which I don't think is related.

What could be a reason, and what should I do?


A secondary issue is that this node seems unable to close one of its channels — I've no idea at all if this is related to the above.

The problematic channel should've been to the LDK-1 node.

Here's how c-lightning sees the channel:

      {
         "id": "0296e20fa99d2940b8b00117e65d27003f0d8f81a0c960f71a5466d1aadf5ea5ea",
         "connected": false,
         "channels": [
            {
               "state": "AWAITING_UNILATERAL",
               "scratch_txid": "36a799181538d74751936eef1423bba1ff19607de5b43122ad73dc089e9a03f8",
               "last_tx_fee": "183000msat",
               "feerate": {
                  "perkw": 253,
                  "perkb": 1012
               },
               "channel_id": "0d97b40ce2b402bb7a59f4b03a3dfd24a869413dc979964183dc0128079ccb4f",
               "funding_txid": "4fcb9c072801dc83419679c93d4169a824fd3d3ab0f4597abb02b4e20cb4970d",
               "close_to_addr": "tb1qvfr0wvrkgfu8w2c25lawhmuwsfdr8wy4jgtjx0",
               "close_to": "00146246f730764278772b0aa7faebef8e825a33b895",
               "private": false,
               "opener": "local",
               "closer": "local",
               "features": [
                  "option_static_remotekey"
               ],
               "funding_allocation_msat": {
                  "0296e20fa99d2940b8b00117e65d27003f0d8f81a0c960f71a5466d1aadf5ea5ea": 0,
                  "023b4f129dd86d14c977f0d3f2819fec1065ec0c0a1498c7b20f207a31d7866b98": 3860324000
               },
               "funding_msat": {
                  "0296e20fa99d2940b8b00117e65d27003f0d8f81a0c960f71a5466d1aadf5ea5ea": "0msat",
                  "023b4f129dd86d14c977f0d3f2819fec1065ec0c0a1498c7b20f207a31d7866b98": "3860324000msat"
               },
               "msatoshi_to_us": 3860324000,
               "to_us_msat": "3860324000msat",
               "msatoshi_to_us_min": 3860324000,
               "min_to_us_msat": "3860324000msat",
               "msatoshi_to_us_max": 3860324000,
               "max_to_us_msat": "3860324000msat",
               "msatoshi_total": 3860324000,
               "total_msat": "3860324000msat",
               "fee_base_msat": "1msat",
               "fee_proportional_millionths": 0,
               "dust_limit_satoshis": 546,
               "dust_limit_msat": "546000msat",
               "max_htlc_value_in_flight_msat": 18446744073709551615,
               "max_total_htlc_in_msat": "18446744073709551615msat",
               "their_channel_reserve_satoshis": 38603,
               "their_reserve_msat": "38603000msat",
               "our_channel_reserve_satoshis": 38603,
               "our_reserve_msat": "38603000msat",
               "spendable_msatoshi": 3821181000,
               "spendable_msat": "3821181000msat",
               "receivable_msatoshi": 0,
               "receivable_msat": "0msat",
               "htlc_minimum_msat": 0,
               "minimum_htlc_in_msat": "0msat",
               "their_to_self_delay": 6,
               "our_to_self_delay": 144,
               "max_accepted_htlcs": 483,
               "state_changes": [
                  {
                     "timestamp": "2021-10-15T14:09:25.488Z",
                     "old_state": "CHANNELD_AWAITING_LOCKIN",
                     "new_state": "CHANNELD_SHUTTING_DOWN",
                     "cause": "user",
                     "message": "User or plugin invoked close command"
                  },
                  {
                     "timestamp": "2021-10-15T14:13:45.816Z",
                     "old_state": "CHANNELD_SHUTTING_DOWN",
                     "new_state": "AWAITING_UNILATERAL",
                     "cause": "user",
                     "message": "Forcibly closed by `close` command timeout"
                  }
               ],
               "status": [
                  "CHANNELD_AWAITING_LOCKIN:Attempting to reconnect"
               ],
               "in_payments_offered": 0,
               "in_msatoshi_offered": 0,
               "in_offered_msat": "0msat",
               "in_payments_fulfilled": 0,
               "in_msatoshi_fulfilled": 0,
               "in_fulfilled_msat": "0msat",
               "out_payments_offered": 0,
               "out_msatoshi_offered": 0,
               "out_offered_msat": "0msat",
               "out_payments_fulfilled": 0,
               "out_msatoshi_fulfilled": 0,
               "out_fulfilled_msat": "0msat",
               "htlcs": []
            }
         ]
      },

Somehow it's both AWAITING_UNILATERAL and also CHANNELD_AWAITING_LOCKIN, huh?..

The funding TX is not on-chain — absent from mempool as well.

I did manually close it (since it wasn't functioning anyway), however the 144 blocks our_to_self_delay has passed, like, days ago if I'm not mistaken.

Same question, what to make of it? Besides blaming LDK, of course :smiley:

Basically, what the user is supposed to do in such a scenario? I guess it's not hard at all to replicate whatever mistakes I did — and end up with similar result. Locked-in liquidity. Any "escape hatches" to accomodate bugs like this, dev-force-forget-channel-unsafe maybe?

Steps to Reproduce

Unknown yet.

I don't know what to look for when minimizing. Somehow posting the complete history of the node feels both inappropriate and involving tedium.

Inquiries for more information welcome, if any.

getinfo

{
   "id": "023b4f129dd86d14c977f0d3f2819fec1065ec0c0a1498c7b20f207a31d7866b98",
   "alias": "redacted",
   "color": "023b4f",
   "num_peers": 5,
   "num_pending_channels": 1,
   "num_active_channels": 4,
   "num_inactive_channels": 0,
   "address": [],
   "binding": [
      {
         "type": "ipv6",
         "address": "::",
         "port": 9735
      },
      {
         "type": "ipv4",
         "address": "0.0.0.0",
         "port": 9735
      }
   ],
   "version": "v0.10.0-1-gf82f397",
   "blockheight": 2099628,
   "network": "testnet",
   "msatoshi_fees_collected": 0,
   "fees_collected_msat": "0msat",
   "lightning-dir": "/lightning/.lightning/testnet"
}
vincenzopalazzo commented 3 years ago

I didn't read in deep yet, I think you can split the issue in two, they seem unrelated.

ulidtko commented 3 years ago

@vincenzopalazzo I care much less about the second issue. Only mentioning it due to potential relevance to the first.

The central issue is not being able to open a channel — for unclear reason; I don't understand why. There's enough coins, there's tcp connectivity... There's also this pretty scary-sounding error Script failed an OP_EQUALVERIFY operation. Where does it come from?

vincenzopalazzo commented 3 years ago

The central issue is not being able to open a channel — for unclear reason; I don't understand why. There's enough coins, there's tcp connectivity...

It is not an issue related to the lightning protocol, but it is to the bitcoin side, I other words you are not able to sign the transaction for the problem that you have here Script failed an OP_EQUALVERIFY. What it the reason? Mh unclear now with the data that you give, for example, what it the version of bitcoin that you are using?

ulidtko commented 3 years ago

@vincenzopalazzo bitcoin-core 0.21

$ bitcoin-cli -testnet  --version
Bitcoin Core RPC client version v0.21.0
ulidtko commented 3 years ago

There's one segwit UTXO in the bitcoin node:

  {
    "txid": "cde023e34c5d3ed99a01834f4207d3d28f9f2774cc9d9c3802f20bca93945160",
    "vout": 1,
    "address": "tb1qhn23p2dj8vlp9zdv33n708w8vzn8lwueh6j6y9",
    "scriptPubKey": "0014bcd510a9b23b3e1289ac8c67e79dc760a67fbb99",
    "amount": 0.00054156,
    "confirmations": 4124,
    "spendable": true,
    "solvable": true,
    "desc": "wpkh([f90d9391/0'/1'/3']02b3882c013daca7c715fdebce8516700fbf5c9b0c4cdf0cee4880c5475a95442e)#susphcft",
    "safe": true
  }

There's more on-chain UTXO's in c-lightningd wallet. Should I thoroughly inspect all of them one-by-one? What to look for? Any test I can perform? Some way to replicate the error from bitcoin node side?

My shallow knowledge about bitcoin tx script lets me interpret Script failed an OP_EQUALVERIFY operation as a pubkey hash mismatch in a lightning-constructed output script — right?

Why could that be, I wonder?

vincenzopalazzo commented 3 years ago

There's one segwit UTXO in the bitcoin node:

Bitcoin core wallet and c-lightning wallet are different wallets, and they don't share any information. Bitcoin core is used only to talk with the network and nothing else (push transaction and get bitcoin network info like fee rate).

My shallow knowledge about bitcoin tx script lets me interpret Script failed an OP_EQUALVERIFY operation as a pubkey hash mismatch in a lightning-constructed output script — right?

Right, and the thing that jumps to my mind is that there is something wrong with your hsm_secret, do you make any "custom" operation with it? usually the sign operation it is generated under the hood with the libwally-core, and the wallet information are stored inside the file hsm_secret, so maybe there is some problem with it caused by some custom operation?

ulidtko commented 3 years ago

there is something wrong with your hsm_secret, do you make any "custom" operation with it?

No... but good idea still, I suspect the hsm_secret may've been overwritten by imprecise deployment orchestration. I'm still checking this possibility.

vincenzopalazzo commented 3 years ago

No... but the good idea still, I suspect the hsm_secret may've been overwritten by imprecise deployment orchestration.

Mh, this is a really good thing to know, because there are not a very strange things that can happen from a c-lightning point of view to be the cause of the error.

niftynei commented 3 years ago

What could be a reason, and what should I do?

My shallow knowledge about bitcoin tx script lets me interpret Script failed an OP_EQUALVERIFY operation as a pubkey hash mismatch in a lightning-constructed output script — right?

Your assessment here is correct. Two of the three inputs to the funding transaction (the one you're failing to broadcast) have different pubkeys associated with them than the ones your node is providing as part of the signature.

They're failing the pubkey hash check.

First input pubkey: 025fa4d7187714ca781e69c556f7b2811a54c1ea81a82679680e796ed0134b22d8 ripemd160(sha256(pubkey)): 9a9fdeee97b2d292d4de32a4664638f261f54788 Transaction output it's spending: OP_0 OP_PUSHBYTES_20 1547ebd369d628ef9cf7ebeb5b716f5ca3cdf42f

(Original tx found here)

Third input pubkey: 0399594eebce15409f86224dc4f37607adb42436db68503cb3b0fc868f743b4d9e ripemd160(sha256(pubkey)): 3118d32d47568227c08b52dfc39be2614d54684e Tx output it's spending: OP_0 OP_PUSHBYTES_20 8e62f481182d5ed5c4c3b51e08724fa5d7b5963a

(Original tx found here)

The one in the middle looks fine.

If your hsm_secret got corrupted, that'd explain this issue, as the pubkeys are typically found by an index off the hsm_secret.

niftynei commented 3 years ago

What does your lightning-cli listfunds show?