ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

crashed lightningd #3915

Closed thestick613 closed 4 years ago

thestick613 commented 4 years ago

Attempted an MPP payment, which stopped responding after 5 minutes, so I pressed Ctrl+C on the pay command. I then attempted to close a channel, which output "cannot co-op close channel w/ active htlcs" in the log. Then I ran listpays to see the status of the MPP payments, which resulted in this crash:

pay: FATAL SIGNAL 11 (version v0.9.0-44-g01a82d3)
0x4183e7 send_backtrace
        common/daemon.c:38
0x418471 crashdump
        common/daemon.c:51
0x7f70415144af ???
        ???:0
0x7f7041568d26 ???
        ???:0
0x403a68 attempt_ongoing
        plugins/pay.c:1638
0x403e8b listsendpays_done
        plugins/pay.c:1820
0x4092ba handle_rpc_reply
        plugins/libplugin.c:551
0x40938f rpc_read_response_one
        plugins/libplugin.c:666
0x409417 rpc_conn_read_response
        plugins/libplugin.c:685
0x42729a next_plan
        ccan/ccan/io/io.c:59
0x42775a do_plan
        ccan/ccan/io/io.c:407
0x427785 io_ready
        ccan/ccan/io/io.c:417
0x428f20 io_loop
        ccan/ccan/io/poll.c:445
0x409ee2 plugin_main
        plugins/libplugin.c:1303
0x407188 main
        plugins/pay.c:2063
0x7f70414ff82f ???
        ???:0
0x402e08 ???
        ???:0
0xffffffffffffffff ???
        ???:0
2020-08-09T11:32:04.220Z INFO plugin-pay: Killing plugin: Plugin exited before completing handshake.
2020-08-09T11:32:04.221Z **BROKEN** plugin-pay: Plugin marked as important, shutting down lightningd!
2020-08-09T11:32:04.271Z INFO plugin-autoclean: Killing plugin: Plugin exited before completing handshake.
2020-08-09T11:32:04.273Z INFO plugin-bcli: Killing plugin: Plugin exited before completing handshake.
2020-08-09T11:32:04.273Z INFO plugin-fundchannel: Killing plugin: Plugin exited before completing handshake.
2020-08-09T11:32:04.274Z INFO plugin-keysend: Killing plugin: Plugin exited before completing handshake.
2020-08-09T11:32:07.170Z INFO plugin-bcli: bitcoin-cli initialized and connected to bitcoind.

vincenzopalazzo commented 4 years ago

Which version did you compile from source? Can you share it?

thestick613 commented 4 years ago

Latest from git: "version": "v0.9.0-44-g01a82d3",

vincenzopalazzo commented 4 years ago

@thestick613 In my opinion, this looks like an event outside the normal flow: you stopped the node with Ctrl+C while the payment was being split. A forced stop could interrupt an assignment that always completes in the normal flow, so the code never checks whether that variable is NULL. Still, I am interested in understanding what happened inside your node.

From your issue I see that listsendpays works fine. Could you share the output of that command, if you can?

I want to understand the splitting status of the payment.

thestick613 commented 4 years ago

I didn't stop the node with Ctrl+C, only the lightning-cli pay command, as it had been running for more than 10 minutes and the lightningd log had stopped producing any output.

vincenzopalazzo commented 4 years ago

@thestick613 oops, my fault. Anyway, if you share your full or partial output, I or someone else here can try to reproduce what lightningd did and check whether some property was left with an undefined value.

From your message here I understand that your split is very big.

Most of them reverted. I had over 100 pending htlcs, now there are only 7.

Maybe some condition makes a delay in pay normal. I think @cdecker knows the command implementation very well and can say better than I can whether some condition made the pay command slow.

In addition, if you can share an example of your listsendpays output, I can try to understand what happened inside listpays, because listpays condenses the parts of MPP payments before returning the response.

vincenzopalazzo commented 4 years ago

In addition, from @ZmnSCPxj's message in issue #3925:

Note that ^C on lightning-cli will not cancel the closure

In my understanding, stopping the lightning-cli interface does not stop lightningd during the operation (in this case, the pay operation). That means you could have run listpays while the pay operation was still in progress, which is outside lightningd's normal flow because some payments may still be incomplete.

In my opinion, this could be a possible explanation for your stack trace.

cdecker commented 4 years ago

For reference this is the code that is crashing (first line in the loop):

https://github.com/ElementsProject/lightning/blob/af4eec7b66d650cba53affeb8675606fdf847e6c/plugins/pay.c#L1637-L1643

From this it seems that either root or b11 is NULL (strcmp may crash if we pass it a NULL). root seems strange since we'd have to have added a NULL to the list which would crash when adding afaik.

@thestick613 are you calling listpays with a bolt11 to filter by?
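
To illustrate the failure mode described above: passing NULL to strcmp is undefined behavior in C and commonly shows up as a SIGSEGV like the one in the backtrace. Below is a minimal sketch of a NULL-safe comparison inside a lookup loop. It is not the code from plugins/pay.c; `struct pay_entry`, `streq_nullable`, and `find_by_b11` are hypothetical names used only for illustration, and the idea that an entry might carry a NULL `b11` is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one payment entry; NOT the real pay.c struct. */
struct pay_entry {
	const char *b11; /* assumption: may be NULL if no invoice string is stored */
};

/* NULL-safe string equality: two NULLs compare equal,
 * NULL never equals a non-NULL string, and strcmp is only
 * called when both pointers are valid. */
static bool streq_nullable(const char *a, const char *b)
{
	if (a == NULL || b == NULL)
		return a == b;
	return strcmp(a, b) == 0;
}

/* Return the index of the first entry whose b11 matches, or -1 if none. */
static int find_by_b11(const struct pay_entry *entries, size_t n,
		       const char *b11)
{
	for (size_t i = 0; i < n; i++) {
		if (streq_nullable(entries[i].b11, b11))
			return (int)i;
	}
	return -1;
}
```

With a guard like this, a list containing an entry with a NULL `b11` can be searched without crashing, whichever side of the comparison turns out to be NULL.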