ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

plugin-pay: crashed during normal operation #7092

Closed ksedgwic closed 7 months ago

ksedgwic commented 8 months ago

Issue and Steps to Reproduce

Observed an unexpected restart, from the log:

Fri 2024-02-16 02:27:45 PST home4 lightningd[95039]: 2024-02-16T10:27:45.909Z INFO    plugin-pay: Killing plugin: exited during normal operation
Fri 2024-02-16 02:27:45 PST home4 lightningd[95039]: 2024-02-16T10:27:45.909Z **BROKEN** plugin-pay: Plugin marked as important, shutting down lightningd!

Here is the log for the preceding minute. It was gathered using journalctl, so it should include all system services ... journalctl.log

Installed versions:

vls-hsmd (vls-v0.11.0-rc.1-12-g8c09e40-dirty)
 1be7ec79e99974e88a0388aee0278568791a2e41 lightning (v23.11-16-g1be7ec79e)
 96110d250a3cd6e456bcdf0d6fe5bfd3d7a62b6b vls (v0.11.0-rc.1-35-g96110d25)

The lightning version is v23.11 plus 16 VLS mods (mostly hacks to allow integration tests to work w/ VLS)

cdecker commented 8 months ago

Hm, very interesting. The logs do not show any backtrace, which I would have expected in the journald logs. Do you happen to have a core file somewhere on that system that might tell us more about what happened?

ksedgwic commented 8 months ago

I do not see crash logs associated with this event, though I do see them in other cases.

ksedgwic commented 8 months ago

Saw it again; the log covers 5 minutes before and 5 minutes after: 2024-02-21-home4-journalctl.log

And no crash logs
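For anyone wanting to cut a similar window from their own journal, here is a minimal sketch. It assumes GNU `date` (for `-d`) and that the timestamp is given in the machine's local timezone. The helper name `journal_window` is made up for illustration; `--since` and `--until` are standard journalctl options.

```shell
# Print a journalctl command covering MINUTES before and after a timestamp.
# Assumption: GNU date is available; the timestamp is read in the local timezone.
# journal_window is a hypothetical helper, not part of any tool in this thread.
journal_window() {
    ts="$1"                  # e.g. "2024-02-16 02:27:45"
    minutes="${2:-5}"        # half-width of the window, defaults to 5 minutes
    center=$(date -d "$ts" +%s) || return 1
    span=$((minutes * 60))
    since=$(date -d "@$((center - span))" '+%Y-%m-%d %H:%M:%S')
    until_ts=$(date -d "@$((center + span))" '+%Y-%m-%d %H:%M:%S')
    printf 'journalctl --since "%s" --until "%s"\n' "$since" "$until_ts"
}
```

Calling `journal_window "2024-02-16 02:27:45" 5` prints the command rather than running it, so it can be checked and redirected to a file before use.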

cdecker commented 8 months ago

If you set --log-level to io, you should be able to see all messages being sent between the various processes. I wonder what the last thing coming into the pay plugin was. It might be malformed somehow, and cause a crash that way?
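As a rough sketch of that workflow: restart the node with io-level logging, wait for the crash, then pull the last lines that mention the pay plugin out of the saved log. The helper name `last_pay_lines` and the file name `journalctl.log` are placeholders, and matching on the literal string `plugin-pay` assumes the io-level lines carry the same prefix as the INFO and BROKEN lines in the report.

```shell
# Step 1 (not run here): restart lightningd with io-level logging, e.g.
#   lightningd --log-level=io        # plus your usual options
#
# Step 2: after the next crash, show the last N lines mentioning plugin-pay.
# last_pay_lines is a hypothetical helper; usage: last_pay_lines LOGFILE [N]
last_pay_lines() {
    logfile="$1"
    count="${2:-20}"         # number of trailing lines to keep, defaults to 20
    # -F matches the literal string rather than treating it as a regex
    grep -F 'plugin-pay' "$logfile" | tail -n "$count"
}
```

For example, `last_pay_lines journalctl.log 50` would show the 50 most recent plugin-pay lines, ending with whatever was logged just before the "exited during normal operation" message.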

king-11 commented 8 months ago

I have a core dump for a similar crash, but I might need some time to generate the backtrace from it. If it would be helpful, I can provide the core dump for now.

vincenzopalazzo commented 8 months ago

> I have a core dump for a similar crash, but I might need some time to generate the backtrace from it. If it would be helpful, I can provide the core dump for now.

Yes, this will be very helpful, thanks.

king-11 commented 8 months ago

docker container exec lightningd-test gdb --batch -ex "file /usr/libexec/c-lightning/plugins/pay" -ex "core-file /home/lightning/.lightning/testnet/core" -ex "bt"

[New LWP 22]
warning: Section `.reg-xstate/22' in core file too small.
Core was generated by `/usr/libexec/c-lightning/plugins/pay'.
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Section `.reg-xstate/22' in core file too small.
#0  0x00007f4d3dcdee1f in ?? () from /lib/ld-musl-x86_64.so.1
#0  0x00007f4d3dcdee1f in ?? () from /lib/ld-musl-x86_64.so.1
Backtrace stopped: Cannot access memory at address 0x7ffd18c38ff0

The container uses an Alpine Docker image; I'm not sure whether that bothers gdb.

ksedgwic commented 8 months ago

https://chat.openai.com/share/4c7c07ee-60d4-4146-8e88-cd17f66d9052

ksedgwic commented 8 months ago

Got a backtrace! plugin-pay-backtrace.txt

Hopefully the relevant state files: plugin-pay-state.tar.gz