ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

Core lightning crashes with FATAL SIGNAL 11 #7645

Closed ShahanaFarooqui closed 1 week ago

ShahanaFarooqui commented 2 weeks ago

Core Lightning v24.08 crashes with FATAL SIGNAL 11 if a channel's alias is missing.

References: Crash-1 Crash-2

cdecker commented 2 weeks ago

Looks like the culprit is *channel->alias[LOCAL]. Are we missing an alias at this time? We should be filling them in if they are missing, but maybe this is a channel loaded from the DB and predates aliases?
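
To illustrate the failure mode described here: if the pointer behind `*channel->alias[LOCAL]` is NULL, reading through it typically kills the process with SIGSEGV (signal 11). The following is a minimal sketch of a guarded read. It assumes a simplified stand-in for the channel structure; every `demo_` name is hypothetical, and this is neither Core Lightning's real definition nor its actual fix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; these are NOT the
 * real Core Lightning types. */
enum demo_side { DEMO_LOCAL, DEMO_REMOTE, DEMO_NUM_SIDES };

struct demo_channel {
	/* Each alias is optional: NULL means "never set", e.g. for a
	 * channel record that predates aliases. */
	const uint64_t *alias[DEMO_NUM_SIDES];
};

/* Copy the local alias into *out and return true, or return false if
 * it is missing. An unguarded `*chan->alias[DEMO_LOCAL]` would
 * dereference NULL in that case. */
static bool demo_get_local_alias(const struct demo_channel *chan,
				 uint64_t *out)
{
	if (chan == NULL || chan->alias[DEMO_LOCAL] == NULL)
		return false;
	*out = *chan->alias[DEMO_LOCAL];
	return true;
}
```

A caller would check the return value and either skip the channel or fill in a fresh alias, rather than assuming one is always present.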

surfac3 commented 1 week ago

If you need any data or logs, tell me what to run and I'll post the output here.

I am the one who experienced the issue. Some of those channels may be two years old.

I have tried:
- reinitiating the backup - no change
- removing gossip.store - no help
- a clean reflash of the RaspiBlitz firmware - no help

I tried to run sudo journalctl -fu clrest if that can help:

Sep 08 21:41:46 raspberrypi node[3123160]: at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
Sep 08 21:41:46 raspberrypi node[3123160]: errno: -111,
Sep 08 21:41:46 raspberrypi node[3123160]: code: 'ECONNREFUSED',
Sep 08 21:41:46 raspberrypi node[3123160]: syscall: 'connect',
Sep 08 21:41:46 raspberrypi node[3123160]: address: '/home/bitcoin/.lightning/bitcoin/lightning-rpc'
Sep 08 21:41:46 raspberrypi node[3123160]: }
Sep 08 21:41:46 raspberrypi node[3123160]: Node.js v20.5.1
Sep 08 21:41:46 raspberrypi systemd[1]: clrest.service: Main process exited, code=exited, status=1/FAILURE
Sep 08 21:41:46 raspberrypi systemd[1]: clrest.service: Failed with result 'exit-code'.
Sep 08 21:41:46 raspberrypi systemd[1]: clrest.service: Consumed 2.715s CPU time.
Sep 08 21:42:16 raspberrypi systemd[1]: clrest.service: Scheduled restart job, restart counter is at 10.
Sep 08 21:42:16 raspberrypi systemd[1]: Stopped clrest.service - c-lightning-REST daemon for mainnet.
Sep 08 21:42:16 raspberrypi systemd[1]: clrest.service: Consumed 2.715s CPU time.
Sep 08 21:42:16 raspberrypi systemd[1]: Started clrest.service - c-lightning-REST daemon for mainnet.
Sep 08 21:42:18 raspberrypi node[3126884]: node:events:492
Sep 08 21:42:18 raspberrypi node[3126884]: throw er; // Unhandled 'error' event
Sep 08 21:42:18 raspberrypi node[3126884]: ^
Sep 08 21:42:18 raspberrypi node[3126884]: Error: connect ECONNREFUSED /home/bitcoin/.lightning/bitcoin/lightning-rpc
Sep 08 21:42:18 raspberrypi node[3126884]: at PipeConnectWrap.afterConnect [as oncomplete] (node:net:1595:16)
Sep 08 21:42:18 raspberrypi node[3126884]: Emitted 'error' event on LightningClient instance at:
Sep 08 21:42:18 raspberrypi node[3126884]: at Socket.<anonymous> (/home/bitcoin/c-lightning-REST/bitcoin/lightning-client-js.js:80:23)
Sep 08 21:42:18 raspberrypi node[3126884]: at Socket.emit (node:events:514:28)
Sep 08 21:42:18 raspberrypi node[3126884]: at emitErrorNT (node:internal/streams/destroy:151:8)
Sep 08 21:42:18 raspberrypi node[3126884]: at emitErrorCloseNT (node:internal/streams/destroy:116:3)
Sep 08 21:42:18 raspberrypi node[3126884]: at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
Sep 08 21:42:18 raspberrypi node[3126884]: errno: -111,
Sep 08 21:42:18 raspberrypi node[3126884]: code: 'ECONNREFUSED',
Sep 08 21:42:18 raspberrypi node[3126884]: syscall: 'connect',
Sep 08 21:42:18 raspberrypi node[3126884]: address: '/home/bitcoin/.lightning/bitcoin/lightning-rpc'
Sep 08 21:42:18 raspberrypi node[3126884]: }
Sep 08 21:42:18 raspberrypi node[3126884]: Node.js v20.5.1
Sep 08 21:42:18 raspberrypi systemd[1]: clrest.service: Main process exited, code=exited, status=1/FAILURE
Sep 08 21:42:18 raspberrypi systemd[1]: clrest.service: Failed with result 'exit-code'.
Sep 08 21:42:18 raspberrypi systemd[1]: clrest.service: Consumed 2.092s CPU time.

Btw, isn't this a very similar issue?
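
The clrest log above shows only that nothing was accepting connections on the lightning-rpc socket (ECONNREFUSED). That is consistent with lightningd itself being down, and clrest restarting in a loop as a consequence. One rough way to check this directly is sketched below with plain POSIX sockets. `demo_probe_unix_socket` is a hypothetical helper written for this illustration and is not part of any Core Lightning or c-lightning-REST tooling.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Try to connect to a Unix domain socket at `path`.
 * Returns 0 if something accepted the connection, otherwise the errno
 * of the failing call (e.g. ECONNREFUSED if the socket file exists but
 * no process is listening, ENOENT if the file is absent). */
int demo_probe_unix_socket(const char *path)
{
	struct sockaddr_un addr;
	int fd, err = 0;

	/* sun_path is a small fixed-size buffer; refuse paths that
	 * would not fit, including the terminating NUL. */
	if (strlen(path) >= sizeof(addr.sun_path))
		return ENAMETOOLONG;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return errno;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	/* Save errno before close(), which may overwrite it. */
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		err = errno;

	close(fd);
	return err;
}
```

Called with the path from the log, /home/bitcoin/.lightning/bitcoin/lightning-rpc, a result of ECONNREFUSED would match what clrest reported, while 0 would mean the daemon is up and the problem lies elsewhere.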

surfac3 commented 1 week ago

I tried to update to latest master - CLN v24.08-10-g78b9ccf - https://github.com/ElementsProject/lightning/pull/7650

but it did not help

rustyrussell commented 1 week ago

This is a different crash. We are somehow not setting the local alias, presumably a historical channel.

whitslack commented 1 week ago

@rustyrussell: How could this have happened if the database migration of b5bd9072457b6e6a93687f40064859bdbdaac347 had executed?

surfac3 commented 1 week ago

I have pulled the latest master and the fix works!

Successfully pulled CLN v24.08-13-g5bd3d51 and it purrs like a kitten.