Closed by JssDWt 1 month ago
Another gossip crash report from v24.08.1: https://github.com/ElementsProject/lightning/pull/7685#issuecomment-2379521485
Reported on Telegram by +steepdawn974:
...
2024-10-02T08:58:33.820Z INFO plugin-bcli: bitcoin-cli initialized and connected to bitcoind.
2024-10-02T08:58:43.407Z **BROKEN** gossipd: gossip_store: checksum verification failed? 32536bf2 should be 67132a62 (offset 3972). Moving to gossip_store.corrupt and truncating
2024-10-02T08:58:43.407Z UNUSUAL 025651f2193a89a44a80d833f0a82da668a3af8438eff2e9633fabb3f6a3748be6-chan#15523: gossipd lost track of announced channel: re-announcing!
2024-10-02T08:58:43.408Z UNUSUAL 02d96eadea3d780104449aca5c93461ce67c1564e2e1d73225fa67dd3b997a6018-chan#15522: gossipd lost track of announced channel: re-announcing!
2024-10-02T08:58:43.408Z UNUSUAL 024a8228d764091fce2ed67e1a7404f83e38ea3c7cb42030a2789e73cf3b341365-chan#15524: gossipd lost track of announced channel: re-announcing!
2024-10-02T08:58:43.464Z INFO plugin-clnrest: REST server running at https://127.0.0.1:3010
2024-10-02T08:58:43.548Z INFO lightningd: --------------------------------------------------
2024-10-02T08:58:43.548Z INFO lightningd: Server started with public key xxxxx, alias xxxxx (color #0362df) and lightningd v24.08
2024-10-02T08:59:37.638Z UNUSUAL lightningd: Bad gossip order: could not find channel 9999999x475x0 for peer's channel update
2024-10-02T09:02:32.335Z **BROKEN** gossipd: Dying channel 863308x1674x0 already deleted?
2024-10-02T09:02:32.335Z **BROKEN** gossipd: gossip_store: bad checksum offset 451: (version v24.08)
2024-10-02T09:02:32.335Z **BROKEN** gossipd: backtrace: common/daemon.c:38 (send_backtrace) 0x55793fd3051b
2024-10-02T09:02:32.335Z **BROKEN** gossipd: backtrace: common/status.c:221 (status_failed) 0x55793fd39bac
2024-10-02T09:02:32.335Z **BROKEN** gossipd: backtrace: gossipd/gossip_store.c:480 (gossip_store_get_with_hdr) 0x55793fd27d90
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: gossipd/gossip_store.c:491 (check_msg_type) 0x55793fd27dbe
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: gossipd/gossip_store.c:509 (gossip_store_set_flag) 0x55793fd27f41
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: gossipd/gossip_store.c:561 (gossip_store_del) 0x55793fd28187
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: gossipd/gossmap_manage.c:1216 (gossmap_manage_new_block) 0x55793fd2a82f
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: gossipd/gossipd.c:477 (new_blockheight) 0x55793fd260ff
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: gossipd/gossipd.c:588 (recv_req) 0x55793fd26529
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: common/daemon_conn.c:35 (handle_read) 0x55793fd307c6
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: ccan/ccan/io/io.c:60 (next_plan) 0x55793fdc0056
2024-10-02T09:02:32.336Z **BROKEN** gossipd: backtrace: ccan/ccan/io/io.c:422 (do_plan) 0x55793fdc04e1
2024-10-02T09:02:32.337Z **BROKEN** gossipd: backtrace: ccan/ccan/io/io.c:439 (io_ready) 0x55793fdc059a
2024-10-02T09:02:32.337Z **BROKEN** gossipd: backtrace: ccan/ccan/io/poll.c:455 (io_loop) 0x55793fdc1ee7
2024-10-02T09:02:32.337Z **BROKEN** gossipd: backtrace: gossipd/gossipd.c:672 (main) 0x55793fd26ead
2024-10-02T09:02:32.337Z **BROKEN** gossipd: backtrace: (null):0 ((null)) 0x7f4cb5aa1d09
2024-10-02T09:02:32.337Z **BROKEN** gossipd: backtrace: (null):0 ((null)) 0x55793fd23d29
2024-10-02T09:02:32.337Z **BROKEN** gossipd: backtrace: (null):0 ((null)) 0xffffffffffffffff
2024-10-02T09:02:32.337Z **BROKEN** gossipd: STATUS_FAIL_INTERNAL_ERROR: gossip_store: bad checksum offset 451:
This was on v24.08
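For readers unfamiliar with the first BROKEN line in the log above (checksum mismatch at an offset, followed by truncation), here is a minimal sketch of that general pattern. Everything in it is an assumption made for illustration: the header layout `HDR`, the helper names, and the use of `zlib.crc32` seeded with the timestamp are hypothetical stand-ins and do not reproduce the real gossip_store on-disk format or gossipd's code.

```python
import struct
import zlib

# Hypothetical record header (NOT the real gossip_store layout):
# flags (u16), payload length (u16), checksum (u32), timestamp (u32), big-endian.
HDR = struct.Struct(">HHII")


def make_record(payload: bytes, timestamp: int) -> bytes:
    """Build one record whose checksum covers the payload, seeded by the timestamp."""
    crc = zlib.crc32(payload, timestamp)
    return HDR.pack(0, len(payload), crc, timestamp) + payload


def first_bad_offset(store: bytes, start: int = 0):
    """Return the offset of the first record that fails verification, or None."""
    offset = start
    while offset < len(store):
        if offset + HDR.size > len(store):
            return offset  # header itself is cut short
        _flags, length, crc, timestamp = HDR.unpack_from(store, offset)
        body_start = offset + HDR.size
        payload = store[body_start:body_start + length]
        if len(payload) != length or zlib.crc32(payload, timestamp) != crc:
            return offset  # short payload or checksum mismatch
        offset = body_start + length
    return None


def truncate_at_first_bad(store: bytes) -> bytes:
    """Keep only the records that precede the first bad one."""
    bad = first_bad_offset(store)
    return store if bad is None else store[:bad]
```

In this sketch a single flipped byte in the second record makes `first_bad_offset` return the offset where that record starts, and `truncate_at_first_bad` keeps only the first record. Everything after the bad record is lost, which fits the "lost track of announced channel: re-announcing!" lines that follow the truncation in the log.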
Wow, this is completely broken. Is this some weird OS? You seem to be getting bad checksums all the time...
Also, please can you send me gossip_store.corrupt?
We use 32-bit file offsets, but since we stopped filtering gossip spam, the store can grow much larger. I suspect this is causing all kinds of weirdness.
The workaround is to restart (which compacts the gossip store), but I'll simply switch to 64-bit offsets for the point release.
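To illustrate why 32-bit offsets break down once the store grows large, here is a small arithmetic sketch. It assumes the offsets are unsigned and that narrowing keeps only the low 32 bits; the helper names are hypothetical, and this is not a claim about which exact code path in gossipd went wrong.

```python
U32_LIMIT = 1 << 32  # 4 GiB: the first offset that does not fit in 32 bits


def fits_u32(offset: int) -> bool:
    """True if the offset can be represented as an unsigned 32-bit integer."""
    return 0 <= offset < U32_LIMIT


def as_u32(offset: int) -> int:
    """Keep only the low 32 bits, as narrowing to an unsigned 32-bit value would."""
    return offset & 0xFFFFFFFF


# Assumption: the "18GB" store size reported in this thread means 18 GiB.
store_size = 18 * 1024**3

print(fits_u32(store_size))        # False
print(as_u32(store_size))          # 2147483648, i.e. 2 GiB
print(as_u32(U32_LIMIT + 451))     # 451
```

Under these assumptions, any record past the 4 GiB mark gets an offset that aliases to a much earlier position in the file, so a lookup lands on unrelated bytes and the checksum comparison fails. The value 451 in the last line is used only because it appears in the log; the sketch does not show that this particular offset was a wrapped one.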
The crash was observed on this branch: https://github.com/breez/lightning/tree/cln-v24.08-breez with commit https://github.com/breez/lightning/commit/bc9e4f56c324216f5f0f15be07f6ad4f9a46e597
The branch contains the following changes on top of v24.08:
https://github.com/ElementsProject/lightning/pull/7628
https://github.com/ElementsProject/lightning/pull/7611
https://github.com/ElementsProject/lightning/pull/7636
But I don't think they are related to the crash.
Notably, the gossip store file was 18 GB.