nervosnetwork / ckb-light-client

CKB light client reference implementation
MIT License

Panic received: "long fork detected". #147

Closed jordanmack closed 8 months ago

jordanmack commented 1 year ago

I received the following error on v0.2.4 when running normally. The node has always been v0.2.4 and there has been no previous node version for the storage used. It was running without issue for several weeks. Below is the error message. A longer scrollback history will also be added.

[2023-06-25T07:19:43Z WARN  ckb_light_client::protocols::light_client] long fork detected
[2023-06-25T07:19:47Z ERROR ckb_light_client::protocols::light_client::components::send_last_state_proof] Long fork detected, please check if ckb-light-client is connected to the same network ckb node. If you connected ckb-light-client to a dev chain for testing purpose you should remove the storage of ckb-light-client to recover.
thread 'GlobalRt-3' panicked at 'long fork detected', src/protocols/light_client/components/send_last_state_proof.rs:275:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2023-06-25T07:19:47Z INFO  ckb_network::network] Ban peer "/ip4/35.176.207.239/tcp/8111/p2p/QmSJTsMsMGBjzv1oBNwQU36VhQRxc2WQpFoRu1ZifYKrjZ" for 300 seconds, reason: protocol ProtocolId(120) panic when process peer message
[2023-06-25T07:19:47Z INFO  ckb_network::network] p2p service event: ListenClose { address: "/ip4/0.0.0.0/tcp/18111" }
[2023-06-25T07:19:47Z INFO  ckb_light_client::protocols::synchronizer] SyncProtocol.disconnected peer=SessionId(1)
[2023-06-25T07:19:47Z INFO  ckb_light_client::protocols::synchronizer] SyncProtocol.disconnected peer=SessionId(68425)
[2023-06-25T07:19:47Z INFO  ckb_light_client::protocols::synchronizer] SyncProtocol.disconnected peer=SessionId(68698)
[2023-06-25T07:19:47Z INFO  ckb_light_client::protocols::synchronizer] SyncProtocol.disconnected peer=SessionId(68236)
[2023-06-25T07:19:47Z INFO  ckb_light_client::protocols::synchronizer] SyncProtocol.disconnected peer=SessionId(68461)
[2023-06-25T07:19:47Z INFO  ckb_light_client::protocols::synchronizer] SyncProtocol.disconnected peer=SessionId(68244)
[2023-06-25T07:19:47Z INFO  ckb_light_client::protocols::synchronizer] SyncProtocol.disconnected peer=SessionId(1493)
[2023-06-25T07:19:47Z INFO  ckb_light_client::protocols::synchronizer] SyncProtocol.disconnected peer=SessionId(68700)
[2023-06-25T07:19:47Z INFO  ckb_stop_handler] StopHandler(network) send signal
[2023-06-25T07:19:47Z INFO  ckb_light_client] Done.
jordanmack commented 1 year ago

Scrollback history: scrollback.txt

yangby-cryptape commented 1 year ago

What is "long fork"?

For safety, the CKB light client will NOT roll back automatically if a fork is more than 100 blocks long.

So the client panics when it detects a better fork that is more than 100 blocks long, and users have to handle the situation themselves.
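
The rule above can be sketched roughly as follows. This is a minimal illustration, not the actual ckb-light-client code: the names `handle_fork`, `ForkAction`, and `MAX_ROLLBACK_BLOCKS` are hypothetical, and only the 100-block limit comes from the description above.

```rust
/// Maximum fork depth that is rolled back automatically.
/// The value 100 comes from the explanation above; the name is hypothetical.
const MAX_ROLLBACK_BLOCKS: u64 = 100;

/// What to do after a better chain has been detected (hypothetical type).
#[derive(Debug, PartialEq)]
enum ForkAction {
    /// The fork is short enough to roll back to the fork point.
    Rollback { to_block: u64 },
    /// The fork is too long; the user has to intervene.
    LongFork,
}

/// Decide how to react when a better chain diverges from the local chain.
/// `local_tip` and `fork_point` are block numbers.
fn handle_fork(local_tip: u64, fork_point: u64) -> ForkAction {
    // Number of local blocks that would have to be discarded.
    let depth = local_tip.saturating_sub(fork_point);
    if depth > MAX_ROLLBACK_BLOCKS {
        ForkAction::LongFork
    } else {
        ForkAction::Rollback { to_block: fork_point }
    }
}
```

In this sketch, a fork of exactly 100 blocks is still rolled back, and only a deeper one yields `ForkAction::LongFork`, which is the case where the real client panics.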

Why "long fork detected"?

In CKB binaries from version 0.111.0-rc1 through 0.111.0-rc5, a launch time for Edition CKB2023 on testnet was set:

Epoch 6765 was estimated to start at 2023-06-25 06:50 UTC, and it actually started at 2023-06-25 07:01:21 UTC.

The 100th block in epoch 6765 was mined at 2023-06-25 07:12:55 UTC.

In your logs, "long fork detected" occurred at 2023-06-25 07:19:47.
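
As a quick check of this timeline, the gap between the 100th block of epoch 6765 (07:12:55) and the panic in the log (07:19:47), both on 2023-06-25, can be computed with a small helper. The function names are hypothetical and the sketch only handles two times on the same day.

```rust
/// Convert an "HH:MM:SS" time of day into seconds since midnight.
/// Returns `None` if the input is malformed or out of range.
fn seconds_of_day(hms: &str) -> Option<u32> {
    let mut parts = hms.split(':').map(|p| p.parse::<u32>().ok());
    let h = parts.next()??;
    let m = parts.next()??;
    let s = parts.next()??;
    // Reject extra fields and out-of-range values.
    if parts.next().is_some() || h > 23 || m > 59 || s > 59 {
        return None;
    }
    Some(h * 3600 + m * 60 + s)
}

/// Seconds elapsed between two times on the same day.
/// Returns `None` if `to` is earlier than `from` or either input is invalid.
fn elapsed_secs(from: &str, to: &str) -> Option<u32> {
    seconds_of_day(to)?.checked_sub(seconds_of_day(from)?)
}
```

Calling `elapsed_secs("07:12:55", "07:19:47")` gives 412 seconds, so the panic came a little under seven minutes after the 100th block of the new epoch was mined.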

Since many different versions of the CKB binaries were running, a forked chain could exist.

P.S. Some bad things have happened since epoch 6765, and a lot of transactions were pending for more than 1 hour. So, starting with CKB 0.111.0-rc6, the launch time for Edition CKB2023 on testnet was removed.

How to fix that?

Since the CKB light client only caches the 100 blocks before the tip block, the fork point of a long fork is difficult to find.

At present, the only way to fix that is to remove the data of the CKB light client and sync again.
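
To illustrate why a bounded cache makes the fork point hard to find, here is a minimal sketch. It assumes a simplified header with a block number and a `u64` stand-in for the block hash. `Header`, `RecentHeaders`, and `find_fork_point` are hypothetical names, not the client's real types.

```rust
use std::collections::VecDeque;

/// Number of recent headers kept (taken from the explanation above).
const CACHE_SIZE: usize = 100;

/// A header reduced to what this sketch needs (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
struct Header {
    number: u64,
    /// Stand-in for the real 32-byte block hash.
    hash: u64,
}

/// Keeps only the most recent `CACHE_SIZE` headers of the local chain.
struct RecentHeaders {
    headers: VecDeque<Header>,
}

impl RecentHeaders {
    fn new() -> Self {
        Self { headers: VecDeque::with_capacity(CACHE_SIZE) }
    }

    /// Append a new tip header, evicting the oldest one when full.
    fn push(&mut self, header: Header) {
        if self.headers.len() == CACHE_SIZE {
            self.headers.pop_front();
        }
        self.headers.push_back(header);
    }

    /// Highest cached block that also appears on the peer's chain,
    /// or `None` if the chains diverged before the cached window.
    fn find_fork_point(&self, peer_chain: &[Header]) -> Option<u64> {
        self.headers
            .iter()
            .rev()
            .find(|local| peer_chain.iter().any(|remote| remote == *local))
            .map(|h| h.number)
    }
}
```

If the two chains diverged more than `CACHE_SIZE` blocks before the tip, every cached header differs from the peer's and `find_fork_point` returns `None`. That corresponds to the situation described above, where the stored data has to be removed and the client synced again.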

jordanmack commented 8 months ago

This morning I checked on my ten v0.3.4 light clients and found that three of them had exited with the long fork error. When I checked on them yesterday, they all appeared to be running fine and fully synced.

Is this expected behavior? This seems like it would be a maintenance problem if light clients cannot stay online and require you to manually delete data files to get back online.

quake commented 8 months ago

it's not expected, could you upload the log here?

jordanmack commented 8 months ago

client4-log.txt client6-log.txt client8-log.txt

quake commented 8 months ago

it's a known issue and has been resolved in #177 and #178. We plan to release a new version next week. Thanks for your report and the detailed logs.

quake commented 8 months ago

0.3.5 released: https://github.com/nervosnetwork/ckb-light-client/releases/tag/v0.3.5