smol-dot / smoldot

Lightweight client for Substrate-based chains, such as Polkadot and Kusama.
GNU General Public License v3.0

Light client goes out of sync after running for a long time #1882

Closed: jsamol closed 2 weeks ago

jsamol commented 2 weeks ago

Description

I'm experiencing an issue with the Smoldot light client where, after running for an extended period (usually 4-5 hours), the client goes out of sync with the blockchain. When this happens, any queries to the chain's state return outdated data. However, the client runs fine again when the chains are removed and added back.

Environment

smoldot: 2.0.29
node.js: 20.14.0
os: macOS 14.5

Additional Context

I'm running Smoldot to interact with a parachain on Android using my custom bindings based on wasm-node. My use case heavily depends on the client running stably for a long period. During testing, I observed the following behavior: the client runs flawlessly for the initial 3-4 hours, but then it stops submitting extrinsics, and the data read from storage diverges from the data fetched using the WS connection, as if the client had stopped syncing with the network.
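A minimal sketch of how the divergence described above could be measured, assuming both the light client and the WS node are asked for `chain_getHeader` and the raw JSON-RPC responses are compared. The helper names (`blockNumberFromHeader`, `blocksBehind`) are hypothetical and not part of smoldot; the only thing assumed about the response is that the header's `number` field is a hex string, as in the legacy Substrate JSON-RPC API.

```typescript
// Hypothetical helpers for illustration; not part of smoldot.

// Shape of the part of a chain_getHeader response that is used here.
interface HeaderResponse {
  result?: { number?: unknown };
}

// Extracts the block number from a raw chain_getHeader JSON-RPC response.
// The header's `number` field is assumed to be a hex string such as "0x64".
function blockNumberFromHeader(rawJson: string): number {
  const parsed = JSON.parse(rawJson) as HeaderResponse;
  const hex = parsed.result ? parsed.result.number : undefined;
  if (typeof hex !== "string") {
    throw new Error("unexpected chain_getHeader response");
  }
  const value = parseInt(hex, 16);
  if (Number.isNaN(value)) {
    throw new Error("block number is not valid hex: " + hex);
  }
  return value;
}

// Number of blocks the light client is behind the reference node.
// A value that keeps growing between checks suggests the light client
// has stopped syncing.
function blocksBehind(lightClientHeader: string, referenceHeader: string): number {
  return blockNumberFromHeader(referenceHeader) - blockNumberFromHeader(lightClientHeader);
}
```

A small and roughly constant lag is expected for a light client; the symptom in this report would show up as a lag that grows on every check.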

From the client's perspective, everything appears to be fine. There are no visible errors indicating that the extrinsics are invalid or that the storage cannot be read. However, the client never catches up with the correct state unless I remove and re-add the chain specs.
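Since the client reports no error when this happens, the remove-and-re-add workaround has to be triggered by the application. Below is a rough sketch of one way to decide when to trigger it, assuming the application periodically learns the light client's best block number. `StallDetector` and `maxIdleMs` are hypothetical names for illustration, not something smoldot provides.

```typescript
// Hypothetical stall detector for illustration; not part of smoldot.
class StallDetector {
  private maxIdleMs: number;
  private lastNumber: number | null = null;
  private lastChangeMs = 0;

  // maxIdleMs: how long the best block may stay unchanged before the
  // chain is considered stalled.
  constructor(maxIdleMs: number) {
    this.maxIdleMs = maxIdleMs;
  }

  // Records the latest best block number seen at time nowMs (milliseconds).
  // Returns true when the best block has not advanced for longer than maxIdleMs.
  observe(bestNumber: number, nowMs: number): boolean {
    if (this.lastNumber === null || bestNumber > this.lastNumber) {
      this.lastNumber = bestNumber;
      this.lastChangeMs = nowMs;
      return false;
    }
    return nowMs - this.lastChangeMs > this.maxIdleMs;
  }

  // Forgets previous observations, e.g. after the chain has been re-added.
  reset(): void {
    this.lastNumber = null;
    this.lastChangeMs = 0;
  }
}
```

When `observe` returns true, the application would call `chain.remove()`, add the chain again with `client.addChain({ chainSpec })`, and then call `reset()`. The threshold is a judgment call: it should be comfortably longer than the chain's block time so that ordinary delays do not trigger a restart.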

I was able to reproduce the exact same behavior using Node.js and the smoldot NPM package. Interestingly, when I ran both clients (Node.js and Android) simultaneously, both seemed to stop syncing around the same moment.

What I'm still unsure about is whether the issue is related to the load I put on the client or whether it happens regardless. So far, I've been testing it only by periodically interacting with the chain in the following way:


I'm fully aware that the issue I'm reporting is not easy to reproduce or debug. However, I'm able to consistently reproduce it and will be more than happy to assist further in the investigation in any way I can. I'm currently gathering logs and will update this issue as soon as I have more information to share. Well, it takes time 😅

jsamol commented 2 weeks ago

My apologies, I was certain I was running the latest Smoldot version, 2.0.29, while reproducing the issue with Node.js, but when I started collecting logs I noticed that wasn't the case and I must have been running 2.0.28 before.

While 2.0.28 consistently exhibits the described issue, I can no longer reproduce it with 2.0.29, either with Node.js or on Android. I'm assuming that 2.0.29 must have come with a fix and the report is no longer relevant. I'll keep monitoring the long-term stability of the new version, but for now I'm closing the issue.

tomaka commented 1 week ago

It might be that your issue was the same as https://github.com/smol-dot/smoldot/issues/1873