smol-dot / smoldot

Lightweight client for Substrate-based chains, such as Polkadot and Kusama.
GNU General Public License v3.0

Make the light-client more robust to panics by using `panic=unwind` #519

Open tomaka opened 1 year ago

tomaka commented 1 year ago

This isn't possible in Rust yet, but in principle we could use `panic=unwind` when compiling the wasm node. The code could then be tweaked to restart services if they die.
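To illustrate the "restart services if they die" idea, here is a minimal sketch of a supervisor loop built on `std::panic::catch_unwind`. The `run_with_restarts` helper and its restart limit are hypothetical and not part of smoldot; the sketch also assumes the program is compiled with unwinding enabled, since with `panic=abort` nothing can be caught.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical helper (not a smoldot API): runs `service` until it returns
/// normally, restarting it after each panic, at most `max_restarts` times.
/// Returns `Ok(n)` with the number of restarts that were needed, or `Err(n)`
/// if the service panicked again after `n` restarts.
pub fn run_with_restarts<F>(mut service: F, max_restarts: u32) -> Result<u32, u32>
where
    F: FnMut(),
{
    let mut restarts = 0;
    loop {
        // `AssertUnwindSafe` is an assumption made by this sketch: it asserts
        // that whatever state `service` touches is still usable after a panic.
        match panic::catch_unwind(AssertUnwindSafe(|| service())) {
            // The service finished on its own; nothing left to supervise.
            Ok(()) => return Ok(restarts),
            // The service panicked; restart it unless the limit is reached.
            Err(_payload) => {
                if restarts == max_restarts {
                    return Err(restarts);
                }
                restarts += 1;
            }
        }
    }
}
```

A real service would most likely rebuild its state from scratch on every iteration rather than reuse the same closure, which is what makes the `AssertUnwindSafe` promise reasonable.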

tomaka commented 1 year ago

https://github.com/rust-lang/rust/pull/111322/ is included in Rust 1.72

tomaka commented 1 year ago

Unfortunately, using https://github.com/rust-lang/rust/pull/111322 seems incompatible with using wasm32-wasi-threads for #91

Or maybe we could do threads manually instead of using wasm32-wasi-threads.

tomaka commented 1 year ago

Ignoring the wasm-node, I think it's a good idea to do this in situations where the light client is embedded in a program where unwinding is enabled.

One blocker is the design of the "networking events receivers". Right now it's not possible to create a new receiver if the sync/runtime/etc. service dies and takes its receiver down with it. Similarly, if the networking service itself dies, things get complicated, and a simple receiver may not be the most appropriate mechanism.
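One possible shape for receivers that can be re-created is sketched below using only `std::sync::mpsc`. The `EventBroadcaster` and `NetworkEvent` types are hypothetical and do not reflect smoldot's actual networking API; the point is only that a restarted service can ask for a fresh receiver, and that receivers dropped by a dead service get pruned.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical event type, for illustration only.
#[derive(Clone, Debug, PartialEq)]
pub enum NetworkEvent {
    Connected(u32),
    Disconnected(u32),
}

/// Hypothetical broadcaster owned by the networking side.
pub struct EventBroadcaster {
    senders: Vec<Sender<NetworkEvent>>,
}

impl EventBroadcaster {
    pub fn new() -> Self {
        EventBroadcaster { senders: Vec::new() }
    }

    /// Creates a new receiver at any time, including after a restart.
    pub fn subscribe(&mut self) -> Receiver<NetworkEvent> {
        let (tx, rx) = channel();
        self.senders.push(tx);
        rx
    }

    /// Sends `event` to every live receiver. `send` fails once the receiving
    /// end has been dropped, so dead subscribers are removed here.
    pub fn broadcast(&mut self, event: NetworkEvent) {
        self.senders.retain(|tx| tx.send(event.clone()).is_ok());
    }

    /// Number of receivers that were still alive at the last broadcast.
    pub fn subscriber_count(&self) -> usize {
        self.senders.len()
    }
}
```

A fresh receiver only sees events sent after it subscribed, so a restarted service would still need some way to learn the current state, and this sketch does nothing for the case where the networking service itself dies.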

tomaka commented 1 year ago

One issue with the latest comment is that it forces the UnwindSafe trait on Platform. A bigger issue is (if I'm not mistaken) that async functions seem to generate futures that don't implement UnwindSafe.
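A common way around the `UnwindSafe` bound is to wrap each `poll` call in `AssertUnwindSafe`, which is the standard library's explicit opt-out. The sketch below shows that pattern with a hypothetical `CatchPanic` wrapper and a tiny `block_on` executor, both written only for this example and not taken from smoldot. Whether asserting unwind safety is actually sound depends on what state the future shares with the rest of the program.

```rust
use std::any::Any;
use std::future::Future;
use std::panic::{self, AssertUnwindSafe};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Hypothetical wrapper: resolves to `Err(payload)` if the inner future
/// panics while being polled, instead of unwinding into the executor.
pub struct CatchPanic<F> {
    inner: Pin<Box<F>>,
}

impl<F: Future> CatchPanic<F> {
    pub fn new(future: F) -> Self {
        CatchPanic { inner: Box::pin(future) }
    }
}

impl<F: Future> Future for CatchPanic<F> {
    type Output = Result<F::Output, Box<dyn Any + Send>>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let inner = self.inner.as_mut();
        // The future itself need not implement `UnwindSafe`; the closure is
        // wrapped in `AssertUnwindSafe` instead.
        match panic::catch_unwind(AssertUnwindSafe(|| inner.poll(cx))) {
            Ok(Poll::Ready(value)) => Poll::Ready(Ok(value)),
            Ok(Poll::Pending) => Poll::Pending,
            Err(payload) => Poll::Ready(Err(payload)),
        }
    }
}

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor, just enough to drive the example.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

With this approach the `UnwindSafe` requirement does not have to leak into a trait such as `Platform`; the assertion is made once, at the point where the background task is polled.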

tomaka commented 1 year ago

The first thing to do would be to refactor the RuntimeService to have a clear separation between foreground and background (which is something I think should be done no matter what).

tomaka commented 10 months ago

Here are the difficulties I've noted for each module:

Another important point is that subscription IDs should be assigned by the frontend, in order to avoid race conditions.
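One reading of the subscription ID point is sketched below: the frontend picks the ID and records the subscription before telling the background about it, so it never receives a notification for an ID it doesn't know yet, and it can replay its subscriptions to a restarted background. All names here (`Frontend`, `ToBackground`, `SubscriptionId`, `reattach`) are hypothetical and are not smoldot's actual types.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical subscription identifier, chosen by the frontend.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct SubscriptionId(pub u64);

/// Hypothetical message sent from the frontend to the background task.
#[derive(Debug, PartialEq)]
pub enum ToBackground {
    Subscribe { id: SubscriptionId, topic: String },
}

pub struct Frontend {
    next_id: u64,
    /// Subscriptions the frontend considers active, keyed by ID.
    active: BTreeMap<SubscriptionId, String>,
    to_background: Sender<ToBackground>,
}

impl Frontend {
    pub fn new() -> (Self, Receiver<ToBackground>) {
        let (tx, rx) = channel();
        let frontend = Frontend { next_id: 0, active: BTreeMap::new(), to_background: tx };
        (frontend, rx)
    }

    /// The ID exists and is recorded before the background hears about it.
    pub fn subscribe(&mut self, topic: &str) -> SubscriptionId {
        let id = SubscriptionId(self.next_id);
        self.next_id += 1;
        self.active.insert(id, topic.to_owned());
        // A send error means the background is gone; `reattach` replays
        // the subscription later, so the error is ignored here.
        let _ = self.to_background.send(ToBackground::Subscribe { id, topic: topic.to_owned() });
        id
    }

    /// Called after the background was restarted: opens a new channel and
    /// replays every active subscription with its original ID.
    pub fn reattach(&mut self) -> Receiver<ToBackground> {
        let (tx, rx) = channel();
        for (id, topic) in &self.active {
            let _ = tx.send(ToBackground::Subscribe { id: *id, topic: topic.clone() });
        }
        self.to_background = tx;
        rx
    }
}
```

Because IDs never come from the background, a restart does not change them, and whoever holds a `SubscriptionId` does not need to know that a restart happened.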

tomaka commented 10 months ago

After https://github.com/smol-dot/smoldot/pull/1376, I've tried to compile smoldot with the instructions given here: https://github.com/rust-lang/rust/pull/111322

This increases the binary size (as expected) from 2.78 MiB to 3.0 MiB. I guess the trade-off could be worth it? Not totally sure.

Also, this unfortunately requires defining the eh_personality language item, which I obviously can't do properly.

No matter what, as explained above, we still want smoldot-light-base to be able to recover in case of panic. This just wouldn't apply to the wasm node just yet.

tomaka commented 10 months ago

Services to make panic-resilient: