Closed by liuchengxu 1 year ago
I seem to have encountered a similar issue in a chain I created. It appeared on the validator node with a total plot size of 8+ TB on HDDs. The cause turned out to be the `--disk-concurrency 4` parameter; setting it to 1 worked fine. I'll try to attach the logs later. May be useful.
Chain spec: https://codeberg.org/Cepera_Leshii/subspace-testnet-3000/raw/branch/main/chain-spec-raw-amaranth.json
And page on the telemetry site: https://telemetry.subspace.network/#list/0x1b4c752a5c7f4d95a516cb31d3c4a603652fcbe5f5496fc5cd8781cd469e651d
Commit: b63028b4d03f80e8e4c5c17dcc2221be0e185b00
Logs slice after changing the `--disk-concurrency` value to 1:
@SilversterSunset Thanks for the input, but I don't think it's related to this issue. What you posted is on the farming side, whereas this issue is specific to the executor node, which is not yet mature enough for public testing :)
@liuchengxu Well, the error is the same, so I thought this might help you somehow.
The primary number mismatches the new best secondary number, leading to an assertion panic. This never happens locally, but it occurred when I deployed X-Net 2: the primary number was somehow larger than the expected value. I haven't figured out why, and it disappeared after restarting the network. Furthermore, when it occurs, extrinsic execution on the executor runs into a peculiar error (`InvalidTransaction::Stale`, I believe). I'll see if I can reproduce it locally. Although the assertion could be turned into an error so it at least doesn't kill the program, the root cause still needs to be dug out, since it may point to a glitch somewhere.
https://github.com/subspace/subspace/blob/c3a4eee7c72ebd65ca6cbc63c3aad661e4411fe5/domains/client/domain-executor/src/bundle_processor.rs#L186-L190
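To illustrate the "turn the assertion into an error" idea mentioned above, here is a minimal sketch. The type and function names (`BundleError`, `check_alignment`) are hypothetical and are not the actual subspace API; it only shows the general pattern of replacing a hard `assert!` on block-number alignment with a recoverable `Result`:

```rust
/// Hypothetical error type for bundle processing failures.
#[derive(Debug, PartialEq)]
enum BundleError {
    /// The primary block number does not line up with the best
    /// secondary number, as described in the comment above.
    PrimarySecondaryMismatch { primary: u32, expected: u32 },
}

/// Previously this invariant might have been enforced with
/// `assert_eq!(primary_number, best_secondary_number + 1)`, which
/// panics and kills the node. Returning an error instead lets the
/// caller log it and keep running while the root cause is investigated.
fn check_alignment(primary_number: u32, best_secondary_number: u32) -> Result<(), BundleError> {
    let expected = best_secondary_number + 1;
    if primary_number != expected {
        return Err(BundleError::PrimarySecondaryMismatch {
            primary: primary_number,
            expected,
        });
    }
    Ok(())
}
```

The caller can then decide whether a mismatch is fatal or merely worth logging, which matches the suggestion that the panic be softened without hiding the underlying bug.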