paritytech / substrate


Consensus error between wasm and native runtime execution in Metadata_metadata #1495

Status: Closed (closed by webmaster128 5 years ago)

webmaster128 commented 5 years ago

In the test network "TSP antnet", I get the following error:

2019-01-20 20:16:30 Consensus error between wasm and native runtime execution at block Hash(0xb2926869e15cb9c1e381cdbeaf6cc6f26c6045fda87c8934c74e56022c25b283)
2019-01-20 20:16:30    Function "Metadata_metadata"
2019-01-20 20:16:30    Native result Ok([70, 127, 1, 0, 20, 69, 118, 101, 110, 116, 52, 24, 115, 121, 115, 116, 101, 109, 8, 64, 69, 120, 116, 114, 105, 110, 115, 105, 99, 83, 117, 99, 99, 101, 115, 115, 0, 4, 148, 32, 65, 110, 32, 101, 120, 116, 114, 105,  ...
2019-01-20 20:16:30    Wasm result Ok([70, 127, 1, 0, 20, 69, 118, 101, 110, 116, 52, 24, 115, 121, 115, 116, 101, 109, 8, 64, 69, 120, 116, 114, 105, 110, 115, 105, 99, 83, 117, 99, 99, 101, 115, 115, 0, 4, 148, 32, 65, 110, 32, 101, 120, 116, 114, 105, 110, 115, 105, 99, 32, 99, 111, 109, 112, 108, 101, 116, 101, 100, 32, 115, 117, 99, 99, 101, 115, 115, 102, 117, 108, 108, ...

The full consensus_error_Metadata_metadata.log is available as a gist.

Both values share the same prefix but diverge somewhere further in.
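To pin down where the two results diverge, a small script along these lines may help. It is only a sketch: `PREFIX` is the shared prefix copied from the log lines above, while the function names `to_ascii` and `first_divergence` and the "corrupted" sample are hypothetical and chosen for illustration. The full arrays would have to be taken from the complete log.

```python
# Shared prefix copied from the log above; both results start with these bytes.
PREFIX = [70, 127, 1, 0, 20, 69, 118, 101, 110, 116, 52, 24, 115, 121, 115,
          116, 101, 109, 8, 64, 69, 120, 116, 114, 105, 110, 115, 105, 99,
          83, 117, 99, 99, 101, 115, 115, 0, 4, 148, 32, 65, 110, 32, 101,
          120, 116, 114, 105]


def to_ascii(data):
    """Render bytes as text, replacing non-printable bytes with '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


def first_divergence(a, b):
    """Return the first index where a and b differ, or None if they are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # No mismatch in the common part: either equal, or one is a strict prefix.
    return None if len(a) == len(b) else min(len(a), len(b))


if __name__ == "__main__":
    print(to_ascii(PREFIX))
    # Hypothetical example: flip one byte to simulate a corrupted result.
    corrupted = list(PREFIX)
    corrupted[40] ^= 0xFF
    print(first_divergence(PREFIX, corrupted))
```

Decoding the prefix this way shows readable strings such as "Event", "system" and "ExtrinsicSuccess", which suggests the payload describes the runtime's modules and events.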

I did not do anything with this chain other than follow https://github.com/webmaster128/tsp-networks/blob/master/InstallSubstrateFromSource.md#install-substrate-from-source, and I have no idea what Metadata_metadata is.

bkchr commented 5 years ago

Is this a local test network? When did you install substrate? Today? Metadata_metadata is the metadata function of the Metadata runtime trait. I assume you tried to connect with the gui to your node?

xlc commented 5 years ago

Metadata_metadata, as the name implies, is about the metadata of the runtime (i.e. module names, events, methods, etc.). I believe what happened here is that you modified the runtime modules (e.g. added a new one) without changing the version:

https://github.com/paritytech/substrate/blob/ec38ea35d02d57787c959a7c56d9992d06ecc2ba/node/runtime/src/lib.rs#L101

As a result, the node assumed that the wasm runtime version was compatible with the native runtime version, which it actually was not, and generated this error.

I had a similar error before, and updating the version and rebuilding (both native and wasm) fixed it.
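The reasoning above can be illustrated with a simplified model. This is not Substrate's executor code: the real RuntimeVersion has more fields than the two shown, and `native_is_trusted` and `check_consensus` are hypothetical names used only for this sketch. The point is that when the versions look identical, the two runtimes are treated as interchangeable, so any difference in their output surfaces as a consensus error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeVersion:
    # Only two fields are modelled here; the real struct has more.
    spec_name: str
    spec_version: int


def native_is_trusted(native: RuntimeVersion, onchain: RuntimeVersion) -> bool:
    """Simplified rule: native may stand in for wasm only if the versions match."""
    return native == onchain


def check_consensus(native: RuntimeVersion, onchain: RuntimeVersion,
                    native_result: bytes, wasm_result: bytes) -> str:
    """Classify the outcome of running the same call in both runtimes."""
    if not native_is_trusted(native, onchain):
        # Versions differ, so only the on-chain wasm result counts.
        return "use wasm only"
    if native_result != wasm_result:
        # Versions claim compatibility, but the outputs disagree.
        return "consensus error"
    return "ok"
```

Under this model, changing the runtime without bumping spec_version lands in the "consensus error" branch, while bumping it moves the node to the "use wasm only" branch.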

webmaster128 commented 5 years ago

> Is this a local test network?

Well, let's say private, but all the secrets are checked in at https://github.com/webmaster128/tsp-networks, so everybody can hack it.

> When did you install substrate? Today?

Yep, version 0.10.0-78bb4c0.

> I assume you tried to connect with the gui to your node?

Indeed, I tried to connect to my node using substrate-ui. The connection to the websocket worked, but then I always get:

Connection open
Initialising runtime
Reconnecting
Connection open
Initialising runtime
Reconnecting.
Connection open
Initialising runtime
Reconnecting.
Connection open
Initialising runtime
Reconnecting.

and the browser hangs.

webmaster128 commented 5 years ago

> I believe what happened here is that you modified the runtime modules (e.g. added a new one) without changing the version

I am not competent enough (yet) to change the runtime. All I did was follow https://github.com/webmaster128/tsp-networks/blob/master/InstallSubstrateFromSource.md#generate-new-chain, i.e. take whatever build-spec gives me by default and update the validators and balances. So far I have not come across a spec_version and don't see a place where I could have set it.

xlc commented 5 years ago

Does the substrate version you used to generate the JSON file match the substrate version you run it with? There may be incompatible changes that did not bump spec_version. Try resetting everything, which should fix the issue (unless there is a deeper bug involved in wasm compilation), i.e. purge the chain, regenerate the genesis JSON file, and run the chain with the generated JSON file using the same substrate version.

webmaster128 commented 5 years ago

> Does the substrate version you used to generate the JSON file match the substrate version you run it with?

Yes, it was generated with substrate 0.10.0-78bb4c04-x86_64-linux-gnu and is running with docker tag 0.10.0-78bb4c0.

> I assume you tried to connect with the gui to your node?

Confirming that there is no error when syncing from #0 unless I connect from substrate-ui.

One other thing I noticed: the compilation substrate --chain ~/"$NETNAME.json" build-spec --raw > ~/"$NETNAME.raw.json" is not deterministic and produces a different result every time I run it. Is that desired?
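One way to check what kind of non-determinism this is would be to compare two generated files both byte for byte and after normalising the JSON. The sketch below works on the file contents as strings; `sha256_hex`, `canonical` and `compare_specs` are hypothetical helper names, and the sketch makes no claim about what actually causes the difference in build-spec output. Note that only object key order and whitespace are normalised; array order is left as-is.

```python
import hashlib
import json


def sha256_hex(text: str) -> str:
    """Hash the raw text so byte-level differences are detected."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical(text: str) -> str:
    """Re-serialise JSON with sorted keys and fixed separators."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


def compare_specs(a: str, b: str) -> str:
    """Classify how two chain spec JSON documents differ."""
    if sha256_hex(a) == sha256_hex(b):
        return "identical"
    if canonical(a) == canonical(b):
        return "same content, different key order or whitespace"
    return "different content"
```

In practice the two strings would be read from two separate runs of the build-spec command, for example with `open(path).read()`.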

webmaster128 commented 5 years ago

> Try resetting everything, which should fix the issue (unless there is a deeper bug involved in wasm compilation), i.e. purge the chain, regenerate the genesis JSON file, and run the chain with the generated JSON file using the same substrate version.

This does not change the behaviour.

xlc commented 5 years ago

OK. After upgrading everything to the latest version, I am now getting the same error.

xlc commented 5 years ago

I can confirm that the wasm-generated metadata is corrupted.

I used the latest substrate and ran ./scripts/build.sh, which updates the runtime Cargo.lock and wasm files.

My result and script to generate ascii text: https://gist.github.com/xlc/3c9f6b0cf81412f0173d0c8e29650ef6

Diff: https://difff.jp/en/wvsci.html

This completely breaks polkadot.js apps: https://github.com/polkadot-js/api/issues/609

For my custom runtime, the overwritten section stops at some position, and the data after that position are correct.
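To see where such an overwritten section starts and stops, the native and wasm byte arrays can be scanned for contiguous mismatching ranges. This is only a sketch: `differing_ranges` is a hypothetical helper, the sample arrays in the usage example are made up, and any length difference between the two inputs is ignored (only the common length is compared).

```python
def differing_ranges(a, b):
    """Return (start, end) index pairs, end exclusive, where a and b differ.

    Only the common length of the two sequences is compared.
    """
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new mismatching run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, common))  # run extends to the end
    return ranges


if __name__ == "__main__":
    # Made-up sample data: bytes 1 and 2 are overwritten, the rest match.
    native = [10, 20, 30, 40, 50]
    wasm = [10, 99, 98, 40, 50]
    print(differing_ranges(native, wasm))
```

A single short range followed by matching data would be consistent with the observation above that the data are correct again after the overwritten section.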

This must be related to #1460, as this is a typical memory management issue.

webmaster128 commented 5 years ago

🎉 Confirming that 0.10.0-4d0eea0 allows connecting via substrate-ui. Thanks!