paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/
1.81k stars 660 forks source link

Avoid compiling the same wasm twice because of `GenesisBlockBuilder` #3449

Open tmpolaczyk opened 7 months ago

tmpolaczyk commented 7 months ago

Is there an existing issue?

Experiencing problems? Have you tried our Stack Exchange first?

Motivation

After upgrading to polkadot 1.6.0 in tanssi (https://github.com/moondance-labs/tanssi/commit/c427348caf30095f50be529fc6a9aedbc540a08f), we noticed that typescript dev_tests take twice as long now. These tests run tanssi-node --dev for each test, so the issue was that starting a new node was taking longer than before. The cause seems to be the wasm compilation, which takes a considerable amount of time.

After some debugging, it looks like the runtime wasm is being compiled 3 times. One of the redundant ones is fixed by #3447, and the other is this one.

Calling GenesisConfigBuilderRuntimeCaller::new() always creates a new executor. This seems to ignore the wasm cache, and in the end this causes the wasm runtime to be compiled twice.

Request

Add a method like GenesisConfigBuilderRuntimeCaller::with_executor or similar, to allow the user to pass an executor. Use that method in GenesisBlockBuilder::new(), which already has access to the user executor. GenesisConfigBuilderRuntimeCaller should use that executor instead of creating a new one.

Maybe relevant: #2188

Solution

No response

Are you willing to help with this request?

Yes!

michalkucharczyk commented 7 months ago

Calling GenesisConfigBuilderRuntimeCaller::new() always creates a new executor. This seems to ignore the wasm cache, and in the end this causes the wasm runtime to be compiled twice.

I'll look into this, I'd expect that cache is re-used. Otherwise your proposal sounds right at first glance.

tmpolaczyk commented 7 months ago

Thanks, you can reproduce it by adding a log here: https://github.com/paritytech/polkadot-sdk/blob/f2645eec12f080a66228aedbeaecea2c913570ed/substrate/client/executor/src/wasm_runtime.rs#L250

I was able to see two calls with the same code_hash.

michalkucharczyk commented 7 months ago

It seems that implementing your proposal is not that easy (or elegant) due to how involved parties are disjointed.

The storage is built by using BuildStorage::build_storage method from dyn ChainSpec object which is kept in config and converted to BuildStorage by ChainSpec::as_storage_builder.

ChainSpec instance within config is created before the executor is instantiated (kinda chicken and egg problem).

Both BuildStorage and ChainSpec do not provide a mean to inject Executor. We could do this, e.g. by extending ChainSpec trait with set_executor method, but - at the moment - I am not sure if this is right (feels like this method does not really belong there).

The other solution to this problem, which seems more appropriate for me, could be runtime-cache shared (kinda static instance) across multiply instances of Executor instead of per executor cache.

@koute do you think shared-cache makes sense, and is doable? (cc: @bkchr)

michalkucharczyk commented 7 months ago
backtrace for reference ``` 59 { fn: "sc_executor::wasm_runtime::RuntimeCache::with_instance", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/executor/src/wasm_runtime.rs", line: 267 }, 60 { fn: "sc_executor::executor::WasmExecutor::with_instance", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/executor/src/executor.rs", line: 344 }, 61 { fn: " as sp_core::traits::CodeExecutor>::call", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/executor/src/executor.rs", line: 517 }, 62 { fn: "sc_chain_spec::genesis_config_builder::GenesisConfigBuilderRuntimeCaller::call", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/chain-spec/src/genesis_config_builder.rs", line: 73 }, 63 { fn: "sc_chain_spec::genesis_config_builder::GenesisConfigBuilderRuntimeCaller::get_default_config", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/chain-spec/src/genesis_config_builder.rs", line: 89 }, 64 { fn: "sc_chain_spec::genesis_config_builder::GenesisConfigBuilderRuntimeCaller::get_storage_for_patch", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/chain-spec/src/genesis_config_builder.rs", line: 134 }, 65 { fn: " as sp_runtime::BuildStorage>::assimilate_storage", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/chain-spec/src/chain_spec.rs", line: 175 }, 66 { fn: "sp_runtime::BuildStorage::build_storage", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/primitives/runtime/src/lib.rs", line: 218 }, 67 { fn: "sc_chain_spec::genesis_block::GenesisBlockBuilder::new", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/chain-spec/src/genesis_block.rs", line: 113 }, 68 { fn: "sc_service::builder::new_full_parts_record_import", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/service/src/builder.rs", line: 147 }, 69 { fn: "sc_service::builder::new_full_parts", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/client/service/src/builder.rs", line: 173 }, 70 { fn: "staging_node_cli::service::new_partial", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/bin/node/cli/src/service.rs", line: 201 }, 71 { fn: "staging_node_cli::service::new_full_base", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/bin/node/cli/src/service.rs", line: 421 }, 72 { fn: "staging_node_cli::service::new_full", file: "/home/miszka/parity/10-genesis-config/polkadot-sdk-master/substrate/bin/node/cli/src/service.rs", line: 773 }, ```
bkchr commented 7 months ago

The other solution to this problem, which seems more appropriate for me, could be runtime-cache shared (kinda static instance) across multiply instances of Executor instead of

We had this and we don't want to go back there.

https://github.com/paritytech/polkadot-sdk/blob/12e5e19c2033383fc914bcc286ad16ae565d4734/substrate/client/service/src/builder.rs#L147-L148

We need to solve this here. There could be for example an extra method to GenesisBuilder with_chain_spec or whatever that doesn't take BuildStorage and then pass the executor when building the storage.

michalkucharczyk commented 7 months ago

I spent some time on this. ChainSpec is available as trait object (dyn ChainSpec), what effectively prevents from easy trait extension with generic executor-related methods. I did some work with callback allowing injection of executor here: https://github.com/paritytech/polkadot-sdk/compare/master...mku-chainspec-avoid-double-compilation (which is working) but I am not really happy with this. Another (proper) solution would require some heavier refactor around how Configuration, Executor and ChainSpec are created.

@tmpolaczyk another idea: maybe you could use raw chainspec in your tests to avoid double compilation if that is a huge pain?

tmpolaczyk commented 7 months ago

maybe you could use raw chainspec in your tests to avoid double compilation if that is a huge pain?

I did not think about that, but we use --dev, it is going to be weird to change it to --chain dancebox_dev_raw.json...

Anyway this is not urgent, I managed to improve our test time by using #1641, so now instead of compiling the wasm twice we only compile it once (because of this bug, after it is fixed we are going to use the precompiled wasm only). I liked your idea about a global instance cache, I may implement it in our fork if this doesn't get fixed.

But I agree that a proper solution would be to change the api here, this way as a parachain we don't need any strange hacks and everything works as expected. Not sure how to do it though.