paritytech / polkadot-sdk

Can't run the node after updating to polkadot-v1.8.0 #5744

Open SCJangra opened 4 days ago

SCJangra commented 4 days ago

My node cannot create genesis configuration after updating to polkadot-v1.8.0. There is some sort of allocation error, but I am not sure where it is coming from.

❯ ./target/release/vtb-node build-spec --dev --raw
2024-09-18 01:03:18 Building chain spec    
Error: Service(
  Other(
    "wasm call error Execution aborted due to trap: wasm trap: wasm `unreachable` instruction executed\n
     WASM backtrace:\nerror while executing at wasm backtrace:\n
     0: 0x76e8c - vtb_node_runtime.wasm!__rg_oom\n
     1: 0x3557 - vtb_node_runtime.wasm!__rust_alloc_error_handler\n
     2: 0x38a3 - vtb_node_runtime.wasm!alloc::alloc::handle_alloc_error::h19e50b8409556383\n
     3: 0x371d - vtb_node_runtime.wasm!alloc::raw_vec::handle_error::hf690fbb5ea5797f9\n
     4: 0x259401 - vtb_node_runtime.wasm!frame_support::genesis_builder_helper::create_default_config::h33cbb18701376c12\n
     5: 0x3ebe88 - vtb_node_runtime.wasm!GenesisBuilder_create_default_config"
  )
)

I would appreciate any insight about why this could be happening.

bkchr commented 4 days ago

CC @michalkucharczyk

michalkucharczyk commented 4 days ago

This could be related: https://github.com/paritytech/polkadot-sdk/issues/5419#issuecomment-2312950557

michalkucharczyk commented 4 days ago

Would you share your default RuntimeGenesisConfig? How big is it?

SCJangra commented 4 days ago

@michalkucharczyk It is pretty small, nowhere near big enough to reach 32 MB. The chain spec below contains all fields from RuntimeGenesisConfig. I've removed the runtime code, but that would only increase the size by about 3 MB.

{
  "name": "Development",
  "id": "dev",
  "chainType": "Development",
  "bootNodes": [],
  "telemetryEndpoints": null,
  "protocolId": null,
  "properties": {
    "tokenDecimals": 18,
    "tokenSymbol": "VTBC"
  },
  "forkBlocks": null,
  "badBlocks": null,
  "lightSyncState": null,
  "codeSubstitutes": {},
  "genesis": {
    "runtimeGenesis": {
      "code": "0x",
      "patch": {
        "babe": {
          "authorities": [],
          "epochConfig": {
            "allowed_slots": "PrimaryAndSecondaryPlainSlots",
            "c": [
              1,
              4
            ]
          }
        },
        "oracle": {
          "prices": []
        },
        "session": {
          "keys": [
            [
              "5GNJqTPyNqANBkUVMN1LPPrxXnFouWXoe2wNSmmEoLctxiZY",
              "5GNJqTPyNqANBkUVMN1LPPrxXnFouWXoe2wNSmmEoLctxiZY",
              {
                "authority_discovery": "5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY",
                "babe": "5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY",
                "grandpa": "5FA9nQDVg267DEd8m1ZypXLBnvN7SFxYwV7ndqSYGiN9TTpu",
                "im_online": "5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY"
              }
            ]
          ]
        },
        "staking": {
          "canceledPayout": 0,
          "forceEra": "NotForcing",
          "invulnerables": [
            "5GNJqTPyNqANBkUVMN1LPPrxXnFouWXoe2wNSmmEoLctxiZY"
          ],
          "maxNominatorCount": null,
          "maxValidatorCount": null,
          "minNominatorBond": 10000000000,
          "minValidatorBond": 100000000000000,
          "minimumValidatorCount": 1,
          "slashRewardFraction": 100000000,
          "stakers": [
            [
              "5GNJqTPyNqANBkUVMN1LPPrxXnFouWXoe2wNSmmEoLctxiZY",
              "5GNJqTPyNqANBkUVMN1LPPrxXnFouWXoe2wNSmmEoLctxiZY",
              100000000000000,
              "Validator"
            ]
          ],
          "validatorCount": 1
        },
        "sudo": {
          "key": "5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY"
        },
        "externalTokens": {
          "account": []
        },
        "system": {},
        "vtbCurrency": {
          "endowedAccounts": [
            [
              "5GNJqTPyNqANBkUVMN1LPPrxXnFouWXoe2wNSmmEoLctxiZY",
              101000000000000
            ],
            [
              "5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY",
              101000000000000
            ],
            [
              "5G9vBwTsEEfF8vg4kNyEzrTWmhykysv1mKDwzEuU4CCkxvVT",
              101000000000000
            ],
            [
              "5Cvee4SWvyPrscCaKKZQkZDu4MzYvXTh6fjGeSK3qpJrU4CP",
              101000000000000
            ]
          ],
          "rewardsPerEra": 100000000000,
          "rewardsPoolBalance": 1000000000000000,
          "totalSupply": 40000000000000000,
          "vtbcPrice": 5123456
        }
      }
    }
  }
}

SCJangra commented 4 days ago

> This could be related: #5419 (comment)

I tried this anyway, but it didn't work.

michalkucharczyk commented 4 days ago

The failure is here: https://github.com/paritytech/polkadot-sdk/blob/ec7817e5adc2f3a91dd94b0465dd61b4c1b07ab7/substrate/frame/support/src/genesis_builder_helper.rs#L28-L35

The question is what GC::default returned and how big it is. ~Just guessing now - but it seems that serialization to json worked correctly, and it failed on converting to Vec, as the stack does not contain serde_json calls.~ (edit: not true)

Is your runtime open-source?

SCJangra commented 4 days ago

@michalkucharczyk I found the issue: I was using an older version (version 15) of substrate-wasm-builder. I couldn't update to version 16 or above because I was getting the following error during compilation:

error[E0152]: duplicate lang item in crate `core`: `sized`.
  |
  = note: the lang item is first defined in crate `core` (which `vtb_node_runtime` depends on)
  = note: first definition in `core` loaded from /home/sachin/vtb-node/target/release/wbuild/vtb-node-runtime/target/wasm32-unknown-unknown/release/deps/libcore-a831b8844985a4df.rlib
  = note: second definition in `core` loaded from /home/sachin/.rustup/toolchains/1.80.0-x86_64-unknown-linux-gnu/lib/rustlib/wasm32-unknown-unknown/lib/libcore-5f7700eb90efd5c9.rlib

I checked the difference between version 15 and 16 and found the WASM_BUILD_STD flag, so I compiled with WASM_BUILD_STD=0 and that worked perfectly fine.
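
A minimal sketch of that workaround, assuming the node is built with a plain `cargo build --release` (the thread does not show the exact build command; the `target/release` path in the report suggests a release build). The function name is hypothetical.

```shell
# Build once with the standard-library rebuild switched off.
# Assumption: the node is built with `cargo build --release`.
build_without_build_std() {
    # The variable is set for this single cargo invocation only.
    WASM_BUILD_STD=0 cargo build --release "$@"
}
```

Calling `build_without_build_std` from the workspace root runs the build with `WASM_BUILD_STD=0`; extra arguments are passed through to cargo.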

Now the question is, with WASM_BUILD_STD=1, how do I solve the duplicate lang item error?

SCJangra commented 4 days ago

> Is your runtime open-source?

Unfortunately, not yet.

michalkucharczyk commented 4 days ago

> Now the question is, with WASM_BUILD_STD=1, how do I solve the duplicate lang item error?

I cannot help with this off the top of my head; I have never seen it before. WASM_BUILD_STD=1 is supposed to rebuild the Rust core libraries (std, core and alloc) and use them in the runtime. I have no idea why the toolchain's libcore is being pulled in. Just guessing, but maybe there are some wrong deps in your runtime's Cargo.toml.

Maybe @bkchr or @koute could give some more insights?

koute commented 4 days ago

> Now the question is, with WASM_BUILD_STD=1, how do I solve the duplicate lang item error?

Can you completely delete your target dir and try again? It's possible that rustc doesn't properly invalidate the already compiled artifacts when build-std is switched on/off (it's an experimental nightly-only feature after all), and that could be why you're getting the duplicate core error.

Another possibility is that maybe build-std is not enabled for all of the standard library crates you're pulling in, so some of them might be using the core that is compiled with build-std, and some of them might be using the precompiled one.
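
The first suggestion can be sketched as follows, assuming the build is run from the workspace root with `cargo build --release` (an assumption, the thread does not show the build command). `cargo clean` removes the whole target directory, including the `wbuild` output seen in the error paths. The function name is hypothetical.

```shell
# Remove every cached artifact, then rebuild with build-std switched on.
# Assumption: run from the workspace root; build command is `cargo build --release`.
clean_rebuild_with_build_std() {
    cargo clean &&
        WASM_BUILD_STD=1 cargo build --release "$@"
}
```

If the duplicate lang item error survives a rebuild from a clean target directory, stale artifacts are unlikely to be the cause.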

bkchr commented 4 days ago

@SCJangra please try to clean the target directory. Never seen this error before for the core library.

SCJangra commented 4 days ago

> @SCJangra please try to clean the target directory.

I've done that multiple times, and also tried cleaning ~/.cargo/registry.

> Another possibility is that maybe build-std is not enabled for all of the standard library crates you're pulling in

I think this is probably the case.

And could it be because I am using the stable rust toolchain? I thought we didn't need the nightly toolchain anymore.

I have this in my rust-toolchain.toml:

[toolchain]
channel = "stable"
components = [
    "cargo",
    "clippy",
    "rust-analyzer",
    "rust-src",
    "rust-std",
    "rustc-dev",
    "rustc",
    "rustfmt",
]
targets = [ "wasm32-unknown-unknown" ]
profile = "minimal"

koute commented 4 days ago

> And could it be because I am using the stable rust toolchain? I thought we didn't need the nightly toolchain anymore.

We don't. We're using the RUSTC_BOOTSTRAP magic environment variable to enable nightly features on stable (specifically for build-std).

Can you try compiling with different toolchain versions and see if it fails on all of them? I think we're using 1.77.0 on the CI (at least that's what the name of the Docker image says), so can you try with that first?
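
One way to try a specific version without editing rust-toolchain.toml is rustup's `+toolchain` override, which takes precedence over the toolchain file for that invocation. A minimal sketch, assuming rustup is installed and the build command is `cargo build --release`; the function name is hypothetical.

```shell
# Install a given toolchain with the wasm target and rust-src, then build with it.
# Assumptions: rustup manages the toolchains; build command is `cargo build --release`.
build_with_toolchain() {
    version="$1"
    rustup toolchain install "$version" \
        --target wasm32-unknown-unknown --component rust-src &&
        cargo "+$version" build --release
}
```

For example, `build_with_toolchain 1.77.0` tries the version mentioned for the CI image.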

SCJangra commented 4 days ago

> Can you try compiling with different toolchain versions and see if it fails on all of them?

Yeah, I tried multiple versions, but unfortunately I get the same issue. Maybe I am missing crate_name/std in the std features somewhere; I'll have to check all dependencies. Until then, is there any problem with shipping a runtime built with WASM_BUILD_STD=0 to production? What could go wrong with it?
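
The dependency check mentioned above could be partly automated. The following is a rough sketch, not a complete manifest parser: it assumes each dependency is declared as a one-line inline table, and it will also report dependencies that have no `std` feature at all. The function name is hypothetical.

```shell
# List dependencies declared with `default-features = false` that have no
# matching "<name>/std" (or "<name>?/std") entry anywhere in the manifest.
# Naive line-based scan: assumes one-line inline dependency tables.
missing_std_features() {
    manifest="$1"
    grep -E '^[A-Za-z0-9_-]+[[:space:]]*=.*default-features[[:space:]]*=[[:space:]]*false' "$manifest" |
        sed -E 's/^([A-Za-z0-9_-]+).*/\1/' |
        while read -r dep; do
            # Print the dependency name when no std feature entry references it.
            grep -Eq "\"$dep\??/std\"" "$manifest" || echo "$dep"
        done
}
```

Running `missing_std_features runtime/Cargo.toml` (the path is an example) prints one dependency name per line; each hit still needs a manual look to confirm that the crate actually exposes a `std` feature.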

koute commented 4 days ago

Should be fine; the build-std was mostly added to fix this issue, but if you're not seeing it then technically you shouldn't need it.

SCJangra commented 3 days ago

> but if you're not seeing it then technically you shouldn't need it.

Got it