BUG - Bajun Parachain fails to produce blocks: runtime panic - set_validation_data inherent needs to be present in every block!

clangenb commented 6 months ago

Describe the bug

The Bajun parachain has been onboarded, but our collators panic when they try to produce a block (see the logs).

I thought we might have to use the wasm-executor because we didn't properly increase the runtime-spec version for the paseo genesis, but it doesn't help: https://github.com/ajuna-network/bajun-parachain/tree/cl/debug-paseo.

I have read in other issues that this could be related to async backing. However, our most recent collator, based on polkadot-v0.10.0 and ready for async backing, also fails to produce blocks with the same error.

The runtime was built with this code:

https://github.com/ajuna-network/bajun-parachain/blob/97a9fa62ff3349f88e1f11dc5de9772b4dce2392/runtime/bajun/src/lib.rs

hbulgarini commented 6 months ago

Thanks for the information. Just to make doubly sure: were the genesis state and the wasm code you shared a fresh state, or did you migrate them from another testnet such as Rococo?

clangenb commented 6 months ago

Yes, clean state, with the state and wasm taken from here, generated at that commit: https://github.com/ajuna-network/bajun-parachain/tree/97a9fa62ff3349f88e1f11dc5de9772b4dce2392/resources/bajun/westend

clangenb commented 6 months ago

We partly initialized it with that state because we wanted to test a runtime upgrade to our most recent release. In the meantime we have already done this on production, so we can also reinitialize the head and runtime to a more recent state, but I would like to understand what is going wrong for the future.

hbulgarini commented 6 months ago

Did you change "relay_chain": "westend" to paseo? That value should match the relay chain you are connecting to.
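A quick way to check this before registering is to read the field straight from the chain-spec JSON. The following is a minimal sketch, assuming the spec exposes a top-level "relay_chain" key as quoted above; the helper name and the trimmed example spec (including its "name" value) are made up for illustration.

```python
import json


def relay_chain_matches(chain_spec_json: str, expected_relay: str) -> bool:
    """Return True if the spec's top-level "relay_chain" equals expected_relay.

    Assumes the chain spec is plain JSON with a top-level "relay_chain" key.
    """
    spec = json.loads(chain_spec_json)
    return spec.get("relay_chain") == expected_relay


# Trimmed, illustrative chain spec; a real spec carries many more fields.
example_spec = '{"name": "example", "relay_chain": "westend", "para_id": 2119}'

# A spec still pointing at westend fails the check for paseo.
print(relay_chain_matches(example_spec, "paseo"))  # False
```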

clangenb commented 6 months ago

Yes, sorry for the confusion. We used that chain-spec and created the one for paseo here: https://github.com/ajuna-network/bajun-parachain/commit/4b35580295b49f1d0854e4ee411cdfad660269c8

hbulgarini commented 6 months ago

We partly initialized it with that state because we wanted to test a runtime upgrade to our most recent release. In the meantime we have already done this on production, so we can also reinitialize the head and runtime to a more recent state, but I would like to understand what is going wrong for the future.

Where was this state taken from? Was it from Rococo? If so, you have a state mismatch between the block headers expected on different relay chains. Why not start completely empty, then set some state manually, and finally test the upgrade? Would that be a feasible way to just bring the parachain alive?

al3mart commented 6 months ago

Is there anything in the logs that hints at what should be addressed?

hbulgarini commented 6 months ago

Hey @clangenb! Are there any updates on this?

clangenb commented 6 months ago

Yeah, sorry, I am off this week, hence the slow response. We no longer have any incentive to get the currently registered head and state to work. We will initialize the chain with a new one next week when I am back. As we have access to the parachain manager account, is there anything we can't do on our own?

al3mart commented 6 months ago

We will remove the manager lock so you can handle setting the new head and code with the manager account.

educlerici-zondax commented 6 months ago

Hey @clangenb, just checking in to see how things are going with the new chain setup you mentioned last week. Any updates now that you're back? Let me know if there’s anything you need help with or something we should chat about.

clangenb commented 6 months ago

I have just been looking into what is necessary, so my understanding of the flow is:

1. Call set_current_head with the parachain manager
2. Call schedule_code_upgrade with the parachain manager
3. Run the updated parachain?

Is this correct?
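To make the first two steps concrete, here is a minimal sketch that only prepares the two call descriptions from hex-encoded head and wasm text; it does not sign or submit anything. The `RegistrarCall` type, the helper names and the parameter names (`para`, `new_head`, `new_code`) are assumptions for illustration and should be checked against the relay chain's metadata. 2119 is the para ID mentioned further down in this thread.

```python
from dataclasses import dataclass

PARA_ID = 2119  # para ID referenced later in this thread


@dataclass(frozen=True)
class RegistrarCall:
    """Hypothetical description of one call made by the parachain manager."""
    function: str
    params: dict


def normalize_hex(text: str) -> str:
    """Strip whitespace, ensure a 0x prefix, and reject malformed hex."""
    value = text.strip()
    if not value.startswith("0x"):
        value = "0x" + value
    bytes.fromhex(value[2:])  # raises ValueError if the hex is invalid
    return value


def plan_reinit(head_hex: str, code_hex: str) -> list:
    """Return the two calls in the order described above: head, then code."""
    return [
        RegistrarCall("set_current_head",
                      {"para": PARA_ID, "new_head": normalize_hex(head_hex)}),
        RegistrarCall("schedule_code_upgrade",
                      {"para": PARA_ID, "new_code": normalize_hex(code_hex)}),
    ]


# Placeholder payloads, not real Bajun data.
for call in plan_reinit("00ff", "0x0061736d"):
    print(call.function, sorted(call.params))
```

Validating the hex up front is meant to catch a truncated or mixed-up state/wasm export before anything is sent to the relay chain.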

al3mart commented 6 months ago

Yeah, that is correct :+1:

clangenb commented 6 months ago

I just tried to set the new head with our parachain manager, but I got a bad origin error. I then tried to fetch the parachain info for our parachain to check the locked status, and I see that it is empty. Can you explain that to me?

al3mart commented 6 months ago

I see. It seems that the parachain was onboarded using the sudoWrapper pallet, which skips writing certain information on-chain, such as this entry.

I am happy to help re-onboard the chain so we can get everything in place. It would take a couple of hours to complete.

Is that OK?

clangenb commented 6 months ago

Ok sure, thanks a lot!

So the new state data would be:

al3mart commented 5 months ago

@clangenb 2119 is re-onboarded and unlocked. Please feel free to try it out again as your time allows.

clangenb commented 5 months ago

I did reset the state and the head (I guess I shouldn't have done that. I actually didn't check if you updated the previous head/wasm when you re-onboarded, I just wanted to go through the parachain manager myself to see how it works). Regardless, all seemed well according to this:

But my parachain still keeps complaining that the code doesn't match. So I looked at the on-chain storage and found that the current code is still the old one, and that the code upgrade is scheduled at a relay chain block which is in the past:

Can you explain this to me?
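For reference, this is roughly how the local wasm can be compared with what the relay chain reports. It is a minimal sketch, assuming the relay chain identifies validation code by the BLAKE2b-256 hash of the uploaded bytes and that the on-chain hash has been read separately (for example from the paras pallet storage); the function names are made up for illustration.

```python
import hashlib


def validation_code_hash(code: bytes) -> str:
    """BLAKE2b-256 of the code bytes, as 0x-prefixed lowercase hex.

    Assumption: this matches how the relay chain hashes validation code.
    """
    return "0x" + hashlib.blake2b(code, digest_size=32).hexdigest()


def code_is_current(local_code: bytes, onchain_hash: str) -> bool:
    """True if the local code hashes to the hash reported on-chain."""
    return validation_code_hash(local_code) == onchain_hash.strip().lower()


# Placeholder bytes standing in for the real runtime blob.
local = b"\x00asm placeholder"
print(code_is_current(local, validation_code_hash(local)))      # True
print(code_is_current(b"old code", validation_code_hash(local)))  # False
```

If the check fails against the current code hash but passes against a pending one, the upgrade has been scheduled but not yet enacted.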

hbulgarini commented 5 months ago

Ok sure, thanks a lot!

So the new state data would be:

bajun-paseo.state.txt

bajun_runtime-v401.compact.compressed.wasm.txt

@clangenb, for simplicity's sake I updated the code and the state in the relay chain. Could you please check now? Maybe try restarting the collators?

clangenb commented 5 months ago

Thank you for your quick action! Now that I look at the files, I realize that I used a bad combination of state/wasm though. I am very sorry about that. I have to ask you to do the same thing again:

bajun-paseo.state.txt bajun-paseo.wasm.txt

Thank you for your efforts, and I humbly apologize.

al3mart commented 5 months ago

@clangenb updated

clangenb commented 5 months ago

Awesome. Thanks for your continued support. We are producing blocks now, and I will close the issue. 👍

Do you happen to know what was wrong with the parachain manager approach?

paseo-network / support

BUG - Bajun Parachain fails to produce blocks: runtime panick - set_validation_data inherent needs to be present in every block! #58