paseo-network / support

Support tasks for Paseo network

Integritee parachain stalled. #34

Closed. hbulgarini closed this 6 months ago

hbulgarini commented 8 months ago

The Integritee parachain on Paseo (2015) is currently stalled and not producing blocks. The Integritee team has acknowledged that this happened after a reserve transfer from the Paseo relay chain to the parachain.

They provided the logs: https://gist.github.com/brenzi/e3869b59f7eb61fe74a8a56f188d8039

Logs from a validator were also provided: mar23.txt

al3mart commented 7 months ago

After deploying Pop Network and testing reservedTransfers from the relay to the parachain we haven't seen any errors on our side.

We were able to have PAS as the native token for Pop Network, with the only issuance being the one coming from these reserve transfers.

An example of a call to do this:

An event showing the message being processed without errors: https://polkadot.js.org/apps/?rpc=wss%3A%2F%2Frpc1.paseo.popnetwork.xyz#/explorer/query/0x994933f318af2ba97c065729a9b02657b734bf3328a2731a658ab4c830411aef

Our xcm config: https://github.com/r0gue-io/pop-node/blob/main/runtime/testnet/src/xcm_config.rs

hbulgarini commented 7 months ago

The issue was solved by asking the Integritee team to upgrade their collators to v1.9.0. Apparently, something is not working as expected with collators running v1.6.0.

brenzi commented 7 months ago

A few more observations:

bkchr commented 7 months ago

> They provided the logs: https://gist.github.com/brenzi/e3869b59f7eb61fe74a8a56f188d8039

The logs show 2 blocks being produced, not sure how this is related to the chain being stalled.

brenzi commented 7 months ago

For context: that log shows the last block production ever. It is the absence of any error message which worries me.

Block 6794 will never get finalized by the relay chain.

brenzi commented 7 months ago

After resetting the parachain to genesis, we retried the same. More verbose logs here: https://gist.github.com/brenzi/5b4daffbf58f2e21a8257aacd467202e

Block 538 never gets finalized.

bkchr commented 7 months ago

@hbulgarini @al3mart it would be nice if you could maybe reproduce this again.

fetch_pov_job err=FetchPoV(NetworkError(NotConnected)) para_id=Id(2015) pov_hash=0x24a699a8fd415d0b06258d19bbad092c4f17f3ba4dd18984b22079d28bf31575 authority_id=Public(a05549b2b27de363328ad064c93a39bce025eb927d412d42db84fa4f6b66c040 (14dDzr5B...))

These messages are suspicious. I would need parachains=debug logs from the collator and the validators.
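To pick these messages out of a long validator log, a small filter can help. The sketch below is only an illustration: the names `find_fetch_pov_failures` and `FetchPovFailure` are hypothetical, and the pattern assumes the fields appear in the same order and format as in the `fetch_pov_job` line quoted above.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional


class FetchPovFailure(NamedTuple):
    """One parsed `fetch_pov_job` error line (hypothetical helper type)."""
    error: str
    para_id: int
    pov_hash: str


# Assumes the layout seen in the quoted log line:
#   fetch_pov_job err=<error> para_id=Id(<n>) pov_hash=0x<hex> ...
_FETCH_POV_RE = re.compile(
    r"fetch_pov_job\s+err=(?P<error>\S+)\s+"
    r"para_id=Id\((?P<para_id>\d+)\)\s+"
    r"pov_hash=(?P<pov_hash>0x[0-9a-fA-F]+)"
)


def find_fetch_pov_failures(
    lines: Iterable[str], para_id: Optional[int] = None
) -> Iterator[FetchPovFailure]:
    """Yield parsed PoV fetch failures, optionally limited to one para ID."""
    for line in lines:
        match = _FETCH_POV_RE.search(line)
        if match is None:
            continue  # not a fetch_pov_job error line
        failure = FetchPovFailure(
            error=match["error"],
            para_id=int(match["para_id"]),
            pov_hash=match["pov_hash"],
        )
        if para_id is None or failure.para_id == para_id:
            yield failure
```

Calling `find_fetch_pov_failures(open("validator.log"), para_id=2015)` (the file name is a placeholder) would list only the failures for the Integritee parachain, which makes it easier to line them up against the collator's logs.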

hbulgarini commented 7 months ago

> @hbulgarini @al3mart it would be nice if you could maybe reproduce this again.
>
> fetch_pov_job err=FetchPoV(NetworkError(NotConnected)) para_id=Id(2015) pov_hash=0x24a699a8fd415d0b06258d19bbad092c4f17f3ba4dd18984b22079d28bf31575 authority_id=Public(a05549b2b27de363328ad064c93a39bce025eb927d412d42db84fa4f6b66c040 (14dDzr5B...))
>
> These messages are suspicious. I would need parachains=debug logs from the collator and the validators.

Sure. We need some coordination here:

We will reach out to you when this is ready so we can reproduce the issue.

(Reopening the issue to track the error.)

bkchr commented 7 months ago

What does closing this issue mean, @educlerici-zondax? AFAIK @hbulgarini wanted to reproduce this?

al3mart commented 7 months ago

Yeah, I believe we should not close this for now. We were waiting for some validators to enable certain logs this week, so we should be able to get things going soonish, hopefully.

hbulgarini commented 7 months ago

> What does closing this issue mean, @educlerici-zondax? AFAIK @hbulgarini wanted to reproduce this?

Indeed, it was closed by mistake during a board cleanup session.

As an update: I'm afraid we haven't yet found anyone to deploy the collators.

@brenzi, given that you have the best understanding of this, do you think you could deploy the 1.6 collators and try to reproduce the error together sometime this week?

hbulgarini commented 6 months ago

Closing for now as the issue was not reproducible.