consensus-shipyard / ipc

🌳 Spawn multi-level trees of customized, scalable, EVM-compatible networks with IPC. L2++ powered by FVM, Wasm, libp2p, IPFS/IPLD, and CometBFT.
https://ipc.space
Apache License 2.0

Look back, not ahead, for cross messages #158

Open aakoshh opened 11 months ago

aakoshh commented 11 months ago

Issue type

Bug

Have you reproduced the bug with the latest dev version?

Yes

Version

v0.1.0

Custom code

Yes

OS platform and distribution

Linux

Describe the issue

See https://filecoinproject.slack.com/archives/C04JR5R1UL8/p1700126202426169

https://github.com/consensus-shipyard/fendermint/pull/433 introduced a mechanism where sync.rs looks ahead to the next block hash to find the cross messages at a given height: https://github.com/consensus-shipyard/fendermint/blob/1cdeda3bcb946633235c11967bf279e09e6ed5b3/fendermint/vm/topdown/src/sync.rs#L410-L421

This, however, is not reflected when a node is syncing with the chain and fetches synchronously: https://github.com/consensus-shipyard/fendermint/blob/1cdeda3bcb946633235c11967bf279e09e6ed5b3/fendermint/vm/interpreter/src/chain.rs#L265-L268

The crux of the issue is that if block N contains a top-down message, it will be stored in the contract at height N, but it will only be available for querying at block N+1 because of deferred execution. The syncer cheats: it stores the effects at height N by figuring out the hash of (a non-finalized) block N+1 (or the next non-null block in general) and executing a query there.

By contrast, the other version queries block N for the effects at height N, which by definition aren't available yet.
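The mismatch between the two code paths can be illustrated with a toy model of deferred execution. This is only a sketch: `ToyParent`, `include` and `query` are hypothetical names made up for illustration, not Fendermint or IPC APIs, and null blocks are ignored for simplicity.

```rust
use std::collections::BTreeMap;

/// Toy parent chain: maps a block height to the top-down messages
/// included in that block. Hypothetical type, not a Fendermint API.
#[derive(Default)]
struct ToyParent {
    msgs_in_block: BTreeMap<u64, Vec<String>>,
}

impl ToyParent {
    /// Include a top-down message in the block at `height`.
    fn include(&mut self, height: u64, msg: &str) {
        self.msgs_in_block
            .entry(height)
            .or_default()
            .push(msg.to_string());
    }

    /// Query the state as of block `at` for the messages stored under
    /// height `about`. Deferred execution is modelled by only exposing
    /// the effects of blocks strictly before `at`.
    fn query(&self, at: u64, about: u64) -> Vec<String> {
        if about < at {
            self.msgs_in_block.get(&about).cloned().unwrap_or_default()
        } else {
            Vec::new()
        }
    }
}
```

In this model, a node that queries at block N about height N sees no messages, while a node that queries at block N+1 about height N sees one. Two nodes that disagree in this way would compute different state, which is consistent with the differing `number_of_messages` values and the app hash mismatch in the log output.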

One relatively simple solution is to flip the problem. Instead of going this way:

I want the cross messages at height N, so I need to look ahead for the first non-null block hash H and query that block about N

we would go:

I want the cross messages at block hash H, so I need to look back for the last non-null block height N, and ask at hash H about height N

That is, instead of looking ahead of the finalized block N for a future block where its effects appear, we would accept that we won't know the effects of N. Instead we would look back to find the last non-null block before N, and send a query at block N asking about this previous height. This should be deterministic, in the past, available for lookup for anyone already finalizing N, and also increase whenever N increases.
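A minimal sketch of the look-back selection, assuming a predicate that reports whether a given parent height was a null round. `LookBackQuery`, `look_back_query` and `is_null` are hypothetical names for illustration only; the actual syncer would work from its own cache of parent views.

```rust
/// Hypothetical description of a parent query: execute it at block
/// `at_height` and ask about the messages stored under `about_height`.
#[derive(Debug, PartialEq)]
struct LookBackQuery {
    at_height: u64,
    about_height: u64,
}

/// Build the look-back query for a finalized, non-null block `n`.
///
/// `is_null` is an assumed predicate that returns true when the parent
/// chain has no block at the given height. Returns `None` when there is
/// no non-null block before `n`.
fn look_back_query(n: u64, is_null: impl Fn(u64) -> bool) -> Option<LookBackQuery> {
    (0..n)
        .rev() // walk backwards from n - 1
        .find(|h| !is_null(*h)) // first non-null height we meet
        .map(|prev| LookBackQuery {
            at_height: n,
            about_height: prev,
        })
}
```

With heights 3 and 4 null, finalizing block 5 yields a query at 5 about height 2, and finalizing block 6 yields a query at 6 about height 5. The result depends only on blocks at or before N, so every node finalizing N should derive the same query. The linear scan is for clarity; a real implementation would likely bound or cache it.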

Alternative solution

The other approach we discussed is that, instead of storing data in the contract (which costs gas and is expensive), we would emit events for cross messages, which are stored in the SQLite database instead of the EVM store. However, we also have to think about how we're going to do bottom-up checkpoints, where we currently look up these messages from the ledger. We know that reaching out to CometBFT is fraught with errors during startup, for example https://github.com/consensus-shipyard/fendermint/pull/426

Repro steps

At the moment we can reproduce locally by launching a node like this:

cargo make --makefile infra/Makefile.toml -e NODE_NAME=validator-3 -e PRIVATE_KEY_PATH=/home/ubuntu/.ipc/validator_3 -e SUBNET_ID=/r314159/t410fnotsxwgnxcjp5phjmgp6n3lnhxvrf3pncnm3oxq -e CMT_P2P_HOST_PORT=3305 -e CMT_RPC_HOST_PORT=3306 -e ETHAPI_HOST_PORT=8545 -e BOOTSTRAPS=b9652fcb07e91a04b82ac261eb61ae31845f814f@147.135.77.89:3305 -e PARENT_REGISTRY=0xc7068Cea947035560128a6a6F4c8913523A5A44C -e PARENT_GATEWAY=0x0341fA160C66aBB112195192aE359a6D61df45cd -e CMT_EXTERNAL_ADDR=135.148.101.41:3305 child-validator

Relevant log output

I[2023-11-16|09:46:02.428] executed block module=state height=570 num_valid_txs=1 num_invalid_txs=0
I[2023-11-16|09:46:02.485] committed state module=state height=570 num_txs=1 app_hash=0171A0E402203024AF30DDFDE78CAF46781EE131E29702D3DEBC84145C0D09C7F1C615EB3A9E
E[2023-11-16|09:46:02.491] Error in validation module=blockchain err="wrong Block.Header.AppHash. Expected 0171A0E402203024AF30DDFDE78CAF46781EE131E29702D3DEBC84145C0D09C7F1C615EB3A9E, got 0171A0E402201F8C88FF95AB953224AE190095C1A0811FF548A428A25F17230C1B6599854DAC"

This is from validator-3: chain interpreter received topdown msgs number_of_messages=0 start=1091848 end=1091848
This is from validator-4: chain interpreter received topdown msgs number_of_messages=1 start=1091848 end=1091848
cryptoAtwill commented 11 months ago

@aakoshh my take is that using events really kills two birds with one stone: it saves gas and avoids the look ahead/back completely.

aakoshh commented 11 months ago

I agree that https://github.com/consensus-shipyard/ipc-solidity-actors/pull/285 is a good alternative. I didn't know that top-down and bottom-up messages are handled completely separately when I made my comment about the "Alternative solution" in the ticket.