bobanetwork / boba-v3

MIT License

Implement getProof method + push upstream #6

Open wsdt opened 2 years ago

mmontour1306 commented 2 years ago

The corresponding upstream Erigon issue is https://github.com/ledgerwatch/erigon/issues/1349

wsdt commented 2 years ago

The issue with this function is that StateAndHeaderByNumberOrHash, which it relies on, doesn't exist in Erigon.

quote:

 this is the rub. In Erigon, there is no data structure that allows you to retrieve state by BlockNr/Hash. 
It only maintains one instance of the state, which is latest/current, and reverse diffs that allow accessing 
individual parts of the state "as of" certain block number. Gathering large parts of the state trie necessary 
for eth_getProof (because it needs to produce hashes of sub-trees) would require reconstitution of historical 
state, which is expensive in RAM and CPU. I tried to prototype it in the past, but gave up. When we get to 
erigon2 upgrade 3, there may be a way to do it more efficiently, but not "for free". It will most likely come 
at a cost of a lot of extra disk footprint

InoMurko commented 2 years ago

quote:

When we get to erigon2 upgrade 3, there may be a way to do it more efficiently, but not "for free".

What is introduced in Upgrade 3 to make this more efficient? Any clues?

wsdt commented 2 years ago

Upgrade 2

Using experience with Upgrade 1, this upgrade is likely to improve the format for static files, with more emphasis on encoding of monotonic integer sequences (e.g. Elias-Fano)

Nodes download, then seed history of state, as well as indices for event logs and call traces, in addition to all the things from Upgrade 1. Big difference from Erigon 1 here is that the granularity of indices is changed to per-transaction, which is likely to improve performance of most historical queries (especially trace_filter). Further on, they automatically produce and seed new static files for state history, event log indices, and call trace indices, meaning that centralised seeder servers will only be required to bootstrap the swarms.

Full replay from genesis is still required to compute the state. However, because most of the history, event logs and call traces are already downloaded, the initial full replay will happen slightly faster. There may also be more simple techniques that use “benefits of the hindsight” to speed up the state computation.

Upgrade 3

Using experience with Upgrade 2, this upgrade is likely to improve the format for static files, with more emphasis on encoding the intermediate commitments, such as patricia trees (hexary and binary), and B+trees, with flexible choice of hash functions.

Nodes download reasonably recent state as a composition of static files, and only use replay to apply recent changes. As with other types of data, further on, new files are automatically produced and seeded. A new complexity here is that static files for state will sometimes need to be removed, as they are getting merged into larger files.

Source: https://erigon.substack.com/p/erigon-2-three-upgrades

InoMurko commented 2 years ago

Is there any work in progress for Upgrade 3 (PRs)?

wsdt commented 2 years ago

Erigon is at 2.1 (upgrade 1) at the moment. 2.2 is at the integration phase, and 2.3 is at the prototyping stage.

Upgrade 2, open PRs:

Otherwise no, doesn't look like it, they are still working on 2.2.

wsdt commented 2 years ago

Further links for reference:

mmontour1306 commented 2 years ago

Would it be possible to implement eth_getProof for the special case of block='latest' (i.e. only using the current state)? That might be good enough for us (at least to support withdrawals; I'm not sure what else Bedrock is using the proofs for).
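
For reference, a minimal sketch of what such a request could look like on the wire. The parameter order (address, storage keys, block tag) follows the eth_getProof definition in EIP-1186; the helper name buildGetProofRequest and the address used in main are illustrative assumptions, not code from Erigon or op-node.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a plain JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string        `json:"jsonrpc"`
	ID      int           `json:"id"`
	Method  string        `json:"method"`
	Params  []interface{} `json:"params"`
}

// buildGetProofRequest is a hypothetical helper that encodes an
// eth_getProof call. blockTag would be "latest" for the special case
// discussed above.
func buildGetProofRequest(id int, address string, storageKeys []string, blockTag string) ([]byte, error) {
	if storageKeys == nil {
		// Encode "no storage keys" as [] rather than null.
		storageKeys = []string{}
	}
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "eth_getProof",
		Params:  []interface{}{address, storageKeys, blockTag},
	}
	return json.Marshal(req)
}

func main() {
	// Placeholder address, assumed here for illustration only.
	body, err := buildGetProofRequest(1, "0x4200000000000000000000000000000000000016", nil, "latest")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

A server that only supports the current state could reject any block tag other than "latest" with an explicit error instead of returning a wrong proof.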

wsdt commented 2 years ago

The issue with this function is that the StateAndHeaderByNumberOrHash doesn't exist in Erigon.

quote:

this is the rub. In Erigon, there is no data structure that allows you to retrieve state by BlockNr/Hash. 
It only maintains one instance of the state, which is latest/current, and reverse diffs that allow accessing 
individual parts of the state "as of" certain block number. Gathering large parts of the state trie necessary 
for eth_getProof (because it needs to produce hashes of sub-trees) would require reconstitution of historical 
state, which is expensive in RAM and CPU. I tried to prototype it in the past, but gave up. When we get to 
erigon2 upgrade 3, there may be a way to do it more efficiently, but not "for free". It will most likely come 
at a cost of a lot of extra disk footprint

Given this, I think supporting only block='latest' should be possible, yes! I will try it; it may even be the better option considering their performance concerns.

mmontour1306 commented 2 years ago

Optimism's op-node is calling GetProof for the predeploys.L2ToL1MessagePasserAddr address. We could potentially pre-compute this proof as each new block is mined, and then store it in a custom database table keyed by block number. This would then let us support proof requests for historical blocks as well as for the current state, as long as they're only asking for that address. We could remove the temporary code if/when upstream Erigon is able to fully support eth_getProof.

wsdt commented 2 years ago

The function stagedsync.SpawnIntermediateHashesStage(stageState, nil, nil, trieConfig, ctx) should be able to calculate proofs, depending on the config. Normally it would only calculate the root.