onflow / flow-go

A fast, secure, and developer-friendly blockchain built to support the next generation of games, apps, and the digital assets that power them.
GNU Affero General Public License v3.0

[Access] Test Execution Data db pruning functionality on Access/Observer nodes #6002

Closed peterargue closed 3 hours ago

peterargue commented 1 month ago

Problem Definition

The Execution Data db currently has support for pruning: https://github.com/onflow/flow-go/blob/25374df29468ae28971873743a216139be57b814/module/executiondatasync/pruner/pruner.go#L38

However, this system has never been used in production, and was only ever supported on execution nodes (ENs): https://github.com/onflow/flow-go/blob/fd98915bd84daee51cd7d325020a066c11608cab/cmd/execution_builder.go#L914-L960

It should work as is with some bootstrapping on Access and Observer nodes.

Proposed Solution

Update Access and Observer nodes to support execution data pruning with a configurable height range. Test that pruning works as expected against the current testnet and/or mainnet data (this will be easiest using an observer).

Hopefully, this works. If not, create issues to address anything that comes up.
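To make the "configurable height range" concrete, here is a minimal sketch of how a prune height could be derived from the latest fulfilled height. It is not the code in `pruner.go`. The type `pruneConfig`, the function `nextPruneHeight`, and the field names are hypothetical and used only for illustration. The assumed rule is: keep the most recent `HeightRangeTarget` heights, and only prune once at least `Threshold` extra heights have accumulated beyond that.

```go
package main

import "fmt"

// pruneConfig is a hypothetical stand-in for the pruner's settings.
type pruneConfig struct {
	HeightRangeTarget uint64 // number of most recent heights to retain
	Threshold         uint64 // extra heights to accumulate before pruning again
}

// nextPruneHeight returns the height to prune up to, and false when no
// pruning is due. lastFulfilled is the latest height whose execution data
// is fully available; lastPruned is the height of the previous prune.
func nextPruneHeight(cfg pruneConfig, lastFulfilled, lastPruned uint64) (uint64, bool) {
	if lastFulfilled <= lastPruned {
		return 0, false // nothing new has been fulfilled since the last prune
	}
	if lastFulfilled-lastPruned <= cfg.HeightRangeTarget+cfg.Threshold {
		return 0, false // still within the retained range plus the threshold
	}
	return lastFulfilled - cfg.HeightRangeTarget, true
}

func main() {
	cfg := pruneConfig{HeightRangeTarget: 1000, Threshold: 100}
	for _, fulfilled := range []uint64{500, 1100, 1101, 2500} {
		h, ok := nextPruneHeight(cfg, fulfilled, 0)
		fmt.Printf("fulfilled=%d prune=%v pruneHeight=%d\n", fulfilled, ok, h)
	}
}
```

With a small range like this, a test against an observer can confirm quickly that data below the prune height disappears while the retained range stays readable.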

Definition of Done

Note: Since mainnet and testnet are still using the v0.33.* release track, create your branch from the v0.33 branch. You can backport the changes to master after testing is complete.

peterargue commented 1 month ago

I dug into the code a bit to get a better understanding of how it all works. One additional change that will be needed is a call on the Access/Observer nodes to set the latest "fulfilled" height in the tracker. The pruner uses this value as the "latest" height when choosing the prune height. Here's where it's set on the execution node: https://github.com/onflow/flow-go/blob/4a857bca8d533d478dbafd2aa01007a2d8bc386c/module/executiondatasync/provider/provider.go#L142-L144

That's called within the Provider engine, which is only used on execution nodes, when a new block is executed. On Access nodes, we'll need to add the call after the requester finishes downloading the next lowest height. The simplest way to do it is probably to add it as a subscriber to the requester: https://github.com/onflow/flow-go/blob/4a857bca8d533d478dbafd2aa01007a2d8bc386c/module/state_synchronization/requester/distributer.go#L22
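A rough sketch of the subscriber idea, using local stand-in types rather than the real flow-go interfaces: `heightTracker`, `distributor`, `AddConsumer`, and `Notify` are hypothetical names, and the rule that the fulfilled height never moves backwards is an assumption made for the example. The real tracker and distributor signatures should be checked against the linked files.

```go
package main

import (
	"fmt"
	"sync"
)

// heightTracker is a hypothetical stand-in for the execution data tracker.
type heightTracker struct {
	mu        sync.Mutex
	fulfilled uint64
}

// SetFulfilledHeight records the latest height whose execution data is
// fully downloaded. Rejecting lower heights is an assumption of this sketch.
func (t *heightTracker) SetFulfilledHeight(h uint64) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if h < t.fulfilled {
		return fmt.Errorf("height %d is below fulfilled height %d", h, t.fulfilled)
	}
	t.fulfilled = h
	return nil
}

// FulfilledHeight returns the last recorded fulfilled height.
func (t *heightTracker) FulfilledHeight() uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.fulfilled
}

// distributor is a hypothetical stand-in for the requester's distributor:
// it fans out a notification for each height the requester has finished.
type distributor struct {
	mu        sync.RWMutex
	consumers []func(height uint64)
}

// AddConsumer registers a callback for newly downloaded heights.
func (d *distributor) AddConsumer(c func(height uint64)) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.consumers = append(d.consumers, c)
}

// Notify calls every registered consumer with the given height.
func (d *distributor) Notify(height uint64) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	for _, c := range d.consumers {
		c(height)
	}
}

func main() {
	tracker := &heightTracker{}
	dist := &distributor{}

	// Subscribe the tracker update so the pruner can see the latest height.
	dist.AddConsumer(func(height uint64) {
		if err := tracker.SetFulfilledHeight(height); err != nil {
			fmt.Println("could not set fulfilled height:", err)
		}
	})

	for h := uint64(1); h <= 3; h++ {
		dist.Notify(h) // simulates the requester finishing height h
	}
	fmt.Println("fulfilled height:", tracker.FulfilledHeight())
}
```

Registering the update as a consumer keeps the requester itself unchanged, which is why the subscriber route looks like the simplest option.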