Open michaelsproul opened 3 years ago
Some notes for non-michael readers and myself
HeadTracker is used for two things currently:
Blocks and states are persisted in the hot DB by root. To figure out what to prune you need to build a block DAG and compute what branches are non-viable / abandoned. Some ways to do it:
`RootsIterator` requires loading a full BeaconState every 8192 slots.
The pruning routine must be crash safe. Options 2 and 3 risk "forgetting" data or causing inconsistencies like https://github.com/sigp/lighthouse/issues/4773 if not implemented properly.
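To make the "build a block DAG and compute the abandoned branches" idea concrete, here is a minimal sketch. It is not Lighthouse's pruning code: `Root` is a `u64` stand-in for a 32-byte block root, `abandoned_blocks` is a hypothetical helper, and the parent map is assumed to be acyclic and already loaded from the hot DB.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for a 32-byte block root (assumption, for readability only).
type Root = u64;

/// Given `parents` (block root -> parent root) and the newly finalized root,
/// return every block that is neither an ancestor nor a descendant of the
/// finalized block. Those blocks sit on branches that can never become
/// canonical, so they are the candidates for pruning.
fn abandoned_blocks(parents: &HashMap<Root, Root>, finalized: Root) -> HashSet<Root> {
    // Collect the finalized block and all of its ancestors.
    let mut canonical = HashSet::new();
    canonical.insert(finalized);
    let mut cur = finalized;
    while let Some(&parent) = parents.get(&cur) {
        if !canonical.insert(parent) {
            break; // already seen, stop walking
        }
        cur = parent;
    }

    let mut abandoned = HashSet::new();
    for &root in parents.keys() {
        if canonical.contains(&root) {
            continue;
        }
        // Walk towards genesis. The block is viable only if the walk reaches
        // the finalized block before reaching one of its ancestors.
        let mut cur = root;
        let mut viable = false;
        while let Some(&parent) = parents.get(&cur) {
            if parent == finalized {
                viable = true;
                break;
            }
            if canonical.contains(&parent) {
                break; // forked off before the finalized block
            }
            cur = parent;
        }
        if !viable {
            abandoned.insert(root);
        }
    }
    abandoned
}
```

Blocks whose parent is missing from the map are treated as abandoned in this sketch; a real implementation would need to decide how to handle that case.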
The current implementation uses the `RootsIterator`, which loads a full BeaconState every 8192 slots. With a state size of 100MB (likely more when represented in memory), that's about 12 KB per slot. Blinded beacon blocks are ~50KB (?), so that's not too far off. I don't see any immediate issues with this approach, other than the alternatives being more efficient. Even in the worst case, pruning runs on a background thread, so streaming a few hundred MB from the DB during long non-finality does not sound too bad. Thoughts?
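The back-of-the-envelope numbers above can be checked with a small sketch. The 100MB state size is the rough figure from this comment, and both function names are hypothetical.

```rust
/// Slots covered by the roots stored in one BeaconState.
const SLOTS_PER_STATE_LOAD: u64 = 8192;

/// Amortized bytes of state read per slot when one state is loaded
/// every `SLOTS_PER_STATE_LOAD` slots.
fn amortized_state_bytes_per_slot(state_bytes: u64) -> u64 {
    state_bytes / SLOTS_PER_STATE_LOAD
}

/// Total bytes of state read to walk back `slots` slots,
/// rounding the number of state loads up.
fn state_bytes_loaded(state_bytes: u64, slots: u64) -> u64 {
    let loads = (slots + SLOTS_PER_STATE_LOAD - 1) / SLOTS_PER_STATE_LOAD;
    loads * state_bytes
}
```

With a 100MB state, `amortized_state_bytes_per_slot(100_000_000)` gives 12207 bytes, and walking back 20000 slots of non-finality costs three state loads, i.e. 300MB.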
Using the fork-choice for pruning without holding its lock for too long may require persisting some prune helper data to the database. The desired order of operations is:
If the lock is not held for this long and the node crashes during step 3, you lose track of the branches that should be pruned, so they will never be pruned. Another approach is to:
Then in a background thread, and in an atomic operation
Here you are sort of storing a HeadTracker but only for prune data that is decoupled from the fork-choice.
We can achieve better atomicity in the HeadTracker by persisting each head to the DB individually in separate keys. Heads are inserted in the HeadTracker in the existing atomic transaction during block import. Heads are pruned during the atomic operation of hot DB pruning. No action is necessary on shutdown. The pruning routine can stream all keys in that column range to iterate the HeadTracker.
This option keeps the pruning logic the same but removes the potential inconsistencies on shutdown.
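A rough sketch of the per-key layout, under several assumptions: `KvStore` is an in-memory `BTreeMap` standing in for the hot DB, `HEAD_PREFIX` is a made-up column prefix, the stored value is the head's slot, and the parent is un-marked as a head on import. None of these names come from Lighthouse. A real store would commit each batch atomically; here `write_batch` only models that.

```rust
use std::collections::BTreeMap;

type Root = [u8; 32];

/// Hypothetical column prefix for head entries.
const HEAD_PREFIX: &[u8] = b"hed";

enum Op {
    Put(Vec<u8>, Vec<u8>),
    Delete(Vec<u8>),
}

/// In-memory stand-in for the hot DB.
#[derive(Default)]
struct KvStore {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

fn head_key(root: &Root) -> Vec<u8> {
    let mut key = HEAD_PREFIX.to_vec();
    key.extend_from_slice(root);
    key
}

/// Ops to append to the existing block-import transaction:
/// the new block becomes a head and its parent stops being one.
fn import_block_ops(block_root: Root, parent_root: Root, slot: u64) -> Vec<Op> {
    vec![
        Op::Put(head_key(&block_root), slot.to_le_bytes().to_vec()),
        Op::Delete(head_key(&parent_root)),
    ]
}

/// Ops to append to the hot DB pruning transaction.
fn prune_head_ops(abandoned_heads: &[Root]) -> Vec<Op> {
    abandoned_heads.iter().map(|r| Op::Delete(head_key(r))).collect()
}

impl KvStore {
    /// Models an atomic batch write (a real DB would commit all ops or none).
    fn write_batch(&mut self, ops: Vec<Op>) {
        for op in ops {
            match op {
                Op::Put(k, v) => {
                    self.data.insert(k, v);
                }
                Op::Delete(k) => {
                    self.data.remove(&k);
                }
            }
        }
    }

    /// Stream all keys in the head column range to recover the current heads.
    fn heads(&self) -> Vec<Root> {
        self.data
            .range::<Vec<u8>, _>(HEAD_PREFIX.to_vec()..)
            .take_while(|(k, _)| k.starts_with(HEAD_PREFIX))
            .filter(|(k, _)| k.len() == HEAD_PREFIX.len() + 32)
            .map(|(k, _)| {
                let mut root = [0u8; 32];
                root.copy_from_slice(&k[HEAD_PREFIX.len()..]);
                root
            })
            .collect()
    }
}
```

Because every head lives under its own key, there is no in-memory structure to flush on shutdown: the DB contents after the last committed batch are the head set.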
I don't mind the brute-force approach, but it would only be viable with `tree-states`. On `stable`, all the blocks are stored in the hot DB; they never get migrated to the freezer (like states do). So most DBs have something like 30GB of blocks in that column. On `tree-states`, they get migrated, compressed and indexed by slot, so the hot DB only has ~64-256 blocks on average when finality is happening (the 256 because we sometimes delay migration to avoid I/O).
I also quite like the pruning-summary & per-block head tracker. Maybe the pruning summary is the simplest for now? Or we wait until tree-states and do the brute force. I like the simplicity of not worrying about lock ordering.
More reasons to delete:
Description
The head tracker contains redundant information that is already present in the fork choice block DAG. This is undesirable as it means there isn't just "one source of truth", and the two data structures need to be kept in sync across concurrent executions, which is highly non-trivial (see #1771 for more)
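The redundancy can be illustrated with a short sketch: given only the parent links of a block DAG, the set of heads is exactly the set of blocks that no other block names as its parent. This is an illustration, not fork choice's actual API; `Root` is a `u64` stand-in and `heads_from_dag` is a hypothetical name.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for a 32-byte block root (assumption).
type Root = u64;

/// Derive the heads (leaf blocks) from parent links alone:
/// a block is a head if no block in the DAG has it as a parent.
fn heads_from_dag(parents: &HashMap<Root, Root>) -> Vec<Root> {
    let has_child: HashSet<Root> = parents.values().copied().collect();
    let mut heads: Vec<Root> = parents
        .keys()
        .copied()
        .filter(|root| !has_child.contains(root))
        .collect();
    heads.sort_unstable(); // deterministic order for callers
    heads
}
```

Anything a separate head tracker stores can therefore be recomputed from the DAG, which is why keeping both in sync adds risk without adding information.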
Steps to resolve
`HeadTracker` struct from the beacon chain
deleted in fork choice, until they are cleaned up by fork choice's own pruning mechanism