@Stebalien identified two main problems that migrations of sector infos will face in the near future given post-hyperdrive growth, and we talked through some solutions.
Problems
1. Redoing the full sectors AMT migration whenever the root cid does not match (the current approach) will be too slow.
2. The current approach of migrating the whole state tree and flushing to disk only after the migration is complete will exceed reasonable memory requirements.
Addressing first problem
Unfortunately we can't use the sector numbers bitfield to help with (1) because it does not identify deletions from the sectors AMT. These essentially never happen but are not disallowed. However, the recent Diff function over AMTs should work fine as long as the sectors AMT bitwidth is not changed.
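As a rough illustration of why a diff catches what the bitfield cannot, here is a minimal Go sketch. It models a sectors AMT as a plain map from sector number to sector info cid. SectorSet, Changes, DiffSectors and ApplyDiff are hypothetical names made up for this note, not the actual AMT Diff API, and a real diff would walk AMT nodes rather than maps.

```go
package main

import "sort"

// SectorSet is a hypothetical stand-in for a sectors AMT:
// sector number -> cid of the sector info (as a plain string).
type SectorSet map[uint64]string

// Changes lists the sector numbers that differ between two sector sets.
type Changes struct {
	Added    []uint64
	Modified []uint64
	Removed  []uint64
}

func sortU64(s []uint64) {
	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
}

// DiffSectors compares the sectors seen by the previous premigration (prev)
// with the current sectors (cur). Unlike a bitfield of allocated sector
// numbers, it also reports sectors that were deleted.
func DiffSectors(prev, cur SectorSet) Changes {
	var ch Changes
	for n, c := range cur {
		old, ok := prev[n]
		switch {
		case !ok:
			ch.Added = append(ch.Added, n)
		case old != c:
			ch.Modified = append(ch.Modified, n)
		}
	}
	for n := range prev {
		if _, ok := cur[n]; !ok {
			ch.Removed = append(ch.Removed, n)
		}
	}
	sortU64(ch.Added)
	sortU64(ch.Modified)
	sortU64(ch.Removed)
	return ch
}

// ApplyDiff updates the output of an earlier migration in place,
// re-migrating only the sectors that changed since then.
func ApplyDiff(migrated, cur SectorSet, ch Changes, migrate func(string) string) {
	for _, n := range ch.Added {
		migrated[n] = migrate(cur[n])
	}
	for _, n := range ch.Modified {
		migrated[n] = migrate(cur[n])
	}
	for _, n := range ch.Removed {
		delete(migrated, n)
	}
}
```

With this shape, the cost of a repeated premigration scales with the number of changed sectors rather than with the total number of sectors.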
Addressing second problem
To address (2), the idea is to introduce memory management methods in the migration function. The blockstore-like object passed to the migration would have a Flush(c cid.Cid) method which writes the blocks below the root cid c to disk and then removes them from memory. The first problem to solve is the case where the total amount of actor state, i.e. sector infos, does not fit in memory. If the state is spread over enough miners that each miner's state tree fits in memory, this can be solved by calling Flush on both the original actor Head cid (to clear it from memory) and the migrated actor Head cid (to persist it to disk and clear it from memory) after migrating that actor.
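A minimal Go sketch of what such a store and the per-actor flushing could look like. Everything here is an assumption made for illustration: CID is a string stand-in for cid.Cid, Block is a toy node with explicit links, the "disk" is just a second map, and FlushingStore and MigrateAll are hypothetical names. Flush also stops descending at nodes that are not in memory, which assumes children are only buffered together with their parent.

```go
package main

// CID is a hypothetical stand-in for cid.Cid.
type CID string

// Block is a toy IPLD-like node: some data plus links to child blocks.
type Block struct {
	Data  []byte
	Links []CID
}

// FlushingStore sketches the blockstore-like object passed to the migration.
type FlushingStore struct {
	mem  map[CID]Block // blocks buffered or cached in memory
	disk map[CID]Block // stands in for the on-disk blockstore
}

func NewFlushingStore() *FlushingStore {
	return &FlushingStore{mem: map[CID]Block{}, disk: map[CID]Block{}}
}

// Put buffers a newly written block in memory.
func (s *FlushingStore) Put(c CID, b Block) { s.mem[c] = b }

// Get reads a block, caching disk reads in memory.
func (s *FlushingStore) Get(c CID) (Block, bool) {
	if b, ok := s.mem[c]; ok {
		return b, true
	}
	b, ok := s.disk[c]
	if ok {
		s.mem[c] = b
	}
	return b, ok
}

// Flush writes every in-memory block at or below root to disk and then
// removes it from memory.
func (s *FlushingStore) Flush(root CID) {
	b, ok := s.mem[root]
	if !ok {
		return // already flushed, or never loaded
	}
	for _, l := range b.Links {
		s.Flush(l)
	}
	s.disk[root] = b
	delete(s.mem, root)
}

// InMemory reports how many blocks are currently held in memory.
func (s *FlushingStore) InMemory() int { return len(s.mem) }

// MigrateAll migrates one actor at a time and flushes both the original
// and the migrated Head afterwards, so memory holds at most one actor's state.
func MigrateAll(s *FlushingStore, heads []CID, migrate func(*FlushingStore, CID) CID) []CID {
	out := make([]CID, 0, len(heads))
	for _, h := range heads {
		newHead := migrate(s, h)
		s.Flush(h)       // original: clear from memory
		s.Flush(newHead) // migrated: persist to disk and clear from memory
		out = append(out, newHead)
	}
	return out
}
```

In this toy model, flushing the original Head rewrites blocks that are already on disk. A real implementation would presumably just evict them.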
The next problem to solve is the case where a single miner's sector tree does not fit in memory. In this case the miner migration function will need to:
- have a mechanism for keeping track of memory in use, either by heuristics on AMT inserts and writes or by asking the go runtime
- flush the sectors AMT and call Flush(sectorsRoot) whenever memory usage gets too high
This will involve some annoying writing of unneeded intermediate nodes, and some intermediate nodes will be removed from memory even though they are needed immediately after the flush to continue the AMT traversal, but it should work well enough.
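The two steps above could be sketched in Go roughly as follows. This is only an illustration under stated assumptions: BudgetedMigrator and its fields are hypothetical, Set stands in for an insert into the new sectors AMT, and FlushRoot stands in for "flush the sectors AMT, then call Flush(sectorsRoot)". The default usage estimate is a simple heuristic (bytes inserted since the last flush). heapInUse shows the alternative of asking the go runtime via runtime.ReadMemStats, which briefly stops the world and so is probably best polled sparingly.

```go
package main

import "runtime"

// heapInUse asks the Go runtime how many bytes of heap objects are allocated.
func heapInUse() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// BudgetedMigrator is a hypothetical helper that flushes the sectors AMT
// whenever the estimated memory use reaches Budget.
type BudgetedMigrator struct {
	Set       func(sectorNo uint64, info []byte) error // insert into the new sectors AMT
	FlushRoot func() error                             // flush the AMT, then Flush(sectorsRoot)
	Budget    uint64                                   // threshold in bytes
	Usage     func() uint64                            // optional, e.g. heapInUse

	pending uint64 // heuristic: bytes inserted since the last flush
	Flushes int    // number of flushes performed so far
}

// Put inserts one migrated sector and flushes if the budget is reached.
func (m *BudgetedMigrator) Put(sectorNo uint64, info []byte) error {
	if err := m.Set(sectorNo, info); err != nil {
		return err
	}
	m.pending += uint64(len(info))
	used := m.pending
	if m.Usage != nil {
		used = m.Usage()
	}
	if used < m.Budget {
		return nil
	}
	return m.flush()
}

// Finish flushes whatever is still buffered at the end of the migration.
func (m *BudgetedMigrator) Finish() error {
	if m.pending == 0 {
		return nil
	}
	return m.flush()
}

func (m *BudgetedMigrator) flush() error {
	if err := m.FlushRoot(); err != nil {
		return err
	}
	m.pending = 0
	m.Flushes++
	return nil
}
```

Setting Usage to heapInUse switches from the insert-size heuristic to the runtime's own accounting. The heuristic is deterministic and cheap, while the runtime figure also counts memory unrelated to the sectors AMT.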
Possible protocol changes to make things easier
- We could do some kind of tracking of deleted sectors to make the bitfield approach to diffing sectors AMTs between premigrations work. This is probably not worth the tradeoff, since diffing is already written and should be fast.
- If we constrain the max number of sectors a miner actor can have, we would never have to address the scenario where a single miner's sectors can't fit into memory. There is already a limit, but I believe we will need to lower it to achieve this.