filecoin-project / core-devs

Technical Project Management: Meeting notes and agenda items

Revision to the nv22 timeline #166

Closed rjan90 closed 7 months ago

rjan90 commented 7 months ago

Background for the revision

During reviews of the migration code for network version 22, it became apparent that the complexities in the migration for Direct Data Onboarding need an extra round of revision before the upgrade can land with confidence and correctness.

More specifically, the difficulty is ensuring that cached migrations are handled correctly during re-orgs, which is complex in this migration. Unfortunately, relying solely on a non-cached migration, which is a lot more straightforward, isn't viable: benchmarks put the non-cached migration at around 10 minutes, which would not be acceptable.
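To illustrate the trade-off described above, here is a minimal sketch of the general idea behind a cached migration. It is not the actual nv22 migration code. It assumes each actor's migration is a pure function of its old state and that cache entries are keyed by a content hash of that input; `content_key`, `MigrationCache`, `migrate_actor`, and `migrate_tree` are hypothetical names used only for this example.

```python
import hashlib
from typing import Callable, Dict


def content_key(actor_state: bytes) -> str:
    """Hypothetical stand-in for the content identifier of an actor's state."""
    return hashlib.sha256(actor_state).hexdigest()


class MigrationCache:
    """Memoizes migrated outputs, keyed by the content key of the input."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}
        self.misses = 0  # number of times the migration had to be computed

    def load(self, key: str, compute: Callable[[], bytes]) -> bytes:
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = compute()
        return self._entries[key]


def migrate_actor(old_state: bytes) -> bytes:
    """Hypothetical per-actor migration: a pure function of the old state."""
    return b"v2:" + old_state


def migrate_tree(state_tree: Dict[str, bytes], cache: MigrationCache) -> Dict[str, bytes]:
    """Migrate every actor, reusing cached results for unchanged inputs."""
    return {
        addr: cache.load(content_key(state), lambda s=state: migrate_actor(s))
        for addr, state in state_tree.items()
    }
```

In this sketch, a pre-migration run on one tipset warms the cache. If a re-org then changes the parent state, only the actors whose state differs are recomputed, and the output matches what a fresh, non-cached run would produce because the cache never serves an entry for a different input.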

Tentative proposal for revised timeline

The implementer teams are currently aligning on a new proposed timeline, which would revise the dates as follows:

Forest and Lotus have aligned that these dates sound okay (link to discussion in #fil-implementers thread), but we are awaiting a final okay from Venus. Please use the above timeline as a heads-up and guidance for now. I expect that we will land on a final proposal to revise the timeline no later than 2024-02-19 - 13:00:00Z.

jennijuju commented 7 months ago

Thanks @rjan90!

lemmih commented 7 months ago

It is worth calling out that the network snapshot service must have lotus & forest state validation check implemented before this upgrade to ensure a healthy chain snapshot that is delivered to the users post upgrade.

Could you elaborate on these state validation checks? Are these general sanity checks, or are they also NV22-specific?

jennijuju commented 7 months ago

It is worth calling out that the network snapshot service must have lotus & forest state validation check implemented before this upgrade to ensure a healthy chain snapshot that is delivered to the users post upgrade.

Could you elaborate on these state validation checks? Are these general sanity checks, or are they also NV22-specific?

I think in general snapshot services should implement state checks across lotus and forest nodes before publishing a snapshot. And I think it’s even more critical for post-upgrade snapshots, given that the chain could be more reorg-y and nodes are more prone to state mismatches after a heavy migration.
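A minimal sketch of what such a cross-implementation check could look like, assuming the snapshot service has already collected the state root each node reports per epoch. How those roots are fetched from lotus or forest is out of scope here; `find_state_mismatches` and `safe_to_publish` are hypothetical names, and the root strings in the usage example are placeholders, not real CIDs.

```python
from typing import Dict, List, Optional


def find_state_mismatches(roots_by_node: Dict[str, Dict[int, str]]) -> List[int]:
    """Return the epochs at which the nodes do not all report the same state root.

    An epoch that is missing on one node but present on another counts as a
    mismatch, since the state at that epoch was not verified by every node.
    """
    epochs = set()
    for roots in roots_by_node.values():
        epochs.update(roots)

    mismatched = []
    for epoch in sorted(epochs):
        seen: set[Optional[str]] = {roots.get(epoch) for roots in roots_by_node.values()}
        if len(seen) > 1:
            mismatched.append(epoch)
    return mismatched


def safe_to_publish(roots_by_node: Dict[str, Dict[int, str]]) -> bool:
    """Allow publishing only if at least two nodes were compared and all agree."""
    return len(roots_by_node) >= 2 and not find_state_mismatches(roots_by_node)
```

For example, with placeholder roots:

```python
reported = {
    "lotus": {100: "root-a", 101: "root-b"},
    "forest": {100: "root-a", 101: "root-x"},
}
mismatches = find_state_mismatches(reported)  # epoch 101 disagrees
publish = safe_to_publish(reported)           # False, so hold the snapshot back
```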

luckyparadise commented 7 months ago

Can I close this issue @rjan90 ? It appears we have a consensus on this matter now.