Closed: seungkyoon closed this pull request 9 months ago
This pull request was exported from Phabricator. Differential Revision: D49541229
This pull request has been merged in facebookresearch/d2go@279185539d40cb4847d7094c12d871f98147c9c0.
Summary: Add barriers around FSDP checkpointing so that other ranks do not continue training while rank 0 is still checkpointing.
Also add a log message after the checkpoint finishes.
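The barrier pattern described in the summary can be sketched as follows. This is a minimal illustration, not the code from this change: `checkpoint_with_barriers`, `simulate`, `save_fn`, and `events` are hypothetical names, and ranks are simulated with threads so the sketch runs without a GPU or a process group. In a real distributed job the `barrier` callable would be a collective such as `torch.distributed.barrier`.

```python
import logging
import threading
import time

logger = logging.getLogger(__name__)


def checkpoint_with_barriers(rank, barrier, save_fn, events):
    """Hypothetical helper: save on rank 0 with a barrier before and after."""
    barrier()  # every rank reaches the checkpoint step before saving starts
    if rank == 0:
        save_fn()
        logger.info("Finished saving checkpoint")  # log once the save is done
    barrier()  # no rank resumes training until rank 0 has finished saving
    events.append(("resume", rank))


def simulate(world_size=4):
    """Simulate ranks with threads; threading.Barrier stands in for a collective barrier."""
    sync = threading.Barrier(world_size)  # reusable across both barrier calls
    events = []  # list.append is thread-safe in CPython

    def slow_save():
        time.sleep(0.05)  # pretend writing the checkpoint takes a while
        events.append(("saved", 0))

    threads = [
        threading.Thread(
            target=checkpoint_with_barriers,
            args=(rank, sync.wait, slow_save, events),
        )
        for rank in range(world_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    print(simulate())
```

Without the second barrier, the non-zero ranks could record "resume" before rank 0 records "saved", which is the race the summary describes.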
Differential Revision: D49541229