Closed fegin closed 1 month ago
Stack from ghstack (oldest at bottom):
Summary: This PR implements two approaches to async checkpointing. The first uses DCP.async_save; the second uses pinned memory plus a separate process to avoid GIL issues.
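For readers unfamiliar with the second approach, here is a minimal sketch of the "stage, then write in a separate process" pattern. It is not the code from this PR: it uses only the Python standard library, `pickle.dumps` stands in for the copy into pinned memory, and the names `BackgroundCheckpointer` and `_writer_loop` are hypothetical. It also assumes a POSIX system where the `fork` start method is available.

```python
import multiprocessing as mp
import os
import pickle
import tempfile


def _writer_loop(queue):
    """Runs in the child process: persist each staged snapshot, stop on None."""
    while True:
        item = queue.get()
        if item is None:  # sentinel: no more checkpoints
            break
        path, payload = item
        tmp_path = path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(payload)
        # Rename into place so a reader never sees a half-written file.
        os.replace(tmp_path, path)


class BackgroundCheckpointer:
    """Hypothetical helper: the trainer only pays for staging, not for disk I/O."""

    def __init__(self):
        ctx = mp.get_context("fork")  # assumption: POSIX, fork is available
        self._queue = ctx.Queue()
        self._proc = ctx.Process(
            target=_writer_loop, args=(self._queue,), daemon=True
        )
        self._proc.start()

    def save(self, state, path):
        # Staging step: snapshot the state now, so later mutations by the
        # training loop do not leak into the checkpoint being written.
        # (In the PR's setting this role is played by a pinned-memory copy.)
        payload = pickle.dumps(state)
        self._queue.put((path, payload))

    def close(self):
        # Ask the writer to finish queued work and exit, then wait for it.
        self._queue.put(None)
        self._proc.join()


def load_checkpoint(path):
    with open(path, "rb") as f:
        return pickle.load(f)
```

Because the write happens in another process, it does not compete with the training loop for the GIL; the trade-off is the extra staging copy made in `save`.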
It would be good to add an integration test for async checkpointing. cc: @fegin