This RFC will be open for comment until Monday, February 5th, 2024. cc @k-w-w @petrychenko
Checkpoint Sharding Callback
Objective
I am proposing a new callback mechanism that gives users more control over how checkpoints are sharded. The purpose of this RFC is to publicize the design of this new checkpointing feature, which is already implemented (see tensorflow/python/checkpoint/sharding) but remains open to comments and changes from the open source community.