https://github.com/NVIDIA/apex/pull/1704 introduced a bug where the distributed optimizer fails when loading old checkpoints in the deprecated v1 format. This PR records the checkpoint format in the checkpoint itself. If the distributed optimizer can't find the format when loading a checkpoint, it falls back to v1. This should help with backwards compatibility if we change the format again in the future.
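The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual Apex code: the key name `FORMAT_KEY`, the version constants, and the helpers `make_checkpoint` and `detect_format` are all hypothetical names chosen for this example.

```python
from typing import Any, Dict

# Hypothetical key and version numbers, for illustration only.
# The real optimizer may use different names and values.
FORMAT_KEY = "format"
LEGACY_FORMAT = 1   # checkpoints written before the format was recorded
CURRENT_FORMAT = 2


def make_checkpoint(state: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the state that records the format it was written in."""
    checkpoint = dict(state)
    checkpoint[FORMAT_KEY] = CURRENT_FORMAT
    return checkpoint


def detect_format(checkpoint: Dict[str, Any]) -> int:
    """Read the recorded format, falling back to v1 when the key is absent."""
    return checkpoint.get(FORMAT_KEY, LEGACY_FORMAT)


# An old checkpoint has no format entry, so it is treated as v1.
old_checkpoint = {"step": 3}
new_checkpoint = make_checkpoint({"step": 3})
print(detect_format(old_checkpoint), detect_format(new_checkpoint))
```

Because the loader branches on the detected value rather than on the shape of the checkpoint, a later format change would only need a new constant and a new branch.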