Hi, I changed the checkpoint-saving behavior to call the `save_model` method of the official Trainer from hf-transformers. This method handles the different training frameworks, including DeepSpeed ZeRO-3, where the model is saved either as a pytorch.bin model for DeepSpeed (in older versions of transformers) or as a collected state_dict (in newer versions of transformers).
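For reference, here is a minimal sketch of what the change amounts to. It is not the exact code from this PR: the helper name `save_checkpoint` and the `output_dir` argument are hypothetical, and `trainer` is assumed to be an already constructed `transformers.Trainer` (or anything exposing the same `save_model(output_dir)` method).

```python
import os


def save_checkpoint(trainer, output_dir):
    """Save the model via Trainer.save_model instead of a manual torch.save.

    `save_checkpoint` is a hypothetical helper used for illustration only;
    `trainer` is assumed to be a transformers.Trainer instance.
    """
    # Make sure the target directory exists before delegating to the Trainer.
    os.makedirs(output_dir, exist_ok=True)
    # Trainer.save_model picks the saving strategy for the active backend
    # (plain PyTorch, DeepSpeed ZeRO-3, ...), so no backend-specific
    # branching is needed here.
    trainer.save_model(output_dir)
    return output_dir
```

With DeepSpeed ZeRO-3 the weights are sharded across processes, so this helper should probably be called on every rank rather than only on the main process, leaving it to the Trainer to decide which rank writes the files.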