Closed ITerydh closed 3 weeks ago
```python
def _save_checkpoint(self, model, trial, metrics=None):
    # In all cases, including ddp/dp/deepspeed, self.model is always a reference to the model we
    # want to save except FullyShardedDDP.
    # assert unwrap_model(model) is self.model, "internal model should be a reference to self.model"

    # Save model checkpoint
    checkpoint_folder = f"{PREFIX_CHECKPOINT_DIR}-{self.state.global_step}"

    if self.hp_search_backend is None and trial is None:
        self.store_flos()

    run_dir = self._get_output_dir(trial=trial)
    output_dir = os.path.join(run_dir, checkpoint_folder)
    self.save_model(output_dir, _internal_call=True)

    if not self.args.save_only_model:
        # Save optimizer and scheduler
        self._save_optimizer_and_scheduler(output_dir)
        # Save RNG state
        self._save_rng_state(output_dir)
```
This is from `trainer.py`, so you could have a look at `self._save_optimizer_and_scheduler(output_dir)` and `self._save_rng_state(output_dir)`.
Where is this `trainer.py`? I didn't find it, thanks!
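If `transformers` is installed, you can locate the source file of any of its modules programmatically instead of searching the repo. A minimal sketch using only the standard library (`module_path` is a hypothetical helper name; the stdlib `json` module is used here just as a demonstration target — with `transformers` installed, the same call works for `"transformers.trainer"`):

```python
import importlib.util

def module_path(name: str) -> str:
    """Return the filesystem path of an importable module's source file."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(name)
    return spec.origin

# Works for any installed module. With transformers installed,
# module_path("transformers.trainer") points at trainer.py on disk.
print(module_path("json"))
```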
System Info

Information

Tasks

- `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)

Reproduction
I am using DeepSpeed ZeRO-2. I want to save both the model state and the optimizer state, but `save_pretrained()` only supports saving the model state. How can I save the optimizer state?

Expected behavior
I would like to know whether saving the optimizer state is supported and, if so, how to do it.
THANKS!
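For a plain (non-sharded) setup, the optimizer state can be saved manually with `torch.save(optimizer.state_dict(), ...)` next to the model weights, which is essentially what the Trainer's checkpointing does. Note this sketch does not apply as-is under ZeRO-2, where the optimizer state is partitioned across ranks and the DeepSpeed engine's own checkpointing should be used instead. A minimal sketch with a tiny throwaway model:

```python
import os
import tempfile

import torch

# Hypothetical tiny model, just so the optimizer has state worth saving.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# One training step so per-parameter state (exp_avg, step, ...) exists.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()

# Save the optimizer state alongside the model weights.
ckpt_dir = tempfile.mkdtemp()
path = os.path.join(ckpt_dir, "optimizer.pt")
torch.save(optimizer.state_dict(), path)

# On resume: rebuild an identical optimizer, then restore its state.
new_optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
new_optimizer.load_state_dict(torch.load(path))
```

With DeepSpeed, `engine.save_checkpoint(save_dir)` gathers and writes the partitioned optimizer state for you, which is why going through the Trainer's checkpointing (or the engine directly) is preferable to `save_pretrained()` in that setup.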