EricFillion / happy-transformer

Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.
http://happytransformer.com
Apache License 2.0

Saving checkpoints #306

Closed dennymarcels closed 1 year ago

dennymarcels commented 1 year ago

I was wondering if I could easily save persistent periodic checkpoints while training, and also resume training from those checkpoints in case my environment crashes.

EricFillion commented 1 year ago

Thanks for the suggestion Denny. I've considered adding this before. It'll involve passing a path to an output directory to TrainingArguments/Seq2SeqTrainingArguments instead of using a temp directory.
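
A rough sketch of the idea, assuming Hugging Face's standard TrainingArguments parameters (the directory name and step count below are placeholders, not Happy Transformer's API):

from transformers import TrainingArguments

# Pass a user-supplied output directory (instead of a temporary one) so the
# Trainer writes periodic checkpoints there.
training_args = TrainingArguments(
    output_dir="my_checkpoints",  # user-provided path rather than a temp dir
    save_steps=500,               # write a checkpoint every 500 optimizer steps
)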

EricFillion commented 1 year ago

Now supported with version 3.0.0.

Pass a float between 0 and 1 to your TrainArgs's save_steps parameter to specify the fraction of total training steps between checkpoint saves. For example, if set to 0.1, saving will occur ten times over the entire training process. By default, save_steps is set to 0 and no saving is performed. The checkpoints are saved to the directory given by TrainArgs's output_dir parameter, which defaults to "happy_transformer/".

from happytransformer import GENTrainArgs

# Save a checkpoint every 10% of the total training steps (ten saves in total)
train_args = GENTrainArgs(save_steps=0.1)
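
For context, a minimal end-to-end sketch (the training file name is a placeholder, and I'm assuming HappyGeneration's train method here):

from happytransformer import HappyGeneration, GENTrainArgs

# Checkpoints are written to output_dir roughly every 10% of the total steps.
happy_gen = HappyGeneration()
train_args = GENTrainArgs(save_steps=0.1, output_dir="happy_transformer/")
happy_gen.train("train.txt", args=train_args)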