huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
MIT License
3.54k stars 280 forks

[training] add feature to save best models #141

Closed eustlb closed 2 months ago

eustlb commented 3 months ago

This PR adds the capability to save the best models during training.

A best model is determined by the WER (Word Error Rate) computed on the validation sets, i.e. the datasets passed with the --eval_dataset_name flag. The --save_best_total_limit flag specifies the number of best models to keep (defaults to 1). Each best model is saved as a checkpoint, so training can be resumed from it. The saved models and the checkpoints used to resume training are named in the format checkpoint-2-epoch-0-val-wer-712.132, with 712.132 being the WER.
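The retention rule described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `BestCheckpointTracker` class and `checkpoint_name` helper are hypothetical names, and reading the first number in the checkpoint name as the training step is an assumption based on the example name above.

```python
from dataclasses import dataclass, field


def checkpoint_name(step: int, epoch: int, wer: float) -> str:
    # Assumption: the first number in the name is the training step.
    return f"checkpoint-{step}-epoch-{epoch}-val-wer-{wer:.3f}"


@dataclass
class BestCheckpointTracker:
    """Hypothetical helper that keeps the N lowest-WER checkpoints."""

    save_best_total_limit: int = 1
    kept: list = field(default_factory=list)  # (wer, name), lowest WER first

    def update(self, step: int, epoch: int, wer: float):
        """Register an evaluation result.

        Returns (should_save, names_to_delete): whether the new checkpoint
        ranks among the best, and which older checkpoints it displaces.
        """
        name = checkpoint_name(step, epoch, wer)
        candidates = sorted(self.kept + [(wer, name)])
        self.kept = candidates[: self.save_best_total_limit]
        evicted = [n for _, n in candidates[self.save_best_total_limit:]]
        should_save = name not in evicted
        to_delete = [n for n in evicted if n != name]
        return should_save, to_delete

    @property
    def best(self):
        # Lower WER is better, so the first entry is the best model.
        return self.kept[0][1] if self.kept else None
```

With a limit of 2, a third checkpoint that beats one of the two kept ones would be saved and the worst kept checkpoint would be marked for deletion; a checkpoint worse than both would simply not be saved.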

The model saved under the base directory (i.e. the one set with the --output_dir flag) is the best one. The weights of the model at the end of training are saved under an end-of-training-weights directory.

egorsmkv commented 2 months ago

Nice PR, thx