facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

Understanding the result of Wav2vec pretraining and finetuning #3646

Closed edosyhptra closed 3 years ago

edosyhptra commented 3 years ago

❓ Questions and Help

I have pretrained a wav2vec2 base model on my own dataset in Bahasa Indonesia (approximately 300 hours) and reached 56% valid accuracy. I am now fine-tuning from that pretrained model on a roughly 50-hour dataset and currently get valid_wer: 8.736 at epoch 37.

What is your question?

I'm not sure whether my pretrained model is overfitting. I have searched for an explanation of loss_0, loss_1, code_perplexity, etc. to understand how well my pretrained model has learned.

The pretraining and fine-tuning logs are given below.

Pretraining log


[2021-06-21 23:36:35,217][train_inner][INFO] - {"epoch": 20, "update": 19.965, "loss": "2.551", "ntokens": "23692.9", "nsentences": "144.97", "prob_perplexity": "277.805", "code_perplexity": "223.408", "temp": "1.7", "loss_0": "2.449", "loss_1": "0.081", "loss_2": "0.022", "accuracy": "0.51487", "wps": "1905.2", "ups": "0.08", "wpb": "23692.9", "bsz": "145", "num_updates": "32600", "lr": "0.000485696", "gnorm": "0.552", "loss_scale": "0.0312", "train_wall": "1225", "wall": "486378"}
[2021-06-21 23:48:16,337][fairseq_cli.train][INFO] - begin validation on "valid" subset
[2021-06-21 23:54:58,480][valid][INFO] - {"epoch": 20, "valid_loss": "2.302", "valid_ntokens": "369.888", "valid_nsentences": "2.25888", "valid_prob_perplexity": "279.808", "valid_code_perplexity": "222.094", "valid_temp": "1.699", "valid_loss_0": "2.201", "valid_loss_1": "0.08", "valid_loss_2": "0.02", "valid_accuracy": "0.56713", "valid_wps": "5024.8", "valid_wpb": "369.9", "valid_bsz": "2.3", "valid_num_updates": "32657", "valid_best_loss": "2.302"}
[2021-06-21 23:54:58,481][fairseq_cli.train][INFO] - begin save checkpoint
[2021-06-21 23:54:58,482][fairseq.trainer][INFO] - Preparing to save checkpoint to checkpoints/checkpoint_best.pt after 32657 updates
[2021-06-21 23:55:07,606][fairseq.trainer][INFO] - Finished saving checkpoint to checkpoints/checkpoint_best.pt
[2021-06-21 23:55:15,579][fairseq.checkpoint_utils][INFO] - saved checkpoint checkpoints/checkpoint_best.pt (epoch 20 @ 32657 updates, score 2.302) (writing took 17.096878033014946 seconds)
[2021-06-21 23:55:15,579][fairseq_cli.train][INFO] - end of epoch 20 (average epoch stats below)
[2021-06-21 23:55:15,582][train][INFO] - {"epoch": 20, "train_loss": "2.562", "train_ntokens": "23683.6", "train_nsentences": "144.372", "train_prob_perplexity": "277.481", "train_code_perplexity": "222.651", "train_temp": "1.706", "train_loss_0": "2.46", "train_loss_1": "0.081", "train_loss_2": "0.021", "train_accuracy": "0.51284", "train_wps": "1857.6", "train_ups": "0.08", "train_wpb": "23683.6", "train_bsz": "144.4", "train_num_updates": "32657", "train_lr": "0.00048566", "train_gnorm": "0.579", "train_loss_scale": "0.0625", "train_train_wall": "20095", "train_wall": "487498"}

Fine-tuning log


[2021-06-25 06:40:50,502][fairseq.trainer][INFO] - Preparing to save checkpoint to checkpoints/checkpoint_best.pt after 16186 updates
[2021-06-25 06:40:58,382][fairseq.trainer][INFO] - Finished saving checkpoint to checkpoints/checkpoint_best.pt
[2021-06-25 06:41:04,896][fairseq.checkpoint_utils][INFO] - saved checkpoint checkpoints/checkpoint_best.pt (epoch 37 @ 16186 updates, score 8.736) (writing took 14.396043311106041 seconds)
[2021-06-25 06:41:04,897][fairseq_cli.train][INFO] - end of epoch 37 (average epoch stats below)
[2021-06-25 06:41:04,900][train][INFO] - {"epoch": 37, "train_loss": "11.348", "train_ntokens": "3681.88", "train_nsentences": "140.872", "train_nll_loss": "0.434", "train_wps": "958.6", "train_ups": "0.26", "train_wpb": "3681.9", "train_bsz": "140.9", "train_num_updates": "16186", "train_lr": "3e-05", "train_gnorm": "26.309", "train_loss_scale": "16", "train_train_wall": "1571", "train_wall": "60749"}
[2021-06-25 06:41:04,920][fairseq.trainer][INFO] - begin training epoch 38
[2021-06-25 06:41:53,525][train_inner][INFO] - {"epoch": 38, "update": 37.032, "loss": "11.226", "ntokens": "3679.84", "nsentences": "140.815", "nll_loss": "0.43", "wps": "916", "ups": "0.25", "wpb": "3679.8", "bsz": "140.8", "num_updates": "16200", "lr": "3e-05", "gnorm": "26.197", "loss_scale": "16", "train_wall": "718", "wall": "60797"}
[2021-06-25 06:53:59,279][train_inner][INFO] - {"epoch": 38, "update": 37.489, "loss": "11.188", "ntokens": "3680.16", "nsentences": "140.615", "nll_loss": "0.427", "wps": "1014.2", "ups": "0.28", "wpb": "3680.2", "bsz": "140.6", "num_updates": "16400", "lr": "3e-05", "gnorm": "27.19", "loss_scale": "16", "train_wall": "709", "wall": "61523"}
[2021-06-25 07:06:46,663][train_inner][INFO] - {"epoch": 38, "update": 37.945, "loss": "10.967", "ntokens": "3683.52", "nsentences": "141.005", "nll_loss": "0.42", "wps": "960", "ups": "0.26", "wpb": "3683.5", "bsz": "141", "num_updates": "16600", "lr": "3e-05", "gnorm": "25.193", "loss_scale": "32", "train_wall": "753", "wall": "62291"}
[2021-06-25 07:08:17,701][fairseq_cli.train][INFO] - begin validation on "valid" subset
[2021-06-25 07:09:05,057][valid][INFO] - {"epoch": 38, "valid_loss": "6.43", "valid_ntokens": "155.374", "valid_nsentences": "5.84173", "valid_nll_loss": "0.242", "valid_uer": "2.769", "valid_wer": "8.314", "valid_raw_wer": "8.314", "valid_wps": "1830.8", "valid_wpb": "155.4", "valid_bsz": "5.8", "valid_num_updates": "16624", "valid_best_wer": "8.314"}
[2021-06-25 07:09:05,058][fairseq_cli.train][INFO] - begin save checkpoint
[2021-06-25 07:09:05,060][fairseq.trainer][INFO] - Preparing to save checkpoint to checkpoints/checkpoint_best.pt after 16624 updates
[2021-06-25 07:09:12,941][fairseq.trainer][INFO] - Finished saving checkpoint to checkpoints/checkpoint_best.pt
[2021-06-25 07:09:19,597][fairseq.checkpoint_utils][INFO] - saved checkpoint checkpoints/checkpoint_best.pt (epoch 38 @ 16624 updates, score 8.314) (writing took 14.53786907484755 seconds)
[2021-06-25 07:09:19,598][fairseq_cli.train][INFO] - end of epoch 38 (average epoch stats below)
[2021-06-25 07:09:19,601][train][INFO] - {"epoch": 38, "train_loss": "11.082", "train_ntokens": "3681.88", "train_nsentences": "140.872", "train_nll_loss": "0.424", "train_wps": "951.6", "train_ups": "0.26", "train_wpb": "3681.9", "train_bsz": "140.9", "train_num_updates": "16624", "train_lr": "3e-05", "train_gnorm": "26.014", "train_loss_scale": "32", "train_train_wall": "1597", "train_wall": "62444"}
What have you tried?

What's your environment?

- fairseq Version (e.g., 1.0 or master):
- PyTorch Version (e.g., 1.0): 1.9.0+cu102
- OS (e.g., Linux): Ubuntu 18.04 Server
- How you installed fairseq (`pip`, source): pip3 install --editable ./
- Build command you used (if compiling from source):
- Python version: 3.6.9
- CUDA/cuDNN version: 10.2
- GPU models and configuration: GeForce GTX 1080 Ti
medabalimi commented 3 years ago
  • If valid_wer is, for example, "8.314", does that mean 8% WER or 80%?

8% WER, not 80%. Using an LM should reduce the WER a bit.
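
For context, fairseq logs WER on a 0–100 scale (i.e. as a percentage), so 8.314 means roughly 8.3%. Below is a minimal sketch of that calculation, using the editdistance package and made-up sentences for illustration, not fairseq's actual code:

```python
import editdistance


def wer_percent(hyp, ref):
    """Word error rate on a 0-100 scale, i.e. how fairseq's valid_wer is reported."""
    hyp_words, ref_words = hyp.split(), ref.split()
    errors = editdistance.eval(hyp_words, ref_words)  # substitutions + insertions + deletions
    return 100.0 * errors / max(len(ref_words), 1)


# One wrong word out of twelve -> about 8.3, i.e. ~8.3% WER (not 83%).
print(wer_percent(
    "saya mau makan nasi goreng di warung dekat rumah pada malam ini",
    "saya mau makan nasi goreng di warung dekat rumah pada malam itu",
))
```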

medabalimi commented 3 years ago
  • Given valid_wer and valid_raw_wer ("valid_wer": "8.314", "valid_raw_wer": "8.314"), the two are identical at every validation step. Is that because I am not using LM decoding in the args?

Yes. raw_wer is the CTC WER before LM decoding; if you do not use an LM, wer = raw_wer.
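
To make that distinction concrete: raw_wer is scored against the greedy (per-frame argmax plus CTC collapse) transcription, while wer is scored against whatever the decoder emits, so without an LM decoder the two hypotheses are the same string. A hand-rolled sketch of the greedy collapse (an illustration, not fairseq's code; the blank index of 0 and the flag name in the comment are assumptions based on the wav2vec 2.0 README's infer.py example and may differ across versions):

```python
from typing import List

import torch


def ctc_greedy_collapse(logits: torch.Tensor, blank: int = 0) -> List[int]:
    """Greedy CTC decoding: take the argmax per frame, merge repeats, drop blanks.
    This is (roughly) the hypothesis that raw_wer is computed from."""
    best_path = logits.argmax(dim=-1).tolist()  # frame-wise best token ids, shape (T,)
    tokens, prev = [], None
    for tok in best_path:
        if tok != prev and tok != blank:
            tokens.append(tok)
        prev = tok
    return tokens


# Without an external LM decoder (e.g. --w2l-decoder kenlm in infer.py), the final
# hypothesis is just this greedy path, so valid_wer == valid_raw_wer; with a KenLM
# decoder, valid_wer is scored on the LM-rescored hypothesis and usually drops
# below valid_raw_wer.
```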

edosyhptra commented 3 years ago
  • If valid_wer is, for example, "8.314", does that mean 8% WER or 80%?

8% WER, not 80%. Using an LM should reduce the WER a bit.

I see. I will try it!

Thanks for the answer! Really appreciate it.