mailong25 / self-supervised-speech-recognition

speech to text with self-supervised learning based on wav2vec 2.0 framework

Validate step appears in the example log file but not when I'm running it #31

Closed: TaridaGeorge closed this issue 3 years ago

TaridaGeorge commented 3 years ago

In the examples folder there is a file called hydra_train_finetune.log. In that file I can see a validation step at the end of each epoch, just before the checkpoint is saved:

[2020-12-19 17:41:16,018][fairseq_cli.train][INFO] - begin validation on "valid" subset
[2020-12-19 17:41:22,153][valid][INFO] - {"epoch": 1, "valid_loss": "729.494", "valid_ntokens": "2521.93", "valid_nsentences": "49.1429", "valid_nll_loss": "14.215", "valid_uer": "123.012", "valid_wer": "100.393", "valid_raw_wer": "100.393", "valid_wps": "6759", "valid_wpb": "2521.9", "valid_bsz": "49.1", "valid_num_updates": "85"}
[2020-12-19 17:41:22,155][fairseq_cli.train][INFO] - begin save checkpoint
[2020-12-19 17:41:22,156][fairseq.trainer][INFO] - Preparing to save checkpoint to checkpoints/checkpoint_best.pt after 85 updates
[2020-12-19 17:41:24,604][fairseq.trainer][INFO] - Finished saving checkpoint to checkpoints/checkpoint_best.pt
[2020-12-19 17:41:25,543][fairseq.checkpoint_utils][INFO] - saved checkpoint checkpoints/checkpoint_best.pt (epoch 1 @ 85 updates, score 100.393) (writing took 3.387641379999991 seconds)
[2020-12-19 17:41:25,543][fairseq_cli.train][INFO] - end of epoch 1 (average epoch stats below)
[2020-12-19 17:41:25,566][train][INFO] - {"epoch": 1, "train_loss": "887.379", "train_ntokens": "49633.1",

Which config options should I look at to get that validation step to run for my dataset? I'm running the finetune.py script with the config base_100h.yaml.

mailong25 commented 3 years ago

It should validate on the dev set after each epoch. If that does not happen, try decreasing dataset.validate_after_updates in the config file.
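To illustrate why lowering that value can matter, here is a simplified Python model of an end-of-epoch validation gate. It is a sketch, not fairseq's actual code: the function name is hypothetical, the gating rule (skip validation until the update count reaches validate_after_updates, then validate every validate_interval epochs) is an assumption about how the options interact, and the threshold of 10000 is an illustrative value rather than one read from base_100h.yaml. The figure of 85 updates per epoch comes from the example log above.

```python
def should_validate_at_epoch_end(epoch, num_updates,
                                 validate_interval=1,
                                 validate_after_updates=0,
                                 disable_validation=False):
    """Simplified, assumed model of the end-of-epoch validation gate.

    Not fairseq's implementation: it only mirrors the idea that
    validation is skipped until enough updates have been performed.
    """
    if disable_validation:
        return False
    # Assumed rule: no validation before this many optimizer updates.
    if num_updates < validate_after_updates:
        return False
    # Assumed rule: validate only on every validate_interval-th epoch.
    return epoch % validate_interval == 0


if __name__ == "__main__":
    updates_per_epoch = 85  # taken from the example log (epoch 1 @ 85 updates)
    for threshold in (0, 10000):  # 10000 is an illustrative value
        first = next(
            epoch for epoch in range(1, 1000)
            if should_validate_at_epoch_end(
                epoch, epoch * updates_per_epoch,
                validate_after_updates=threshold)
        )
        print(f"validate_after_updates={threshold}: first validation at epoch {first}")
```

Under these assumptions, a small dataset with few updates per epoch can train for many epochs before the first validation is logged, which would match the symptom described in this issue.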

TaridaGeorge commented 3 years ago

After some modifications I got it to work. I cannot tell exactly what was wrong, but my guess is that the manifest files were created incorrectly.
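For anyone hitting the same symptom, a quick sanity check of the manifest files may help. The sketch below assumes the usual wav2vec-style .tsv layout (first line is the audio root directory, every following line is a relative path and a frame count separated by a tab). The thread does not say what was actually wrong with the manifests, so this is only a hypothetical checker; the function name and the example paths are made up for illustration.

```python
def check_manifest_lines(lines):
    """Return a list of problems found in a wav2vec-style .tsv manifest.

    Assumed layout: line 1 is the audio root directory, each later line
    is 'relative_path<TAB>num_frames'.
    """
    if not lines:
        return ["manifest is empty"]

    problems = []
    first = lines[0].rstrip("\n")
    if not first.strip() or "\t" in first:
        problems.append("line 1: expected the audio root directory on its own")
    if len(lines) < 2:
        problems.append("manifest lists no audio files")

    for number, line in enumerate(lines[1:], start=2):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            problems.append(f"line {number}: expected 'relative_path<TAB>num_frames'")
            continue
        path, frames = fields
        if not path.strip():
            problems.append(f"line {number}: empty audio path")
        try:
            frames_ok = int(frames) > 0
        except ValueError:
            frames_ok = False
        if not frames_ok:
            problems.append(f"line {number}: frame count {frames!r} is not a positive integer")
    return problems


if __name__ == "__main__":
    # Hypothetical manifest content, for illustration only.
    example = ["/data/audio\n", "clip_a.wav\t16000\n", "clip_b.wav 32000\n"]
    for problem in check_manifest_lines(example):
        print(problem)
```

To check a real file, read it with `open(path, encoding="utf-8").readlines()` and pass the result to the function. Running it on both the training and the validation manifest is worthwhile, since a malformed validation manifest is one plausible cause of a missing validation step.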