facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

how to choose HuBERT pretrained checkpoint #4907

Open · Bang-sheng-Zhuo opened this issue 1 year ago

Bang-sheng-Zhuo commented 1 year ago

What is your question?

I am trying to reproduce the results with the 100h finetune data, but my final results are not very good: WER on test-clean/test-other = 8.2300 / 17.3509.

Training details

First iteration

100 clusters on MFCC features, following the hubert_base_librispeech.yaml config, trained with 6 GPUs.

Training loss and correct rate:

250k steps: loss_m_0=3.329, correct_m_0=0.364995, correct_u_0=0.0493537
400k steps: loss_m_0=3.144, correct_m_0=0.38992, correct_u_0=0.0888545

The 400k-step result is clearly better than the 250k-step one. Should I pick the 400k-step checkpoint, or keep training for more steps?
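For reference, the first-iteration targets are just frame-level k-means labels computed on MFCC features. Below is a minimal, self-contained sketch of that idea; it is illustrative only (fairseq ships its own sharded k-means scripts under examples/hubert), and the audio path is a placeholder:

```python
# Illustrative sketch of first-iteration HuBERT targets:
# 100 k-means clusters over frame-level MFCC features.
import torchaudio
from sklearn.cluster import MiniBatchKMeans

def mfcc_features(wav_path):
    wav, sr = torchaudio.load(wav_path)              # 16 kHz mono LibriSpeech audio
    mfcc = torchaudio.transforms.MFCC(
        sample_rate=sr, n_mfcc=13,
        melkwargs={"n_fft": 400, "hop_length": 160, "n_mels": 23},
    )(wav)                                            # (1, 13, T) at a 10 ms hop
    return mfcc.squeeze(0).transpose(0, 1).numpy()    # (T, 13)

feats = mfcc_features("train-clean-100/sample.flac")  # placeholder path

# Fit 100 clusters (in practice, on features pooled from many utterances) ...
km = MiniBatchKMeans(n_clusters=100, batch_size=10000)
km.fit(feats)

# ... then dump one integer label per frame as the "km" targets for pretraining.
labels = km.predict(feats)
print(" ".join(map(str, labels)))
```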

Second iteration

500 clusters on 6th-layer transformer features (using the 250k-step checkpoint, as described in the paper), trained with 8 GPUs.

Training loss and correct rate:

250k steps: loss_m_0=3.712, correct_m_0=0.382826, correct_u_0=0.635071
400k steps: loss_m_0=3.422, correct_m_0=0.417456, correct_u_0=0.662054

Is loss_m_0=3.422 good enough?
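For the second iteration, the only new moving part is extracting intermediate transformer features from the first-iteration checkpoint before re-clustering into 500 units. A rough sketch of that step is below; the checkpoint/audio paths are placeholders, and the exact loading call may need task overrides depending on the fairseq version:

```python
# Rough sketch: dump 6th-layer transformer features from the first-iteration
# checkpoint, to be clustered into 500 units (same k-means recipe as above).
import torch
import torchaudio
from fairseq import checkpoint_utils

ckpt = "hubert_iter1_250k.pt"                          # placeholder checkpoint path
models, cfg, task = checkpoint_utils.load_model_ensemble_and_task([ckpt])
model = models[0].eval()

wav, sr = torchaudio.load("train-clean-100/sample.flac")  # placeholder, 16 kHz mono
with torch.no_grad():
    # mask=False keeps the features deterministic; output_layer=6 returns the
    # activations after the 6th transformer block.
    feats, _ = model.extract_features(
        source=wav, padding_mask=None, mask=False, output_layer=6
    )
print(feats.shape)  # roughly (1, T_frames, 768) for the base model
```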

Finetune

Followed base_10h.yaml, trained with 8 GPUs, stopped at 216,000 updates (epoch 600).

Decode

Viterbi decoding, using the 100k-step (epoch 280) checkpoint; WER on test-clean/test-other = 8.2300 / 17.3509.
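As a side note on the decoding step: viterbi decoding without a language model is essentially best-path CTC decoding (per-frame argmax, collapse repeats, drop blanks). A minimal sketch with made-up emissions, independent of fairseq:

```python
import torch

def ctc_best_path(emissions: torch.Tensor, blank: int = 0) -> list:
    """Greedy (best-path) CTC decoding: take the per-frame argmax,
    collapse consecutive repeats, then drop blank tokens.
    emissions: (T, vocab) logits or log-probs."""
    decoded, prev = [], None
    for tok in emissions.argmax(dim=-1).tolist():
        if tok != prev and tok != blank:
            decoded.append(tok)
        prev = tok
    return decoded

# Tiny example with 3 classes (index 0 = blank):
emissions = torch.tensor([
    [0.9, 0.05, 0.05],   # blank
    [0.1, 0.80, 0.10],   # token 1
    [0.1, 0.85, 0.05],   # token 1 again (collapsed)
    [0.8, 0.10, 0.10],   # blank
    [0.1, 0.10, 0.80],   # token 2
])
print(ctc_best_path(emissions))  # -> [1, 2]
```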

What have you tried?

What's your environment?

yangsuxia commented 1 year ago

Hi, what was the final loss value in your finetune (10h data)? Looking forward to your reply, thank you.