k2-fsa / snowfall

Moved to https://github.com/k2-fsa/icefall
Apache License 2.0

Add lm_scale to LM rescoring. #212

Closed: csukuangfj closed this 3 years ago

csukuangfj commented 3 years ago

Closes #211

@danpovey Please try this pull-request.

It requires a pull request I just created in k2 that clears the cache of an FSA when its scores are reassigned.
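To illustrate why the cache has to be cleared, here is a minimal sketch of the general pattern. `ToyFsa`, `best_score`, and the cache layout are hypothetical names made up for this example; this is not k2's actual class or implementation.

```python
class ToyFsa:
    """Hypothetical stand-in for an FSA object; NOT k2's actual class."""

    def __init__(self, scores):
        self._scores = list(scores)
        self._cache = {}  # lazily computed quantities derived from the scores

    @property
    def scores(self):
        return self._scores

    @scores.setter
    def scores(self, new_scores):
        self._scores = list(new_scores)
        # Anything cached was derived from the old scores, so drop it.
        self._cache.clear()

    def best_score(self):
        # Computed on first use, then served from the cache.
        if "best" not in self._cache:
            self._cache["best"] = max(self._scores)
        return self._cache["best"]


fsa = ToyFsa([-1.0, -0.5, -2.0])
first = fsa.best_score()         # computed from the original scores
fsa.scores = [-3.0, -4.0, -2.5]  # reassigning scores invalidates the cache
second = fsa.best_score()        # recomputed from the new scores
```

Without the `clear()` call in the setter, the second query would still return the value computed from the old scores.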

danpovey commented 3 years ago

Thanks! You can merge when you think it's ready. Then Desh can have another look at position-dependent phones with this in place. BTW, in graph.py, when we do HLG.lm_scores = HLG.scores.clone(), this might not be 100% correct, because at that point the scores also contain the silence probabilities; we should perhaps be doing this at the G.fst level. Of course, this depends on our precise rescoring method, but if we subtract the old lm_scores and compose with an n-gram FST, this would wrongly remove the silence probs.
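The concern can be checked with a toy calculation. This is only a sketch with made-up per-arc log-scores for a single path; the variable names and the `rescore` helper are hypothetical and do not come from graph.py. It assumes rescoring subtracts the saved lm_scores and adds the new LM's scores.

```python
# Made-up per-arc log-scores for one path (illustrative numbers only).
am = [-1.2, -0.7, -2.0]      # acoustic scores
old_lm = [-0.5, -1.0, -0.3]  # scores from the original n-gram G
sil = [-0.1, 0.0, -0.2]      # silence log-probs folded into the graph
new_lm = [-0.4, -0.6, -0.2]  # scores from the rescoring LM

# Lattice scores before rescoring: acoustic + graph (old LM + silence).
total = [a + l + s for a, l, s in zip(am, old_lm, sil)]


def rescore(total, saved_lm, new_lm):
    """Subtract the saved LM scores and add the new LM scores, arc by arc."""
    return sum(t - o + n for t, o, n in zip(total, saved_lm, new_lm))


# Case 1: lm_scores copied from the HLG graph scores (old LM + silence).
hlg_lm_scores = [l + s for l, s in zip(old_lm, sil)]
score_hlg = rescore(total, hlg_lm_scores, new_lm)

# Case 2: lm_scores taken at the G level (old LM only).
score_g = rescore(total, old_lm, new_lm)

# The gap between the two cases is exactly the silence contribution.
lost = score_g - score_hlg
```

In case 1 the path score reduces to acoustic plus new LM, so the silence term has been dropped; in case 2 it is retained.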

aarora8 commented 3 years ago

Thank you so much, I will try to use it with position-dependent phones.

csukuangfj commented 3 years ago

"we should perhaps be doing this at the G.fst level"

Using G.lm_scores does indeed give a lower WER than using HLG.lm_scores.


The following screenshots show the results of applying lm_scale during rescoring. (The model was trained on only a 100-hour subset of the training data.)

Rescoring with whole lattice (HLG.lm_scores)

[Screenshot: Screen Shot 2021-06-15 at 2 36 20 PM]

This run uses HLG.lm_scores. Applying lm_scale improves the WER on test-clean from the baseline 5.73 to 5.63.

Rescoring with whole lattice (G.lm_scores)

[Screenshot: Screen Shot 2021-06-15 at 2 36 28 PM]

Using G.lm_scores further improves the WER on test-clean (5.63 --> 5.56).

Rescoring with n-best list (G.lm_scores, num_paths=100)

[Screenshot: Screen Shot 2021-06-15 at 3 08 48 PM]

Rescoring with n-best list (G.lm_scores, num_paths=500)

[Screenshot: Screen Shot 2021-06-15 at 3 08 52 PM]
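For readers unfamiliar with the idea, the effect of sweeping lm_scale over an n-best list can be sketched as follows. This is a toy illustration, not the code in this pull request: the hypotheses and scores are invented, `best_hypothesis` is a hypothetical helper, and it assumes the combined score is am_score + lm_scale * lm_score, which may differ from the exact formula used in the decoding script.

```python
# Hypothetical n-best list: (hypothesis, acoustic log-score, LM log-score).
nbest = [
    ("the cat sat", -10.0, -6.0),
    ("the cat sad", -9.5, -9.0),
    ("a cat sat", -10.8, -5.0),
]


def best_hypothesis(nbest, lm_scale):
    """Return the hypothesis maximizing am_score + lm_scale * lm_score.

    The combination rule is an assumption made for this sketch.
    """
    return max(nbest, key=lambda h: h[1] + lm_scale * h[2])[0]


# Try several LM scales and record which hypothesis wins for each.
lm_scales = [0.1, 0.5, 1.0, 1.5]
results = {scale: best_hypothesis(nbest, scale) for scale in lm_scales}

for scale, hyp in results.items():
    print(f"lm_scale={scale}: {hyp}")
```

A small lm_scale lets the acoustic score dominate and a large one favors the LM, so in practice one would compute the WER for each scale and report the best.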

The test commands are as follows (the first omits --num-paths; the other two set it to 100 and 500 for the n-best runs):

./mmi_att_transformer_decode.py \
  --use-lm-rescoring=1 \
  --max-duration=200 \
  --is-espnet-structure=0 \
  --vgg-frontend=0

./mmi_att_transformer_decode.py \
  --use-lm-rescoring=1 \
  --num-paths=100 \
  --max-duration=200 \
  --is-espnet-structure=0 \
  --vgg-frontend=0

./mmi_att_transformer_decode.py \
  --use-lm-rescoring=1 \
  --num-paths=500 \
  --max-duration=200 \
  --is-espnet-structure=0 \
  --vgg-frontend=0

danpovey commented 3 years ago

Great!! You can merge when you think it's ready.