facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Generate top k hypothesis for CTC decoding (infer.py) #4086

Open happydzhang opened 2 years ago

happydzhang commented 2 years ago

❓ Questions and Help

What is your question?

I am running evaluation for a phone recognition model using Wav2Vec + CTC decoding (running examples/speech_recognition/infer.py). I am using the command in the Code section below and getting the reference and 1-best hypothesis shown here. My question is: is there a way to also generate the top-k hypotheses (HYPO1, HYPO2, ..., HYPOK)? Any pointers are appreciated. Thanks!

2021-12-21 05:08:22 | INFO | __main__ | HYPO:sil ay hh ae d f ey t sp ih n d eh m sp
2021-12-21 05:08:22 | INFO | __main__ | TARGET:sil ay hh ae r d f ey t sil ih n d eh m sil

Code

python3 /home/ec2-user/SageMaker/PronunciationEvaluation/fairseq/examples/speech_recognition/infer.py $DATASET \
    --task audio_finetuning --nbest 5 --path $CKPT --gen-subset test \
    --results-path /home/ec2-user/SageMaker/PronunciationEvaluation/wav2vec2mdd/result \
    --w2l-decoder viterbi --lm-weight 0 --word-score -1 --sil-weight 0 \
    --criterion ctc --labels phn --max-tokens 640000
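(Editor's note, not part of the original thread: Viterbi decoding by construction produces only the single best path, so `--nbest 5` has no visible effect with `--w2l-decoder viterbi`; an n-best list needs a beam-based decoder (e.g. the kenlm/fairseqlm decoders, which rely on the flashlight/wav2letter python bindings) or a separate beam search over the CTC emissions. Below is a minimal, self-contained sketch of a CTC prefix beam search that returns the top-k label sequences from a (T, V) matrix of per-frame log-probabilities. The function name `ctc_prefix_beam_search` and the dummy emissions are illustrative assumptions, not part of fairseq's API.)

```python
import math
from collections import defaultdict

import numpy as np

NEG_INF = -float("inf")


def logsumexp(*args):
    """Numerically stable log(sum(exp(a) for a in args))."""
    a_max = max(args)
    if a_max == NEG_INF:
        return NEG_INF
    return a_max + math.log(sum(math.exp(a - a_max) for a in args))


def ctc_prefix_beam_search(log_probs, blank=0, beam_size=10, topk=5):
    """
    log_probs: (T, V) array of per-frame log-probabilities.
    Returns the top-k prefixes (tuples of label ids) with their log scores.
    """
    T, V = log_probs.shape
    # Each beam entry maps prefix -> (log prob ending in blank, log prob ending in non-blank).
    beam = {(): (0.0, NEG_INF)}

    for t in range(T):
        next_beam = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beam.items():
            for s in range(V):
                p = log_probs[t, s]
                if s == blank:
                    # Blank keeps the prefix unchanged and moves mass into "ends in blank".
                    nb_b, nb_nb = next_beam[prefix]
                    next_beam[prefix] = (logsumexp(nb_b, p_b + p, p_nb + p), nb_nb)
                    continue
                last = prefix[-1] if prefix else None
                new_prefix = prefix + (s,)
                nb_b, nb_nb = next_beam[new_prefix]
                if s == last:
                    # Repeated label: the prefix only grows if a blank separated the repeats,
                    # so only the "ends in blank" mass can extend it.
                    next_beam[new_prefix] = (nb_b, logsumexp(nb_nb, p_b + p))
                    # Otherwise the repeat collapses and the prefix stays the same.
                    cb_b, cb_nb = next_beam[prefix]
                    next_beam[prefix] = (cb_b, logsumexp(cb_nb, p_nb + p))
                else:
                    next_beam[new_prefix] = (nb_b, logsumexp(nb_nb, p_b + p, p_nb + p))
        # Prune to the highest-scoring prefixes.
        beam = dict(
            sorted(next_beam.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True)[:beam_size]
        )

    ranked = sorted(beam.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True)
    return [(prefix, logsumexp(*probs)) for prefix, probs in ranked[:topk]]


# Example usage with dummy emissions (T=6 frames, V=4 symbols, blank at index 0).
rng = np.random.default_rng(0)
dummy = np.log(rng.dirichlet(np.ones(4), size=6))
for labels, score in ctc_prefix_beam_search(dummy, blank=0, beam_size=8, topk=3):
    print(labels, score)
```

Tracking two scores per prefix (ending in blank vs. non-blank) is what lets repeated labels be merged correctly; summing them with logsumexp gives the total prefix probability used for ranking, and mapping the returned label ids through the phone dictionary would yield the top-k phone sequences.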

What's your environment?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!