k2-fsa / kaldi-decoder

Decoders from Kaldi using OpenFst
Apache License 2.0

Can not decode properly with conformer_ctc_en #9

Open safarisadegh opened 5 months ago

safarisadegh commented 5 months ago

Hi, I have tried to use your package to decode audio files with the pre-trained model provided in icefall-asr-librispeech-conformer-ctc-jit-bpe-500-, but sometimes I get an error and sometimes the decoding result is wrong.

This is the procedure I followed. First, I converted HLG.pt to HLG.fst by running: python3 convert-k2-to-openfst.py --olabels aux_labels ./lang_bpe_500/HLG.pt ./lang_bpe_500/HLG.fst

Then I decoded the test_wavs files using the following command: python3 decode_with_HLG.py --nn-model ./lang_bpe_500/cpu_jit.pt --HLG ./lang_bpe_500/HLG.fst --words ./lang_bpe_500/words.txt test_wavs_1089-134686-0001.wav test_wavs_1221-135766-0001.wav test_wavs_1221-135766-0002.wav
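For readers unfamiliar with the --words argument: words.txt is a symbol table that maps each word to an integer ID, and the decoder's output labels are looked up in it to produce text. The following is a minimal sketch of that lookup, assuming the usual one-pair-per-line `<symbol> <id>` layout with ID 0 reserved for epsilon. The helper names `read_symbol_table` and `ids_to_text` and the toy table are hypothetical and are not taken from decode_with_HLG.py.

```python
from typing import Dict, Iterable


def read_symbol_table(lines: Iterable[str]) -> Dict[int, str]:
    """Parse `<symbol> <id>` lines into an id -> symbol mapping."""
    id2sym: Dict[int, str] = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        sym, idx = fields
        id2sym[int(idx)] = sym
    return id2sym


def ids_to_text(ids: Iterable[int], id2sym: Dict[int, str]) -> str:
    """Map output label IDs to words, dropping epsilon (assumed to be ID 0)."""
    return " ".join(id2sym[i] for i in ids if i != 0)


# Toy table for illustration only; a real words.txt is much larger.
toy_words = ["<eps> 0", "HELLO 1", "WORLD 2"]
table = read_symbol_table(toy_words)
print(ids_to_text([1, 0, 2], table))
```

With a real file, the lines could come from `open("./lang_bpe_500/words.txt", encoding="utf-8")`. If the output labels stored in HLG.fst do not refer to the same IDs as words.txt, a lookup like this would yield unrelated words, which is one thing worth checking when a transcript looks scrambled.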

Finally, this is the output:

2024-02-14 10:25:54,563 INFO [decode_with_HLG.py:175] device: cpu
2024-02-14 10:25:54,563 INFO [decode_with_HLG.py:177] Loading torchscript model
2024-02-14 10:25:55,153 INFO [decode_with_HLG.py:182] Loading HLG from ./lang_bpe_500/HLG.fst
2024-02-14 10:25:58,930 INFO [decode_with_HLG.py:187] Constructing Fbank computer
2024-02-14 10:25:58,930 INFO [decode_with_HLG.py:198] Reading sound files: ['test_wavs_1089-134686-0001.wav', 'test_wavs_1221-135766-0001.wav', 'test_wavs_1221-135766-0002.wav']
2024-02-14 10:25:58,991 INFO [decode_with_HLG.py:204] Decoding started
2024-02-14 10:26:03,312 INFO [decode_with_HLG.py:139] test_wavs_1089-134686-0001.wav, torch.Size([165, 500])
2024-02-14 10:26:03,484 INFO [decode_with_HLG.py:139] test_wavs_1221-135766-0001.wav, torch.Size([417, 500])
2024-02-14 10:26:03,957 INFO [decode_with_HLG.py:147] failed to decode test_wavs_1221-135766-0001.wav
2024-02-14 10:26:03,959 INFO [decode_with_HLG.py:139] test_wavs_1221-135766-0002.wav, torch.Size([120, 500])
2024-02-14 10:26:04,116 INFO [decode_with_HLG.py:147] failed to decode test_wavs_1221-135766-0002.wav
2024-02-14 10:26:04,119 INFO [decode_with_HLG.py:235] 
test_wavs_1089-134686-0001.wav:
OLD SNIPER ATE WITH BE THAT NOTHING CUP P AAGE PET WERE HE THE PROUD MISSUS SPENCER GIED HOD ADAR LIFE SEN S LIFEBOAT AT THE TO AGNE AND HA THE

test_wavs_1221-135766-0001.wav:

test_wavs_1221-135766-0002.wav:

2024-02-14 10:26:04,119 INFO [decode_with_HLG.py:237] Decoding Done
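When many files are decoded at once, it can help to pull the failures out of a log like the one above. Below is a small sketch that does this, assuming the log lines keep the two shapes seen here (`<file>, torch.Size([T, C])` for an attempted file and `failed to decode <file>` for a failure). The function name `summarize_log` and the dictionary keys are hypothetical.

```python
import re
from typing import Dict, List

# "... [decode_with_HLG.py:139] <file>, torch.Size([165, 500])"
ATTEMPT_RE = re.compile(r"\] (\S+), torch\.Size\(\[(\d+), (\d+)\]\)")
# "... [decode_with_HLG.py:147] failed to decode <file>"
FAILED_RE = re.compile(r"failed to decode (\S+)")


def summarize_log(log_text: str) -> Dict[str, List[str]]:
    """Split attempted files into those with and without a logged failure."""
    attempted: List[str] = []
    failed: List[str] = []
    for line in log_text.splitlines():
        m = ATTEMPT_RE.search(line)
        if m:
            attempted.append(m.group(1))
            continue
        m = FAILED_RE.search(line)
        if m:
            failed.append(m.group(1))
    not_failed = [f for f in attempted if f not in failed]
    return {"attempted": attempted, "failed": failed, "not_failed": not_failed}


# Lines copied from the log in this issue.
sample_log = """\
2024-02-14 10:26:03,312 INFO [decode_with_HLG.py:139] test_wavs_1089-134686-0001.wav, torch.Size([165, 500])
2024-02-14 10:26:03,484 INFO [decode_with_HLG.py:139] test_wavs_1221-135766-0001.wav, torch.Size([417, 500])
2024-02-14 10:26:03,957 INFO [decode_with_HLG.py:147] failed to decode test_wavs_1221-135766-0001.wav
2024-02-14 10:26:03,959 INFO [decode_with_HLG.py:139] test_wavs_1221-135766-0002.wav, torch.Size([120, 500])
2024-02-14 10:26:04,116 INFO [decode_with_HLG.py:147] failed to decode test_wavs_1221-135766-0002.wav
"""

summary = summarize_log(sample_log)
print("failed:", summary["failed"])
print("no failure logged:", summary["not_failed"])
```

A file listed under "no failure logged" only means the decoder returned something; as the first transcript in this issue shows, the text can still be wrong.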

What is wrong? Thanks for your contributions. @csukuangfj

csukuangfj commented 5 months ago

Could you upload the test wave files?

safarisadegh commented 5 months ago

test_wavs_1.zip (downloaded from this link)

@csukuangfj

safarisadegh commented 4 months ago

Did you test the files? @csukuangfj

csukuangfj commented 4 months ago

Sorry, I missed it.

Could you tell me where you get decode_with_HLG.py?

This model, https://huggingface.co/csukuangfj/icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/tree/main, is from https://github.com/k2-fsa/icefall/tree/master/egs/aishell/ASR/conformer_ctc, and there is no decode_with_HLG.py inside the conformer_ctc directory.

If you wrote decode_with_HLG.py yourself, would you mind sharing it with us?

safarisadegh commented 4 months ago

I got decode_with_HLG.py from the link you provided in the README: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/conformer_ctc/jit_pretrained_decode_with_HLG.py. This is the file I used: decode_with_HLG.zip

@csukuangfj

csukuangfj commented 3 months ago

> This is the procedure I followed. First, I converted HLG.pt to HLG.fst by running: python3 convert-k2-to-openfst.py --olabels aux_labels ./lang_bpe_500/HLG.pt ./lang_bpe_500/HLG.fst

Please generate HLG.fst with the following script instead: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/prepare_lang_fst.py