NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

KeyError when running scripts/decode.py #509

Closed BoneGoat closed 4 years ago

BoneGoat commented 4 years ago

I'm trying to get the beam search decoder to work for my trained model. I have successfully run inference on audio with the greedy decoder, so I know the model works.

First I ran:

    python run.py --mode=infer --config="MODEL_CONFIG" --logdir="MODEL_CHECKPOINT_DIR" --num_gpus=1 --batch_size_per_gpu=1 --decoder_params/use_language_model=False --infer_output_file=model_output.pickle

which worked and I got the pickle file. Then I tried to run:

    python scripts/decode.py --logits=model_output.pickle --labels="CSV_FILE" --lm="LM_BINARY" --vocab="ALPHABET_FILE" --alpha=ALPHA --beta=BETA --beam_width=BEAM_WIDTH

but then I get:

    Traceback (most recent call last):
      File "scripts/decode.py", line 202, in <module>
        probs_batch.append(softmax(logits[audio_filename]))
    KeyError: '/'

/ is the first character of the audio filename in the CSV file, which looks like this:

    wav_filename,transcript
    /root/dev/ekot.wav,bla

If I change the filename to "123.wav", the KeyError is on "1". If I change it to "test.wav", the KeyError is on "t", and so on.
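A KeyError on the first character of the filename is what you would see if `line` were a plain string rather than a row of fields, because indexing a string with `[0]` returns a single character. The following is a minimal sketch of that behaviour, not the code from scripts/decode.py. The `logits` dict, the helper names and the CSV text are hypothetical stand-ins built from the example in this report.

```python
import csv
import io

# Hypothetical stand-in for the pickled logits, keyed by audio filename.
logits = {"/root/dev/ekot.wav": [0.1, 0.9]}

# Same shape as the CSV shown in the report: a header plus one data row.
csv_text = "wav_filename,transcript\n/root/dev/ekot.wav,bla\n"


def first_key_from_string(line):
    # If `line` is a str, line[0] is its first character, not the first field.
    return line[0]


def rows_from_csv(text):
    # csv.reader yields a list of fields for every row, even if there is only one.
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [row for row in reader]


bad_key = first_key_from_string("/root/dev/ekot.wav")
rows = rows_from_csv(csv_text)
good_key = rows[0][0]

print(repr(bad_key), bad_key in logits)
print(repr(good_key), good_key in logits)
```

Under these assumptions, the lookup with `bad_key` fails in the same way as the traceback, while the key taken from a parsed row matches an entry in the dict.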

Best regards

BoneGoat commented 4 years ago

Solved it by changing:

    audio_filename = line[0]

into:

    audio_filename = line

and:

    logits[audio_filename]

into:

    logits.get(audio_filename)

I may have a different NumPy version installed.
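One thing to keep in mind with this workaround: `dict.get` returns `None` for a missing key instead of raising, so a filename mismatch would only surface later, wherever the value is used. A small sketch of a more explicit lookup follows, assuming the dump is a plain dict keyed by filename. `lookup_logits` is a hypothetical helper and not part of OpenSeq2Seq.

```python
def lookup_logits(logits, audio_filename):
    """Return the logits for one file, or raise a KeyError that describes the mismatch."""
    value = logits.get(audio_filename)  # .get returns None instead of raising
    if value is None:
        sample = list(logits)[:3]  # a few keys from the dump, to compare by eye
        raise KeyError(
            f"no logits for {audio_filename!r}; example keys in the dump: {sample}"
        )
    return value


# Hypothetical dump contents, for illustration only.
logits = {"/root/dev/ekot.wav": [0.1, 0.9]}
found = lookup_logits(logits, "/root/dev/ekot.wav")
```

Printing a few of the stored keys next to the failing one makes it easier to tell whether the problem is in how the CSV row is read or in how the filenames were written to the pickle.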

swapnil3597 commented 4 years ago

From where did you download LM_BINARY file?