pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
bad forward .ark file output when out model is a sequential model #235
I ran into an issue when I wanted to decode the outputs of a sequential model. Kaldi fails to read the written .ark files during decoding and throws the following error:
ERROR (latgen-faster-mapped-parallel[5.5.646~1-cdf2]:DecodableMatrixScaledMapped():decoder/decodable-matrix.h:55) DecodableMatrixScaledMapped: mismatch, matrix has 1 cols but transition-model has 1992 pdf-ids.
I traced it down to the following line in core.py:
https://github.com/mravanelli/pytorch-kaldi/blob/775f5dbbf142fb1c1a56604ee603d426ca73a51f/core.py#L663
where the out_save array still contains the singleton dimension that comes from the batch size of the forwarded sequence model. This does not usually happen, because all PyTorch-Kaldi models implement e.g. the softmax function as a separate model of its own. In my case softmax is included in the model class of the sequential model, and this is where it goes wrong.

I just included a small fix where I squeeze the redundant dimension out of out_save with np.squeeze, but this has to be tested before I can make a pull request.
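
For reference, here is a minimal sketch of the kind of squeeze I mean. It is not the actual patch to core.py: the helper name drop_batch_dim and the (frames, 1, pdf_ids) layout are assumptions for illustration. That layout would be consistent with Kaldi reporting "1 cols", but I have not confirmed it against the real tensor.

```python
import numpy as np


def drop_batch_dim(out_save: np.ndarray) -> np.ndarray:
    """Return a 2-D (frames, pdf_ids) matrix from a sequence-model output.

    Assumes (hypothetically) that the batch axis is axis 1 and has size 1.
    Squeezing only that axis avoids collapsing a one-frame utterance,
    which a bare np.squeeze(out_save) would also flatten.
    """
    if out_save.ndim == 3 and out_save.shape[1] == 1:
        return np.squeeze(out_save, axis=1)
    return out_save  # already 2-D (or an unexpected shape): leave untouched


# Dummy posteriors: 50 frames, batch size 1, 1992 pdf-ids (as in the error message).
num_frames, num_pdf_ids = 50, 1992
out_save = np.zeros((num_frames, 1, num_pdf_ids), dtype=np.float32)

fixed = drop_batch_dim(out_save)
print(fixed.shape)  # (50, 1992)
```

Passing an explicit axis is a deliberate choice here: it keeps the number of columns equal to the number of pdf-ids even when an utterance happens to have a single frame.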