mravanelli / pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

bad forward .ark file output when out model is a sequential model #235

Open timolohrenz opened 4 years ago

timolohrenz commented 4 years ago

I ran into an issue when trying to decode the outputs of a sequential model. Kaldi fails to read the written .ark files during decoding and throws the following error:

ERROR (latgen-faster-mapped-parallel[5.5.646~1-cdf2]:DecodableMatrixScaledMapped():decoder/decodable-matrix.h:55) DecodableMatrixScaledMapped: mismatch, matrix has 1 cols but transition-model has 1992 pdf-ids.

I traced it down to the following line in core.py:
https://github.com/mravanelli/pytorch-kaldi/blob/775f5dbbf142fb1c1a56604ee603d426ca73a51f/core.py#L663

At that point the out_save array still contains the singleton batch dimension of the forwarded sequential model, so it is not the two-dimensional matrix that Kaldi expects. This does not usually happen, because the PyTorch-Kaldi models implement functions such as the softmax as a separate model of their own. In my case the softmax is part of the sequential model's class, and this is where it goes wrong.

I added a small fix that squeezes the redundant dimension out of out_save with np.squeeze, but it needs to be tested before I can open a pull request.
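
A minimal sketch of what such a fix could look like, assuming out_save has the shape (frames, 1, pdf_ids) when it reaches the write call. The helper name drop_batch_axis and the example shapes are hypothetical and not part of PyTorch-Kaldi; the sketch squeezes only axis 1 so that an utterance with a single frame is not collapsed as well.

```python
import numpy as np


def drop_batch_axis(out_save: np.ndarray) -> np.ndarray:
    """Return a 2-D (frames, pdf_ids) matrix.

    Hypothetical helper: assumes a 3-D input has the layout
    (frames, batch, pdf_ids) with batch == 1.
    """
    if out_save.ndim == 3 and out_save.shape[1] == 1:
        # Remove only the singleton batch axis; a plain np.squeeze()
        # would also drop the frame axis for a one-frame utterance.
        return np.squeeze(out_save, axis=1)
    # Arrays that are already 2-D are returned unchanged.
    return out_save


# Assumed example: 50 frames and 1992 pdf-ids, as in the error above.
out_save = np.zeros((50, 1, 1992), dtype=np.float32)
fixed = drop_batch_axis(out_save)
print(fixed.shape)  # (50, 1992)
```

With the batch axis removed, the matrix has one column per pdf-id, which is the count that the decoder compares against the transition model.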