mravanelli / pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Production mode save_outputs: getting bad .ark files #228

Closed matthewkperez closed 4 years ago

matthewkperez commented 4 years ago

Hello, I'm currently using the Librispeech dataset and have trained a model following the pytorch-kaldi tutorial. I'm trying to use this trained Librispeech acoustic model to produce embeddings for a speech conversion task. To do this, I created a separate cfg that I use to enter production mode. I feed in the new features for my speech conversion data and save the outputs of out_dnn1 (the last layer before the output layer, which I want to use as embeddings). The pytorch-kaldi production script runs successfully; however, the .ark files produced for out_dnn1 seem to be corrupted. Running copy-feats on them gives an error after the first key. The error is below:

WARNING (copy-feats[5.5.671~1494-e5a5a]:Next():util/kaldi-table-inl.h:562) Invalid archive file format: expected space after key ��M>
ERROR (copy-feats[5.5.671~1494-e5a5a]:~SequentialTableReaderArchiveImpl():util/kaldi-table-inl.h:678) TableReader: error detected closing archive forward_dst_te_ep16_ck0_out_dnn1.ark

Attached are the config file used for production and log.log: log.txt libri_RNN_production.txt

Thanks!

matthewkperez commented 4 years ago

I think I found a workaround. I was able to produce valid .ark files when I changed the [forward] section of libri_RNN_production.cfg to match that of libri_RNN.cfg (the cfg used to train the initial model). More specifically, in the production cfg above I had removed the out_dnn2 output from the [forward] section, which resulted in outputs with an extra dimension: [num_samples, 1, layer_out] as opposed to [num_samples, layer_out].
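For anyone who wants to see the shape difference in isolation, here is a minimal NumPy sketch. It is not taken from pytorch-kaldi: the helper name `to_2d_features` and the sizes are hypothetical, and it only illustrates collapsing the singleton middle axis so that the array has the 2-D [num_samples, layer_out] layout that Kaldi feature matrices use. Whether the extra axis is the actual cause of the corrupted archive is my assumption based on the workaround above.

```python
import numpy as np


def to_2d_features(out):
    """Return `out` as a 2-D float32 matrix of shape [num_samples, layer_out].

    Hypothetical helper for illustration only; not part of pytorch-kaldi.
    """
    out = np.asarray(out, dtype=np.float32)
    # Drop the singleton middle axis: [num_samples, 1, layer_out] -> [num_samples, layer_out]
    if out.ndim == 3 and out.shape[1] == 1:
        out = out[:, 0, :]
    # Anything that is still not 2-D cannot be treated as a feature matrix here
    if out.ndim != 2:
        raise ValueError(f"expected a 2-D matrix, got shape {out.shape}")
    return out


# Hypothetical sizes, chosen only for the demonstration
num_samples, layer_out = 5, 4
bad = np.zeros((num_samples, 1, layer_out), dtype=np.float32)
fixed = to_2d_features(bad)
print(bad.shape, "->", fixed.shape)  # (5, 1, 4) -> (5, 4)
```

Checking the array shape like this before the outputs are written may help confirm whether a given [forward] configuration produces the extra dimension.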

I'm posting the updated libri_RNN_production.cfg below so anyone can see the difference for themselves. libri_RNN_production2.txt