idiap / pkwrap

A pytorch wrapper for LF-MMI training and parallel training in Kaldi

Save Quantized Model #23

Closed jyp0716 closed 2 years ago

jyp0716 commented 2 years ago

Hi, I'm interested in “Quantization of Acoustic Model Parameters in Automatic Speech Recognition Framework”. pkwrap has a load_kaldi_models branch, and I read and ran its code. I followed https://github.com/idiap/pkwrap/blob/load_kaldi_models/egs/librispeech/quant/README.md and ran bash run_forward_pass_quantization.sh $KALDI_ROOT/egs/librispeech/s5, but it did not save a quantized model. How can I get the quantized model?

amrutha-p commented 2 years ago

Hello,

When you run the script, it should create a folder in the directory you run it from. The folder name is set by out_dir=decode7k${filename}_pytorch_qi8_1a, where filename is "test_clean_hires" unless you have changed $eval_data_dir.

In that folder you will find a dequantized.mdl, which is reconstructed from the quantized model parameters. This is done because Kaldi doesn't support writing integer values for model parameters. Hence, we quantize the model parameters during inference.
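To illustrate what "reconstructed from the quantized model parameters" means, here is a minimal sketch of an int8 quantize/dequantize round trip. It assumes a single min-max scale and zero point per tensor; the function names `quantize_int8` and `dequantize_int8` are made up for this example, and the scheme pkwrap actually uses may differ.

```python
def quantize_int8(weights):
    """Map floats to int8 codes with one min-max scale and zero point (assumed scheme)."""
    lo, hi = min(weights), max(weights)
    # Guard against a constant tensor, where hi == lo would give a zero scale.
    scale = (hi - lo) / 255.0 or 1.0
    # Choose the zero point so that `lo` lands on -128 and `hi` on 127.
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_int8(codes, scale, zero_point):
    """Rebuild approximate float values from the int8 codes."""
    return [(c - zero_point) * scale for c in codes]


weights = [-0.52, -0.1, 0.0, 0.27, 0.75]  # toy values, not real model parameters
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The `restored` floats are what a file like dequantized.mdl would hold in this picture: ordinary floating-point values that differ from the originals by at most about half a quantization step.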

Let me know if this helps.

jyp0716 commented 2 years ago

OK, I found dequantized.mdl in that folder. So if I want to save the quantized model, I have to modify the Kaldi code, right? Thanks.

amrutha-p commented 2 years ago

Yes. You will have to modify the source code of TDNN, Batchnorm, etc. to be able to save the integer values.
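If the goal is only to keep the integer values on disk, and not to load them back into Kaldi, one option is a side file next to the model. The sketch below is an assumption for illustration, not something pkwrap or Kaldi provides: the file layout (a small header with scale, zero point, and count, followed by raw int8 bytes) and the names `save_quantized` and `load_quantized` are hypothetical.

```python
import os
import struct
import tempfile
from array import array

HEADER = "<dbI"  # hypothetical layout: float64 scale, int8 zero point, uint32 count


def save_quantized(path, codes, scale, zero_point):
    """Write the header followed by the raw int8 codes."""
    with open(path, "wb") as f:
        f.write(struct.pack(HEADER, scale, zero_point, len(codes)))
        f.write(array("b", codes).tobytes())


def load_quantized(path):
    """Read back the codes, scale, and zero point written by save_quantized."""
    with open(path, "rb") as f:
        scale, zero_point, count = struct.unpack(HEADER, f.read(struct.calcsize(HEADER)))
        codes = array("b")
        codes.frombytes(f.read(count))
    return list(codes), scale, zero_point


# Round trip through a temporary file with toy values.
path = os.path.join(tempfile.mkdtemp(), "layer0.q8")
save_quantized(path, [-128, -44, -24, 30, 127], 0.005, -24)
loaded_codes, loaded_scale, loaded_zero_point = load_quantized(path)
```

Kaldi itself cannot read such a file, so decoding with Kaldi would still go through the dequantized float model unless the component code is changed as described above.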

jyp0716 commented 2 years ago

Thank you! I'll try to modify the source code of Kaldi.