idiap / pkwrap

A pytorch wrapper for LF-MMI training and parallel training in Kaldi

Quantization #24

Closed sparro12 closed 2 years ago

sparro12 commented 2 years ago

In trying to run the quantization, all I have is the model at the moment.

However, the run_forward_pass_quantization.sh seems to require more than this. Would these other folders like the exp_dir come after running training?

Also, path.sh and utils/parse_options.sh don't seem to exist in the load_kaldi_models branch under the quant folder, which means the script errors out early. I copied the quant folder over to the master branch and put the script alongside the librispeech/v1 folder, but utils/parse_options.sh still does not exist.

Could you give some quick guidance on how to do a test run of this?

We've tried Kaldi quantization with ONNX, but after converting with the kaldi-onnx repository, the model can't be quantized because an operation (ReplaceIndex) does not exist in the native ONNX framework. This is therefore our next best option, since we need to run the model on the NPU in quantized form. The model will have to be converted back to ONNX afterward because the NPU only supports TensorFlow Lite, ONNX, Arm NN, and DeepViewRT.

amrutha-p commented 2 years ago

Hello,

The run_forward_pass_quantization.sh script performs quantization of a Kaldi model and decodes with it. It doesn't train anything.

In order to run this script, you need an exp folder containing the model (e.g. final.mdl), the graph_dir, and 0.trans_mdl. For example, if you run

bash run_forward_pass_quantization.sh $KALDI_ROOT/egs/librispeech/s5

the script expects model_dir to be $KALDI_ROOT/egs/librispeech/s5/exp/chain_cleaned/tdnn_7k_1a_sp. You can change the _7k_1a_sp part with the --affix option when running the script. The models and graph should be inside model_dir.
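To make the expected layout concrete, here is a small pre-flight sketch. The directory names (exp/chain_cleaned, tdnn_7k_1a_sp, graph_dir) and file names (final.mdl, 0.trans_mdl) are taken from the description above; the /tmp paths and the dummy-file creation are purely illustrative, so drop them and point EGS_DIR at your real Kaldi recipe:

```shell
#!/bin/sh
# Sketch: check the layout run_forward_pass_quantization.sh expects.
# All paths here are assumptions for illustration; set EGS_DIR/AFFIX for a real run.
EGS_DIR=${EGS_DIR:-/tmp/pkwrap_demo/egs/librispeech/s5}
AFFIX=${AFFIX:-_7k_1a_sp}
MODEL_DIR="$EGS_DIR/exp/chain_cleaned/tdnn${AFFIX}"

# Create a dummy layout so the check below has something to find
# (remove this block when checking a real experiment directory).
mkdir -p "$MODEL_DIR/graph_dir"
touch "$MODEL_DIR/final.mdl" "$MODEL_DIR/0.trans_mdl"

# Verify the files the script needs are in place.
for f in final.mdl 0.trans_mdl; do
  if [ -f "$MODEL_DIR/$f" ]; then
    echo "found $f"
  else
    echo "missing: $MODEL_DIR/$f"
  fi
done
[ -d "$MODEL_DIR/graph_dir" ] && echo "found graph_dir"
```

Changing AFFIX here mirrors passing --affix to the script itself.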

For path.sh and utils/parse_options.sh, you can link the Kaldi path script and utils folder you use into the librispeech/quant path. I haven't added these scripts here because they are the standard Kaldi path and utils folder. I will mention this in the README.
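The linking step could look like the sketch below. The /tmp locations and the dummy-file setup exist only so the example is self-contained; in practice KALDI_S5 is your existing Kaldi s5 recipe and QUANT_DIR is the quant folder in your pkwrap checkout:

```shell
#!/bin/sh
# Sketch: link Kaldi's standard path.sh and utils/ into the quant recipe dir.
# KALDI_S5 and QUANT_DIR defaults are illustrative assumptions.
KALDI_S5=${KALDI_S5:-/tmp/kaldi_demo/egs/librispeech/s5}
QUANT_DIR=${QUANT_DIR:-/tmp/pkwrap_demo_quant}

# Dummy source tree for illustration; a real Kaldi checkout already has these.
mkdir -p "$KALDI_S5/utils" "$QUANT_DIR"
touch "$KALDI_S5/path.sh" "$KALDI_S5/utils/parse_options.sh"

# Symlink the standard helpers so the quant script can find them.
ln -sfn "$KALDI_S5/path.sh" "$QUANT_DIR/path.sh"
ln -sfn "$KALDI_S5/utils"   "$QUANT_DIR/utils"

[ -f "$QUANT_DIR/utils/parse_options.sh" ] && echo "utils linked"
```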

Please let me know if this helps.

groncarolo commented 2 years ago

Hello @amrutha-p,

I would be interested in experimenting with pkwrap quantization too. To get a clean start I am trying to train the librispeech model, but it is taking ages. Is it possible to use a pre-trained model such as the one available here: https://kaldi-asr.org/models/m13 ?

Thanks a lot!

guido

sparro12 commented 2 years ago

@amrutha-p Sorry I'm finally back to working on this.

Yes, this does help. It sounds like I need to train a model first and then use the quantization script.

sparro12 commented 2 years ago

I was able to solve this by obtaining the model folder produced after training. Thank you.