Closed alexandercesarr closed 2 years ago
That script only supports character-based models, specifically models whose code includes quantization nodes, such as QuartzNet, Jasper, and Citrinet. Conformer does not support this.
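For illustration, here is a minimal sketch of why an `unexpected keyword argument 'quantize'` error shows up in this situation. It assumes the script adds a `quantize` flag to the encoder config before instantiating it, which I have not verified here. The classes and the `instantiate` helper are hypothetical stand-ins, not the real NeMo code.

```python
# Hypothetical stand-ins, not the real NeMo classes.
class QuantAwareEncoder:
    """Mimics an encoder whose constructor accepts a `quantize` flag."""

    def __init__(self, feat_in, quantize=False):
        self.feat_in = feat_in
        self.quantize = quantize


class PlainEncoder:
    """Mimics an encoder with no quantization support (no `quantize` parameter)."""

    def __init__(self, feat_in):
        self.feat_in = feat_in


def instantiate(cls, cfg):
    """Build an object from a config dict, as a config-driven loader would."""
    return cls(**cfg)


# Assumption: the quantization script injects `quantize` into the encoder config.
cfg = {"feat_in": 80, "quantize": True}

ok = instantiate(QuantAwareEncoder, cfg)
print(ok.quantize)

error_message = None
try:
    instantiate(PlainEncoder, cfg)
except TypeError as err:
    # The constructor has no `quantize` parameter, so Python rejects the call.
    error_message = str(err)
    print(error_message)
```

Removing the key from the config would make the error go away, but the encoder would still have no quantization nodes, so that alone would not give you a quantized model.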
Thanks @titu1994 for your reply. So if I want to quantize this conformer CTC BPE model, how can I do it? What is your recommendation for this one?
We don't support it, so there are no recommendations as such. We ourselves have not tried it. @Slyne fyi
Hi @alexandercesarr
First of all, if you want to deploy your models on GPUs, keep reading.
TensorRT supports two types of quantization: explicit and implicit. Check their difference here.
Where to get started? Try Post Training Quantization (PTQ) first if you can export the Conformer ONNX model. Then check this simple example. All you need to do is add a calibrator (that is, add a data loader and feed real data so the model can calibrate better).
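To give a rough idea of what calibration does, here is a minimal pure-Python sketch of symmetric int8 quantization with a max-based range. It is not the TensorRT calibrator API, and TensorRT's own calibrators may pick ranges differently. The function names and the data are made up for illustration.

```python
def calibrate_amax(batches):
    """Track the largest absolute value seen across all calibration batches."""
    amax = 0.0
    for batch in batches:
        for x in batch:
            amax = max(amax, abs(x))
    return amax


def quantize(x, scale):
    """Map a float to a signed 8-bit integer, clamped to [-127, 127]."""
    q = round(x / scale)
    return max(-127, min(127, q))


def dequantize(q, scale):
    """Map the integer back to an approximate float value."""
    return q * scale


# Made-up calibration data standing in for real activations.
calibration_batches = [[0.1, -0.5, 0.25], [1.27, -0.9, 0.0]]

amax = calibrate_amax(calibration_batches)  # largest magnitude seen: 1.27
scale = amax / 127.0                        # one scale for the whole tensor

sample = [0.5, -1.27, 2.0]  # 2.0 lies outside the calibrated range
q_values = [quantize(x, scale) for x in sample]
restored = [dequantize(q, scale) for q in q_values]
print(q_values)
print(restored)
```

Values outside the calibrated range get clipped, which is why feeding real, representative data to the calibrator matters for accuracy.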
What if the PTQ model cannot meet the accuracy requirement after quantization?
You may try Quantization Aware Training (QAT). Please follow the pytorch_quantization ResNet example.
@andi4191 also changed the code and added some code to show how to do explicit quantization on the NeMo Conformer. Check here.
Hi @Slyne Thank you very much for your great advice and help. I'll try it. But if I want to deploy my model on CPU, what should I do? Is it the same as in the comment above or not?
If it's CPU, then you can check the PyTorch native quantization pipeline and tutorial to get started.
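As a starting point, here is a minimal sketch of PyTorch dynamic quantization on CPU using `torch.quantization.quantize_dynamic`. `TinyEncoder` is a hypothetical stand-in, not a Conformer, and I have not checked how well this works on a real NeMo Conformer checkpoint, so treat it as an illustration of the API only.

```python
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a trained model; not a Conformer."""

    def __init__(self, feat_in=80, hidden=64, vocab=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_in, hidden),
            nn.ReLU(),
            nn.Linear(hidden, vocab),
        )

    def forward(self, x):
        return self.net(x)


model = TinyEncoder().eval()

# Swap nn.Linear layers for dynamically quantized int8 versions.
# A converted copy is returned; the float model is left unchanged.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

features = torch.randn(4, 80)  # (batch, features), random placeholder input
with torch.no_grad():
    float_out = model(features)
    int8_out = quantized(features)

print(float_out.shape, int8_out.shape)
print((float_out - int8_out).abs().max())  # error introduced by quantization
```

Dynamic quantization only needs the trained weights, so it is the cheapest thing to try first. If the accuracy drop is too large, the static and QAT flows in the PyTorch tutorial are the next step.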
Thank you @Slyne for your help.
Hi, I trained a Conformer model before. Now I wanna quantize that model and convert it to TRT. But when I run speech_to_text_quant_infer_trt.py, an error occurs. The error is:
TypeError: Error instantiating 'nemo.collections.asr.modules.conformer_encoder.ConformerEncoder' : __init__() got an unexpected keyword argument 'quantize'
Could you please help me to solve it?