Closed. LorenzoSun-V closed this issue 2 years ago.
Hi @LorenzoSun-V , this ReID model in V2.5 is quantized by QAT (quantization-aware training). The script for the whole QAT process is not released; the released script is for testing.
Thanks @wenjingk-xilinx! Do you have any plan to release the QAT script? If so, could you disclose when it will be released?
Hi @LorenzoSun-V , we will release the QAT scripts later.
Hi @wenjingk-xilinx, I just found that the QAT scripts have been released. I'll try the latest API. Thanks!
Hi @wenjingk-xilinx, I just took a glance at the released QAT code. Here are some questions about the workflow that confuse me:
Can the whole flow be summarized as: (1) train the float model, where model.py has not yet been modified by the QAT rules, as follows: However, model.py can look like:
(2) prune the float model and generate a slim model (optional).
(3) quantize the (slim) model, and modify model.py by the QAT rules as follows:
Since Add, Cat, ReLU, QuantStub and DeQuantStub have no weights in the checkpoint of the trained float model, the model defined by the modified model.py used in QAT is able to load the checkpoint correctly. Is my understanding correct? I ask because in the released YOLOX project, the float model also contains ops such as Add, Cat, ReLU, QuantStub and DeQuantStub, and I then met some issues with pruning. For more details, see #966.
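To make the question concrete, my understanding of those "QAT rules" looks roughly like the sketch below. It uses plain PyTorch quantization stubs and `FloatFunctional` as stand-ins (Vitis AI's pytorch_nndct exposes analogous QuantStub/DeQuantStub and functional Add/Cat modules); all class names here are illustrative, not from the released code:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import QuantStub, DeQuantStub
from torch.ao.nn.quantized import FloatFunctional

# Sketch of the "QAT rules": every add/cat/ReLU becomes its own dedicated
# module so the quantizer can attach per-op quantization parameters, and
# QuantStub/DeQuantStub mark the float<->quantized boundaries.
class QATBasicBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.relu1 = nn.ReLU(inplace=True)   # dedicated ReLU, not reused
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.skip_add = FloatFunctional()    # replaces the bare "+" op
        self.relu2 = nn.ReLU(inplace=True)   # second dedicated ReLU

    def forward(self, x):
        out = self.relu1(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.skip_add.add(out, x)      # was: out = out + x
        return self.relu2(out)

class QATModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()             # quantize network input
        self.block = QATBasicBlock(16)
        self.dequant = DeQuantStub()         # dequantize network output

    def forward(self, x):
        return self.dequant(self.block(self.quant(x)))
```

Since the stubs and `FloatFunctional` are parameter-free, a checkpoint trained with the unmodified model.py should still load into this structure.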
You should: (1) prune the float model and generate a slim model (optional), keeping model.py unmodified by the QAT rules at this stage; then (2) quantize the (slim) model, modifying model.py by the QAT rules.
Pruning and QAT are fully decoupled; they can't be done together. Remove any QAT-related operations first, then do the pruning. Once pruning is finished, quantize the (slim) model and modify model.py by the QAT rules.
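To put the order in one place, here is a purely illustrative sketch of the decoupled flow. Every name below is a placeholder, not a real Vitis AI API; the actual pruning and QAT calls come from pytorch_nndct and are not shown:

```python
# Illustrative ordering only: pruning happens on the plain float model,
# and model.py is rewritten with the QAT rules only after pruning is done.
def workflow(with_pruning=True):
    steps = ["train float model"]                 # model.py has no QAT ops yet
    if with_pruning:
        steps += ["one_step prune float model",   # QAT ops would break pruning
                  "retrain best slim subnet"]
    steps += ["modify model.py by QAT rules",     # dedicated Add/Cat/ReLU, stubs
              "QAT finetune (slim) model",
              "export deployable model and xmodel"]
    return steps
```

For example, `workflow()` lists all six steps in order, and `workflow(with_pruning=False)` skips straight from float training to the QAT modification.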
If anything is still unclear, please contact me.
Thanks for your patient reply.
Since the backbone of my ReID model is ResNet50 and the ReID dataset is Market1501, it's much easier and faster to get results, so I'll work on the ReID project to try the whole workflow first.
I'll contact you if I meet confusing bugs.
Hi @DehuaTang, I've tried the new pruning & quantization APIs to deploy my model and got the xmodel successfully. But there is a drop in metrics on VCK5000 compared to the GPU. I use one_step pruning and QAT to get the xmodel. Here is my workflow.
My original ReID float model's metrics on Market1501 are:
After searching subnets with sparsity=0.5 using the one_step_pruner, I retrained the best subnet and its metrics are: They are even higher than my original pth.
Then I used QAT to deploy the slim model (modifying model.py and inserting a lot of operators) and got these metrics on GPU:
However, I got lower metrics after running the generated xmodel to extract features of the query and gallery images:
The details of the prototxt are: The reid_param comes from this reply.
Do you have any ideas about the loss of metrics?
I think the difference between the Python and C++ implementations of the pre-/post-processing causes this problem. You can check it!
Hi, @DehuaTang .
I've used the officially released ReID xmodel and slim model with sparsity=0.5. I compiled the xmodel and extracted the last BN params into the prototxt. I tested it with my C++ code and evaluated the metrics with Python code, and got: These are the same as the released README:
This means my pre-/post-processing code should be correct. Here is my QAT code:
Can this export a correct xmodel?
```python
deployable_model = qat_processor.deployable_model(config.QAT_OUTPUT_DIR, used_for_xmodel=True)
tmp_input = torch.randn([1, 3, 128, 256], dtype=torch.float32)
tmp_output = deployable_model(tmp_input)
qat_processor.export_xmodel(config.QAT_OUTPUT_DIR, deploy_check=True)
```
Try this!
Maybe

```python
tmp_input = torch.randn([1, 3, 128, 256], dtype=torch.float32)
```

should be

```python
tmp_input = torch.randn([1, 3, 256, 128], dtype=torch.float32)
```

? (PyTorch tensors are NCHW, and ReID person crops are typically 256 high by 128 wide.) This is just an example; it actually depends on your needs.
Thanks, I'll try this. :full_moon_with_face:
I'm here again. :sweat_smile:
I've tried the code above and it didn't work. But an idea just occurred to me: the bn_params in the prototxt are used to load the final standalone BN layer's params during post-processing, but I trained the model with QAT, which means the params of that BN layer have changed. So I extracted the standalone BN params from deployable.pth (exported by QAT), replaced them in the prototxt, and got:
The metrics are better than before, but there is still a small loss compared to those on the GPU. I'll try to debug it sometime, but for now it seems OK. Thanks @DehuaTang .
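For reference, the replacement step was roughly the sketch below. The checkpoint is assumed to be already loaded as a flat mapping of parameter name to a list of floats (e.g. via `torch.load` plus `.tolist()`); the key prefix `"bottleneck"` and the emitted layout are assumptions, not the exact reid prototxt grammar:

```python
# Sketch: pull the final standalone BN layer's parameters out of the QAT
# deployable checkpoint and format them as a prototxt-style bn_params block,
# so post-processing uses the QAT-updated statistics.
def format_bn_params(state_dict, prefix="bottleneck"):
    fields = ["weight", "bias", "running_mean", "running_var"]
    lines = []
    for field in fields:
        values = state_dict[prefix + "." + field]
        joined = ", ".join("%.6f" % v for v in values)
        lines.append("  %s: [%s]" % (field, joined))
    return "bn_params {\n" + "\n".join(lines) + "\n}"
```

The returned text then replaces the stale bn_params block in the prototxt by hand (or with a small script).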
I think this error is acceptable. The Python and C++ versions of the preprocessing interpolation may differ slightly.
OK, thanks.
Hello, could you please share the QAT script that you used? Here is my email: ikrame.bgr@gmail.com. I am getting an error with qat_processor; here is the error I am having:
Hi @IkrameBeggar, I just committed it to the repo; you can refer to this script. But it's based on an older version of Vitis AI. Hope it's helpful.
Thank you very much. I really appreciate it
In Vitis-AI 2.0, we first used mode='calib' to export the quant config and then used mode='test' to dump the xmodel. But in Vitis-AI 2.5, the released quantization code is quite different from 2.0: the quantizing modes are 'train', 'deploy' and 'test' respectively. So what is the order for exporting the xmodel?
The released script only uses the 'test' quantizing mode: If I want to obtain a quantized model, should I set the mode to 'train' first and then change it to 'deploy' to get the xmodel?
Looking forward to hearing from you.