Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

quant_mode in pt_personreid-res50_market1501_256_128_5.3G_2.5? #912

Closed LorenzoSun-V closed 2 years ago

LorenzoSun-V commented 2 years ago

In Vitis-AI 2.0, we first ran with mode='calib' to export the quantization config and then ran with mode='test' to dump the xmodel. In Vitis-AI 2.5, however, the released quantization code differs substantially from 2.0: the quantization modes are 'train', 'deploy', and 'test'. What is the correct order of steps for exporting the xmodel?
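
For concreteness, the 2.0-style flow we used looked roughly like this (a sketch assuming the pytorch_nndct torch_quantizer API; build_model and evaluate stand in for our own code):

    import torch
    from pytorch_nndct.apis import torch_quantizer

    model = build_model().eval()                 # placeholder: the float ReID model
    dummy_input = torch.randn([1, 3, 256, 128])  # NCHW, H=256, W=128

    # Pass 1: mode='calib' collects activation statistics and exports the config
    quantizer = torch_quantizer('calib', model, (dummy_input,), output_dir='quantize_result')
    evaluate(quantizer.quant_model)              # placeholder: forward passes for calibration
    quantizer.export_quant_config()

    # Pass 2: mode='test' evaluates the quantized model and dumps the xmodel
    quantizer = torch_quantizer('test', model, (dummy_input,), output_dir='quantize_result')
    evaluate(quantizer.quant_model)
    quantizer.export_xmodel(output_dir='quantize_result', deploy_check=True)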

The released script only covers testing the quantized model (screenshot). If I want to obtain a quantized model, should I first set the mode to 'train' and then switch it to 'deploy' to get the xmodel?

Looking forward to hearing from you.

wenjingk-xilinx commented 2 years ago

Hi @LorenzoSun-V, this ReID model in V2.5 is quantized by QAT (quantization-aware training). The script for the full QAT process has not been released; the released script only covers testing.

LorenzoSun-V commented 2 years ago

Thanks @wenjingk-xilinx! Do you have any plans to release the QAT script? If so, could you share when it will be released?

wenjingk-xilinx commented 2 years ago

Hi @LorenzoSun-V, we will release the QAT scripts later.

LorenzoSun-V commented 2 years ago

Hi @wenjingk-xilinx, I just noticed the QAT scripts have been released. I'll try the latest API. Thanks!

LorenzoSun-V commented 2 years ago

Hi @wenjingk-xilinx, I've taken a first look at the released QAT code. Here are some questions about the workflow that are confusing me:

  1. Can the whole flow be summarized as follows? (1) Train the float model; at this stage model.py has not been modified by the QAT rules (see the attached screenshots), although model.py could also already be written in the QAT style (screenshot).

     (2) Prune the float model and generate a slim model (optional).

     (3) Quantize the (slim) model, modifying model.py according to the QAT rules (screenshot), roughly as sketched below.
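
In other words, the QAT-rule version of model.py would look roughly like this (my sketch, assuming the pytorch_nndct QAT modules; import paths may differ between releases):

    import torch.nn as nn

    # Assumed import paths for the QAT modules; check your pytorch_nndct version
    from pytorch_nndct.nn import QuantStub, DeQuantStub
    from pytorch_nndct.nn.modules import functional

    class ResidualBlockQat(nn.Module):
        """Residual block rewritten per the QAT rules: '+' becomes a module."""

        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)
            self.skip_add = functional.Add()   # replaces 'out + identity'

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            return self.skip_add(out, x)       # module call instead of '+'

    class ModelQat(nn.Module):
        def __init__(self):
            super().__init__()
            self.quant_stub = QuantStub()      # marks where quantization starts
            self.block = ResidualBlockQat(64)
            self.dequant_stub = DeQuantStub()  # marks where quantization ends

        def forward(self, x):
            x = self.quant_stub(x)
            x = self.block(x)
            return self.dequant_stub(x)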

LorenzoSun-V commented 2 years ago

Since Add, Cat, ReLU, QuantStub and DeQuantStub carry no weights in the checkpoint of the trained float model, the QAT-modified model.py should be able to load that checkpoint correctly. Is that the right way to understand it? I ask because in the released YOLOX project the float model already contains ops such as Add, Cat, ReLU, QuantStub and DeQuantStub, and I then ran into some pruning issues. For details, see #966.
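
A quick way I could sanity-check that understanding (a sketch only; ModelQat and 'float_ckpt.pth' are placeholders for my own model class and checkpoint):

    import torch

    model_qat = ModelQat()  # model.py rewritten per the QAT rules (placeholder class)
    state = torch.load('float_ckpt.pth', map_location='cpu')

    # Add/Cat/ReLU/QuantStub/DeQuantStub contribute no parameters, so the float
    # checkpoint should load cleanly; any mismatch would show up in these lists.
    missing, unexpected = model_qat.load_state_dict(state, strict=False)
    print('missing:', missing)
    print('unexpected:', unexpected)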

DehuaTang commented 2 years ago

You should: (1) prune the float model and generate a slim model (optional); (2) keep model.py unmodified by the QAT rules during pruning; (3) then quantize the (slim) model and modify model.py according to the QAT rules.

DehuaTang commented 2 years ago

Pruning and QAT are fully decoupled; they cannot be done together. Remove any QAT-related modifications first, then do the pruning. Once pruning is finished, quantize the (slim) model and modify model.py according to the QAT rules, roughly as in the sketch below.
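
In pseudo-code, the decoupled order looks roughly like this (a sketch based on the one_step pruning and QatProcessor examples; argument names may differ between releases, and calib_fn, eval_fn, val_loader and the model classes are placeholders):

    import torch
    from pytorch_nndct import QatProcessor, get_pruning_runner

    float_model = FloatModel()                   # placeholder: unmodified model.py
    dummy_input = torch.randn([1, 3, 256, 128])

    # Step 1: prune the float model (no QAT ops present at this stage)
    runner = get_pruning_runner(float_model, dummy_input, 'one_step')
    runner.search(gpus=['0'], calibration_fn=calib_fn, calib_args=(val_loader,),
                  eval_fn=eval_fn, eval_args=(val_loader,), num_subnet=10,
                  removal_ratio=0.5)
    slim_model = runner.prune(removal_ratio=0.5)
    # ... retrain slim_model to recover accuracy and save its checkpoint ...

    # Step 2: only now rewrite model.py per the QAT rules, load the slim
    # checkpoint, and run quantization-aware training
    qat_model = SlimModelQat()                   # placeholder: QAT-rule slim model
    qat_processor = QatProcessor(qat_model, (dummy_input,), bitwidth=8)
    trainable_model = qat_processor.trainable_model()
    # ... QAT fine-tuning loop on trainable_model ...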

If you have any confusion, please contact me.

LorenzoSun-V commented 2 years ago

Thanks for your patient reply.

Since the backbone of my ReID model is ResNet50 and the dataset is Market1501, it's relatively quick and easy to get results, so I'll trial the whole workflow on the ReID project first.

I'll contact you if I run into confusing bugs.

LorenzoSun-V commented 2 years ago

Hi @DehuaTang, I've used the new pruning and quantization APIs to deploy my model and exported the xmodel successfully. However, there is a loss of metrics on VCK5000 compared to GPU. I used one_step pruning and QAT to produce the xmodel. Here is my workflow.

My original ReID float model's metrics on Market1501 are shown in the first screenshot.

After searching for subnets at sparsity 0.5 with the one_step pruner, I retrained the best subnet; its metrics (screenshot) are even higher than those of my original .pth.

Then I used QAT to deploy the slim model (modifying model.py and inserting many operators) and got these metrics on GPU (screenshot).

However, I got lower metrics after running the generated xmodel to extract features of the query and gallery images (screenshot).

The details of the prototxt are shown in the last screenshot; the reid_param follows this reply.

Do you have any ideas about the loss of metrics?

DehuaTang commented 2 years ago

I think the difference between the Python and C++ implementations of the pre/post-processing is causing this problem. You can check it!

LorenzoSun-V commented 2 years ago

Hi, @DehuaTang .

I've used the officially released ReID xmodel and slim model with sparsity 0.5. I compiled the xmodel and extracted the last BN layer's params into the prototxt. I then tested it with my C++ code and evaluated the metrics with Python code; the results (screenshot) match the released README (screenshot).

This suggests my pre/post-processing code is correct. Here is my QAT code (screenshot).

Can this export a correct xmodel?

DehuaTang commented 2 years ago
    # Build the deployable (inference-only) model from the QAT result
    deployable_model = qat_processor.deployable_model(config.QAT_OUTPUT_DIR, used_for_xmodel=True)
    # Run one dummy forward pass so the graph is traced before export
    tmp_input = torch.randn([1, 3, 128, 256], dtype=torch.float32)
    tmp_output = deployable_model(tmp_input)
    # Export the xmodel; deploy_check verifies the dumped graph against the model
    qat_processor.export_xmodel(config.QAT_OUTPUT_DIR, deploy_check=True)

Try this!

LorenzoSun-V commented 2 years ago
> deployable_model = qat_processor.deployable_model(config.QAT_OUTPUT_DIR, used_for_xmodel=True)
> tmp_input = torch.randn([1, 3, 128, 256], dtype=torch.float32)
> tmp_output = deployable_model(tmp_input)
> qat_processor.export_xmodel(config.QAT_OUTPUT_DIR, deploy_check=True)
>
> Try this!

Maybe tmp_input = torch.randn([1, 3, 128, 256], dtype=torch.float32) should be tmp_input = torch.randn([1, 3, 256, 128], dtype=torch.float32), since the model's input resolution is 256×128 (H×W)?

DehuaTang commented 2 years ago

That was just an example; the actual shape depends on your needs.

LorenzoSun-V commented 2 years ago

Thanks, I'll try this. :full_moon_with_face:

LorenzoSun-V commented 2 years ago

I'm here again. :sweat_smile:

I've tried the code above, and it didn't work. But then an idea occurred to me: the bn_params in the prototxt are used to load the final BN layer's parameters during post-processing, but I trained the model with QAT, which means that BN layer's parameters have changed. So I extracted the BN params from deployable.pth (exported by QAT) and substituted them into the prototxt, getting the results in the screenshot.
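
Roughly what I did (a sketch; 'bnneck' is a hypothetical name for the final BN layer and depends on the model definition):

    import torch

    # Load the QAT-exported deployable checkpoint
    state = torch.load('deployable.pth', map_location='cpu')

    # 'bnneck' is a placeholder for the final BN layer's name in model.py
    prefix = 'bnneck.'
    bn = {k[len(prefix):]: v for k, v in state.items() if k.startswith(prefix)}

    # These four tensors are what the prototxt's bn_params should carry
    for name in ('weight', 'bias', 'running_mean', 'running_var'):
        print(name, bn[name].flatten()[:5])  # spot-check the values to copy over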

The metrics are better than before but still slightly lower than those on GPU. I'll try to debug it at some point, but for now it seems OK. Thanks @DehuaTang.

DehuaTang commented 2 years ago

I think this error is acceptable. The Python and C++ versions of the preprocessing interpolation may differ slightly.
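
For example, a quick way to measure such a discrepancy (a sketch assuming PIL on the Python side and OpenCV on the C++ side; the two libraries implement bilinear resizing slightly differently):

    import cv2
    import numpy as np
    from PIL import Image

    # A random 512x256 test image with 3 channels
    img = (np.random.rand(512, 256, 3) * 255).astype(np.uint8)

    # OpenCV bilinear resize (what a C++ OpenCV pipeline typically does)
    a = cv2.resize(img, (128, 256), interpolation=cv2.INTER_LINEAR)

    # PIL bilinear resize (common in Python/PyTorch preprocessing)
    b = np.asarray(Image.fromarray(img).resize((128, 256), Image.BILINEAR))

    # The two results are close but not bit-identical
    print('max abs diff:', np.abs(a.astype(int) - b.astype(int)).max())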

LorenzoSun-V commented 2 years ago

OK, thanks.

IkrameBeggar commented 1 year ago

> Hi @DehuaTang,
>
> I've used the officially released ReID xmodel and slim model with sparsity 0.5. I compiled the xmodel and extracted the last BN layer's params into the prototxt. I then tested it with my C++ code and evaluated the metrics with Python code; the results match the released README.
>
> This suggests my pre/post-processing code is correct. Here is my QAT code.
>
> Can this export a correct xmodel?

Hello, can you please share the QAT script you used? Here is my email: ikrame.bgr@gmail.com. I am getting an error with qat_processor (screenshot attached).

LorenzoSun-V commented 11 months ago

Hi @IkrameBeggar, I've just committed to the repo; you can refer to this script. Note that it's based on an older version of Vitis AI. Hope it helps.

IkrameBeggar commented 11 months ago

> Hi @IkrameBeggar, I've just committed to the repo; you can refer to this script. Note that it's based on an older version of Vitis AI. Hope it helps.

Thank you very much. I really appreciate it.