Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

There are large differences between qat_processor.deployable_model and qat_processor.trainable_model #1359


zhihaofan commented 1 year ago

After I finished quantization-aware training, I found that the test results of the model returned by qat_processor.trainable_model(allow_reused_module=True) were normal, but when I deployed it with qat_processor.deployable_model(args.output_dir, used_for_xmodel=True), the converted model produced very large errors. Below is my code for the quant and deploy branches:

if args.quant:
    # Dummy inputs used to trace the three-input network for quantization.
    input1 = torch.randn(1, 3, 352, 480).to(args.device)
    input2 = torch.randn(1, 3, 352, 480).to(args.device)
    input3 = torch.randn(1, 3, 352, 480).to(args.device)
    inputs = (input1, input2, input3)
    qat_processor = QatProcessor(net, inputs=inputs, bitwidth=8)

    # Get the trainable quantized model, convert it to a deployable model
    # saved under args.output_dir, then validate the trainable model.
    quantized_model = qat_processor.trainable_model(allow_reused_module=True)
    deployable_net = qat_processor.to_deployable(quantized_model.cpu(), output_dir=args.output_dir)
    val(valid_loader, quantized_model, args.batch_size, dtype, device, -1, args.output_dir)
elif args.deploy:
    input1 = torch.randn(1, 3, 352, 480).to(args.device)
    input2 = torch.randn(1, 3, 352, 480).to(args.device)
    input3 = torch.randn(1, 3, 352, 480).to(args.device)
    inputs = (input1, input2, input3)

    # Rebuild the processor, load the deployable model from args.output_dir,
    # run a forward pass with the dummy inputs, and export the xmodel.
    qat_processor = QatProcessor(net, inputs=inputs, bitwidth=8)
    deployable_model = qat_processor.deployable_model(args.output_dir, used_for_xmodel=True)
    deployable_model(inputs)
    qat_processor.export_xmodel(args.output_dir, deploy_check=True)

    # Validate the deployable model.
    val(valid_loader, deployable_model, args.batch_size, dtype, device, -1, args.output_dir)
    exit('====================================== deploy completed! ==============================================')

When I run the quant branch, the VAL results are shown in Figure 1, and when I run the deploy branch, the VAL results are shown in Figure 2. The inputs and models are the same for both branches, but the deploy results are much worse.

[Figure 1: validation results from the quant branch]

[Figure 2: validation results from the deploy branch]

Any help would be great. Thanks!
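For context, here is a minimal sketch of the two-phase flow the code above follows, as I read it: to_deployable writes the converted model into output_dir in the quant phase, and deployable_model reads it back from the same output_dir in the deploy phase. It reuses only the QatProcessor calls that already appear in the issue; the import path is the one used in the Vitis AI QAT examples, and TinyNet, output_dir, and the omitted training loop are placeholders, not the issue author's actual code.

import torch
from pytorch_nndct import QatProcessor  # assumed import path, as in the Vitis AI QAT examples

class TinyNet(torch.nn.Module):
    """Placeholder three-input model standing in for the issue's `net`."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x1, x2, x3):
        return self.conv(x1) + self.conv(x2) + self.conv(x3)

net = TinyNet()
output_dir = "./qat_output"  # the directory that links the two phases
inputs = tuple(torch.randn(1, 3, 352, 480) for _ in range(3))

# --- Phase 1 (the quant branch): get a trainable quantized model,
#     fine-tune it, then convert it and save the deployable version.
qat_processor = QatProcessor(net, inputs=inputs, bitwidth=8)
quantized_model = qat_processor.trainable_model(allow_reused_module=True)
# ... fine-tune quantized_model here (training loop omitted) ...
deployable_net = qat_processor.to_deployable(quantized_model.cpu(), output_dir=output_dir)

# --- Phase 2 (the deploy branch): rebuild the processor, load the
#     deployable model back from output_dir, run one batch-size-1
#     forward pass, then export the xmodel.
qat_processor = QatProcessor(net, inputs=inputs, bitwidth=8)
deployable_model = qat_processor.deployable_model(output_dir, used_for_xmodel=True)
deployable_model(*inputs)  # unpacked here because TinyNet.forward takes three tensors
qat_processor.export_xmodel(output_dir, deploy_check=True)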

zhihaofan commented 1 year ago

Regarding the input: to minimize the error between PyTorch training and quantization, I mapped the input to the range (-1, 1), i.e., the original image values were in (0, 255) and I fed input / 127 - 1 to the network. Could this step cause the problem? A previous attempt that used input / 255 did not run into this issue.
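For clarity, a minimal sketch of the two normalizations described above; the helper names are hypothetical and not part of Vitis AI.

import torch

def normalize_signed(img_u8: torch.Tensor) -> torch.Tensor:
    # Maps 0..255 roughly onto (-1, 1): 0 -> -1.0, 255 -> ~1.008
    return img_u8.float() / 127.0 - 1.0

def normalize_unsigned(img_u8: torch.Tensor) -> torch.Tensor:
    # Maps 0..255 onto (0, 1): 0 -> 0.0, 255 -> 1.0
    return img_u8.float() / 255.0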