Xilinx / Vitis-AI-Tutorials

MIT License

The compiled model failed in "overlay.load_model" #52

Closed SoldierChen closed 2 years ago

SoldierChen commented 2 years ago

Hi all,

I followed the example in https://github.com/Xilinx/Vitis-AI-Tutorials/tree/master/Design_Tutorials/09-mnist_pyt, successfully quantized my model, and then compiled it to CNN_zcu102.xmodel.
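
For reference, the compile step follows the tutorial's compile.sh, roughly like this (a sketch; the build paths and file names below are placeholders from my setup):

```shell
# Compile the quantized xmodel for the target DPU, as in the tutorial's
# compile.sh (paths and names are placeholders)
vai_c_xir \
    -x build/quant_model/CNN_int.xmodel \
    -a /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json \
    -o build/compiled_model \
    -n CNN_zcu102
```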

However, when I load the xmodel with the PYNQ DPU overlay on a KV260, it shows the following error. Any advice on the problem?

[screenshot: error raised by overlay.load_model]

Note that I can successfully load the xmodel compiled from the example model. The problem appears only when I switch to my own model, and there are no warnings during quantization or compilation.
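
The loading code on the board is the standard pynq_dpu flow, roughly (a minimal sketch; dpu.bit is the DPU bitstream bundled with the Kria PYNQ image):

```python
from pynq_dpu import DpuOverlay

# Load the DPU bitstream, then load the compiled xmodel onto it
overlay = DpuOverlay("dpu.bit")
overlay.load_model("CNN_zcu102.xmodel")  # <-- this is the call that fails
```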

I attach the quantized model for your reference:

```python
# GENETARED BY NNDCT, DO NOT EDIT!

import torch
import pytorch_nndct as py_nndct

class PoseResNet(torch.nn.Module):
    def __init__(self):
        super(PoseResNet, self).__init__()
        self.module_0 = py_nndct.nn.Input() #PoseResNet::input_0
        self.module_1 = py_nndct.nn.Conv2d(in_channels=3, out_channels=64, kernel_size=[7, 7], stride=[2, 2], padding=[3, 3], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Conv2d[conv1]/input.2
        self.module_3 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/ReLU[relu]/3604
        self.module_4 = py_nndct.nn.MaxPool2d(kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], ceil_mode=False) #PoseResNet::PoseResNet/MaxPool2d[maxpool]/input.4
        self.module_5 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/Conv2d[conv1]/input.5
        self.module_7 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/ReLU[relu]/input.7
        self.module_8 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/Conv2d[conv2]/input.8
        self.module_10 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/input.9
        self.module_11 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[0]/ReLU[relu]/input.10
        self.module_12 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/Conv2d[conv1]/input.11
        self.module_14 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/ReLU[relu]/input.13
        self.module_15 = py_nndct.nn.Conv2d(in_channels=64, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/Conv2d[conv2]/input.14
        self.module_17 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/input.15
        self.module_18 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer1]/BasicBlock[1]/ReLU[relu]/input.16
        self.module_19 = py_nndct.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Conv2d[conv1]/input.17
        self.module_21 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/ReLU[relu]/input.19
        self.module_22 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Conv2d[conv2]/input.20
        self.module_24 = py_nndct.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.21
        self.module_26 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/input.22
        self.module_27 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[0]/ReLU[relu]/input.23
        self.module_28 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/Conv2d[conv1]/input.24
        self.module_30 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/ReLU[relu]/input.26
        self.module_31 = py_nndct.nn.Conv2d(in_channels=128, out_channels=128, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/Conv2d[conv2]/input.27
        self.module_33 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/input.28
        self.module_34 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer2]/BasicBlock[1]/ReLU[relu]/input.29
        self.module_35 = py_nndct.nn.Conv2d(in_channels=128, out_channels=256, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Conv2d[conv1]/input.30
        self.module_37 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/ReLU[relu]/input.32
        self.module_38 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Conv2d[conv2]/input.33
        self.module_40 = py_nndct.nn.Conv2d(in_channels=128, out_channels=256, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.34
        self.module_42 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/input.35
        self.module_43 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[0]/ReLU[relu]/input.36
        self.module_44 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/Conv2d[conv1]/input.37
        self.module_46 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/ReLU[relu]/input.39
        self.module_47 = py_nndct.nn.Conv2d(in_channels=256, out_channels=256, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/Conv2d[conv2]/input.40
        self.module_49 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/input.41
        self.module_50 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer3]/BasicBlock[1]/ReLU[relu]/input.42
        self.module_51 = py_nndct.nn.Conv2d(in_channels=256, out_channels=512, kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Conv2d[conv1]/input.43
        self.module_53 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/ReLU[relu]/input.45
        self.module_54 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Conv2d[conv2]/input.46
        self.module_56 = py_nndct.nn.Conv2d(in_channels=256, out_channels=512, kernel_size=[1, 1], stride=[2, 2], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/Sequential[downsample]/Conv2d[0]/input.47
        self.module_58 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/input.48
        self.module_59 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[0]/ReLU[relu]/input.49
        self.module_60 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/Conv2d[conv1]/input.50
        self.module_62 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/ReLU[relu]/input.52
        self.module_63 = py_nndct.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/Conv2d[conv2]/input.53
        self.module_65 = py_nndct.nn.Add() #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/input.54
        self.module_66 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[layer4]/BasicBlock[1]/ReLU[relu]/4125
        self.module_67 = py_nndct.nn.ConvTranspose2d(in_channels=512, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[0]/input.55
        self.module_69 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[2]/4151
        self.module_70 = py_nndct.nn.ConvTranspose2d(in_channels=256, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[3]/input.57
        self.module_72 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[5]/4177
        self.module_73 = py_nndct.nn.ConvTranspose2d(in_channels=256, out_channels=256, kernel_size=[4, 4], stride=[2, 2], padding=[1, 1], output_padding=[0, 0], groups=1, bias=True, dilation=[1, 1]) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ConvTranspose2d[6]/input.59
        self.module_75 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[deconv_layers]/ReLU[8]/input.61
        self.module_76 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/Conv2d[0]/input.62
        self.module_77 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/ReLU[1]/input.63
        self.module_78 = py_nndct.nn.Conv2d(in_channels=64, out_channels=3, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[hm_cen]/Conv2d[2]/4242
        self.module_79 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/Conv2d[0]/input.64
        self.module_80 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/ReLU[1]/input.65
        self.module_81 = py_nndct.nn.Conv2d(in_channels=64, out_channels=2, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[cen_offset]/Conv2d[2]/4281
        self.module_82 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[direction]/Conv2d[0]/input.66
        self.module_83 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[direction]/ReLU[1]/input.67
        self.module_84 = py_nndct.nn.Conv2d(in_channels=64, out_channels=2, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[direction]/Conv2d[2]/4320
        self.module_85 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[z_coor]/Conv2d[0]/input.68
        self.module_86 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[z_coor]/ReLU[1]/input.69
        self.module_87 = py_nndct.nn.Conv2d(in_channels=64, out_channels=1, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[z_coor]/Conv2d[2]/4359
        self.module_88 = py_nndct.nn.Conv2d(in_channels=256, out_channels=64, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[dim]/Conv2d[0]/input.70
        self.module_89 = py_nndct.nn.ReLU(inplace=True) #PoseResNet::PoseResNet/Sequential[dim]/ReLU[1]/input
        self.module_90 = py_nndct.nn.Conv2d(in_channels=64, out_channels=3, kernel_size=[1, 1], stride=[1, 1], padding=[0, 0], dilation=[1, 1], groups=1, bias=True) #PoseResNet::PoseResNet/Sequential[dim]/Conv2d[2]/4398

    def forward(self, *args):
        output_module_0 = self.module_0(input=args[0])
        output_module_0 = self.module_1(output_module_0)
        output_module_0 = self.module_3(output_module_0)
        output_module_0 = self.module_4(output_module_0)
        output_module_5 = self.module_5(output_module_0)
        output_module_5 = self.module_7(output_module_5)
        output_module_5 = self.module_8(output_module_5)
        output_module_5 = self.module_10(input=output_module_5, other=output_module_0, alpha=1)
        output_module_5 = self.module_11(output_module_5)
        output_module_12 = self.module_12(output_module_5)
        output_module_12 = self.module_14(output_module_12)
        output_module_12 = self.module_15(output_module_12)
        output_module_12 = self.module_17(input=output_module_12, other=output_module_5, alpha=1)
        output_module_12 = self.module_18(output_module_12)
        output_module_19 = self.module_19(output_module_12)
        output_module_19 = self.module_21(output_module_19)
        output_module_19 = self.module_22(output_module_19)
        output_module_24 = self.module_24(output_module_12)
        output_module_19 = self.module_26(input=output_module_19, other=output_module_24, alpha=1)
        output_module_19 = self.module_27(output_module_19)
        output_module_28 = self.module_28(output_module_19)
        output_module_28 = self.module_30(output_module_28)
        output_module_28 = self.module_31(output_module_28)
        output_module_28 = self.module_33(input=output_module_28, other=output_module_19, alpha=1)
        output_module_28 = self.module_34(output_module_28)
        output_module_35 = self.module_35(output_module_28)
        output_module_35 = self.module_37(output_module_35)
        output_module_35 = self.module_38(output_module_35)
        output_module_40 = self.module_40(output_module_28)
        output_module_35 = self.module_42(input=output_module_35, other=output_module_40, alpha=1)
        output_module_35 = self.module_43(output_module_35)
        output_module_44 = self.module_44(output_module_35)
        output_module_44 = self.module_46(output_module_44)
        output_module_44 = self.module_47(output_module_44)
        output_module_44 = self.module_49(input=output_module_44, other=output_module_35, alpha=1)
        output_module_44 = self.module_50(output_module_44)
        output_module_51 = self.module_51(output_module_44)
        output_module_51 = self.module_53(output_module_51)
        output_module_51 = self.module_54(output_module_51)
        output_module_56 = self.module_56(output_module_44)
        output_module_51 = self.module_58(input=output_module_51, other=output_module_56, alpha=1)
        output_module_51 = self.module_59(output_module_51)
        output_module_60 = self.module_60(output_module_51)
        output_module_60 = self.module_62(output_module_60)
        output_module_60 = self.module_63(output_module_60)
        output_module_60 = self.module_65(input=output_module_60, other=output_module_51, alpha=1)
        output_module_60 = self.module_66(output_module_60)
        output_module_60 = self.module_67(output_module_60)
        output_module_60 = self.module_69(output_module_60)
        output_module_60 = self.module_70(output_module_60)
        output_module_60 = self.module_72(output_module_60)
        output_module_60 = self.module_73(output_module_60)
        output_module_60 = self.module_75(output_module_60)
        output_module_76 = self.module_76(output_module_60)
        output_module_76 = self.module_77(output_module_76)
        output_module_76 = self.module_78(output_module_76)
        output_module_79 = self.module_79(output_module_60)
        output_module_79 = self.module_80(output_module_79)
        output_module_79 = self.module_81(output_module_79)
        output_module_82 = self.module_82(output_module_60)
        output_module_82 = self.module_83(output_module_82)
        output_module_82 = self.module_84(output_module_82)
        output_module_85 = self.module_85(output_module_60)
        output_module_85 = self.module_86(output_module_85)
        output_module_85 = self.module_87(output_module_85)
        output_module_88 = self.module_88(output_module_60)
        output_module_88 = self.module_89(output_module_88)
        output_module_88 = self.module_90(output_module_88)
        return output_module_76, output_module_79, output_module_82, output_module_85, output_module_88
```

fanz-xlnx commented 2 years ago

Hi

Thanks for the question. I would not expect any difference between the DPU overlays on the KV260 and the ZCU102, so I assume the tutorial can be migrated to the KV260 without issues. Your problem seems to come down to the difference between the original xmodel and the one you compiled yourself. Please correct me if I have misunderstood.

Which Vitis AI version are you currently using? And which arch file did you apply during compilation?
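
For reference, one quick sanity check is whether the DPU fingerprint on the board matches the fingerprint in the arch.json used for compilation. A sketch, assuming the xdputil utility is available on the target (arch paths may differ between releases):

```shell
# On the board: print information about the DPU in the PL,
# including its fingerprint
xdputil query

# In the Vitis AI docker: the arch.json passed to the compiler
# carries a "fingerprint" field that must match the value above
cat /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json
```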

SoldierChen commented 2 years ago

I am using the latest Vitis AI. I can use the toolchain to compile the official example xmodel and run it on the KV260, and I am using the same arch as in the example. All I did was swap the official model for my own; nothing else was changed. And I did not encounter any errors during compilation.

SoldierChen commented 2 years ago

@fanz-xlnx Thanks for your assistance; the issue is not solved yet. I closed it by mistake.

fanz-xlnx commented 2 years ago

I am still a little confused about the issue you have met, so please correct me if I have it wrong:

You can get it to run without issues when you use the provided quantized model compiled with the VAI 2.0 toolchain, but you run into trouble when you use your own quantized model compiled with the same toolchain.

SoldierChen commented 2 years ago

@fanz-xlnx

Yes, your understanding is correct.

Also, I did not encounter any errors or warnings during quantization or compilation, so I do not know why it cannot run on the board.
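
One way I can try to narrow this down is to inspect how the compiler partitioned the xmodel, since (as far as I understand) overlay.load_model expects a single DPU subgraph. A minimal sketch with the xir Python API, using the same file name as above:

```python
import xir

# Deserialize the compiled model and list its subgraphs along with the
# device each one was assigned to; operators the DPU cannot execute end
# up in extra CPU subgraphs, which the PYNQ DPU overlay cannot handle.
graph = xir.Graph.deserialize("CNN_zcu102.xmodel")
root = graph.get_root_subgraph()
for sub in root.toposort_child_subgraph():
    device = sub.get_attr("device") if sub.has_attr("device") else "unknown"
    print(sub.get_name(), "->", device)
```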