Hi! I have seen in the user guide that the DPU works with the BHWC format, but I trained the network with the BCHW tensor order. Is this wrong?
Every model I've tried to quantize fails the `verify_xmodel` check performed at the end, reporting that the pytorch_nndct model's output shape does not match the XIR output shape.
I even tried the example resnet18_quant files and the error persists.
This is a huge bug, since PyTorch is only B, C, H, W and XIR is only B, H, W, C! Note that I've only attempted quantizing and compiling on the CPU.
I did a bit of digging, and this does not occur in 1.4.1.978 (git hash 9f3d6db). It seems to come from the nndct_shared deploy_optimizer: in 2.x, many optimizations are run over the nndct_graph, and in particular there is a function called `layout_transform` which does NOT change the format.
This is in stark contrast to the 1.4.1 version of `get_deploy_graph_list`, which calls a completely different set of graph optimizations.
So the solution to my problem was running data through the quantized_model during the xmodel deployment stage as well as during the calibration phase... Some documentation and less finicky tools would be a great addition.
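For reference, here's a minimal sketch of that two-pass flow, modeled on the resnet18_quant example (`model` and `sample_batch` are placeholders for your float model and a representative input batch in B, C, H, W):

```python
import torch
from pytorch_nndct.apis import torch_quantizer

rand_in = torch.randn(1, 3, 224, 224)  # dummy input in PyTorch-native B, C, H, W

# calibration phase: forward data through the quantized model
quantizer = torch_quantizer("calib", model, (rand_in,))
quant_model = quantizer.quant_model
quant_model(sample_batch)
quantizer.export_quant_config()

# deployment phase: forward data again BEFORE exporting the xmodel
quantizer = torch_quantizer("test", model, (rand_in,))
quant_model = quantizer.quant_model
quant_model(sample_batch)
quantizer.export_xmodel(deploy_check=True)
```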
@AdrFebles are you reshaping the data to B, H, W, C before you pass it to the VART code executed in runCNNautoenc, or are you keeping it in the PyTorch-native B, C, H, W? I believe you need to do everything during the quantization and compilation stages with data in B, C, H, W, but then use B, H, W, C during deployment with the final xmodel.
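For example, a minimal layout conversion before handing the batch to VART might look like this (variable names are illustrative):

```python
import numpy as np

# batch is a PyTorch tensor in B, C, H, W, e.g. shape (1, 1, 4, 6)
x_nchw = batch.numpy()
# move channels last for the DPU: B, C, H, W -> B, H, W, C, e.g. (1, 4, 6, 1)
x_nhwc = np.ascontiguousarray(x_nchw.transpose(0, 2, 3, 1))
```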
Hi @michael-person! Yes, I followed the steps of this tutorial, which helped me understand how to deploy the model on the board: https://github.com/Xilinx/Vitis-AI-Tutorials/tree/1.4/Design_Tutorials/11-tf2_var_autoenc They train and quantize the model in (B, C, H, W) format, but in the runtime app it's necessary to reshape the input to match the input dimensions reported by the DPU runner.
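In case it helps, that reshape usually queries the runner's expected input shape, along these lines (a sketch; `runner` and `x` are placeholders for the app's DPU runner and the input batch):

```python
import numpy as np

# ask the DPU runner what shape it expects, e.g. (1, 4, 6, 1) in B, H, W, C
input_tensor = runner.get_input_tensors()[0]
shape = tuple(input_tensor.dims)
input_data = np.asarray(x, dtype=np.float32).reshape(shape)
```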
Hmm, I'm wondering if the increase in loss you're seeing when moving from the float version to the quantized version has to do with an output shape/data type issue rather than a data layout issue. If you're reshaping the input for the VART inference call, then I think you're OK.
Can you verify that the shapes of `out` and `img` are the same? Also, you may need to scale `out` by the `output_scale` value; you can see how it's used in the `CPUCalcSoftmax` method here.
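For reference, the VART examples derive `output_scale` from the output tensor's fixed-point position, roughly like this (a sketch, assuming `runner` is the DPU runner and `out` is the raw fixed-point output buffer):

```python
import numpy as np

# the DPU output is fixed-point; recover float values using the scale
# derived from the tensor's "fix_point" attribute
output_tensor = runner.get_output_tensors()[0]
fix_point = output_tensor.get_attr("fix_point")
output_scale = 1.0 / (2 ** fix_point)
out_float = out.astype(np.float32) * output_scale
```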
Thank you @michael-person
Closing since no activity for more than 3 months, please open a new issue if you still have any questions, thanks.
Hello! I've trained a model in PyTorch with tensor dimensions (1, 1, 4, 6), where the order is (Batch, Channels, Height, Width). When I quantize the model, I generate a random input with these dimensions and define the quantizer as follows:
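(A minimal sketch of such a quantizer definition; the model name here is a placeholder:)

```python
import torch
from pytorch_nndct.apis import torch_quantizer

model = MyModel()  # placeholder for the trained float model
rand_in = torch.randn(1, 1, 4, 6)  # (Batch, Channels, Height, Width)
quantizer = torch_quantizer("calib", model, (rand_in,))
quant_model = quantizer.quant_model
```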
When I test the quantized model there is no problem with it, but when I compile the model and inspect the generated subgraph PNG file, I see that the order of the tensors has been switched: the compiled model expects an input of size (1, 4, 6, 1) with order (Batch, Height, Width, Channels), which produces abnormal behavior when the model runs on the ZCU102 board: a loss of 163, whereas the float model has a loss of 0.001.
I have the runtime class defined as follows:
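(The original snippet isn't shown here; a typical VART runner setup looks roughly like the following sketch, where the xmodel path and buffer dtype are assumptions:)

```python
import numpy as np
import vart
import xir

# load the compiled model and pick out the DPU subgraph (path is a placeholder)
graph = xir.Graph.deserialize("CNNautoenc.xmodel")
subgraphs = graph.get_root_subgraph().toposort_child_subgraph()
dpu_subgraph = [s for s in subgraphs
                if s.has_attr("device") and s.get_attr("device").upper() == "DPU"][0]
runner = vart.Runner.create_runner(dpu_subgraph, "run")

# allocate buffers matching the runner's tensor shapes (B, H, W, C on the DPU)
in_dims = tuple(runner.get_input_tensors()[0].dims)
out_dims = tuple(runner.get_output_tensors()[0].dims)
input_data = np.empty(in_dims, dtype=np.float32)   # dtype may need to match the tensor
output_data = np.empty(out_dims, dtype=np.float32)

# run one inference
job_id = runner.execute_async([input_data], [output_data])
runner.wait(job_id)
```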
Could you please help me understand this behavior?
Thanks