Thank you very much for your answer @qianglin-xlnx

I already looked at the demo examples before submitting this issue, but all the implementations seem to assume that the model has only one DPU subgraph (except perhaps the pose detection example, but that model is the combination of two independent models, one for person detection and the other for pose inference, so the models are only fed real images rather than intermediate feature-map tensors). All the VART demo samples explicitly check for the existence of a single subgraph (cf. https://github.com/Xilinx/Vitis-AI/blob/ffdfd826394dd24c6f828733f5cfa99ab4153940/demo/VART/resnet50/src/main.cc#L283). The link you gave assumes the model has been tuned beforehand so that all of its operators fit on the DPU, is that correct?
I am aware of the limitations on each layer imposed by the DPU's intrinsic parameters (bank depth, etc.), but I was looking for a way to get the list of all non-conforming layers from a tool. The non-conformity is checked when running the compilation with the vai_c_xir command, so I thought there might be a way to display this information during compilation, through an environment variable, a debug flag, or something like that.

Can you tell me if anything close to this exists, or maybe a script to check a .xmodel against an architecture configuration?
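To make the request concrete: the check I have in mind would go further than the sketch below, which only reports how an already-compiled model was partitioned. It uses the xir Python bindings from the Vitis AI runtime (the same pattern the VART Python demos use to find DPU subgraphs); the file name is just an example.

import xir

# Report how a compiled .xmodel was partitioned between DPU and CPU.
graph = xir.Graph.deserialize("compiled_model.xmodel")  # example path
subgraphs = graph.get_root_subgraph().toposort_child_subgraph()

for sg in subgraphs:
    device = sg.get_attr("device") if sg.has_attr("device") else "unknown"
    print(f"{sg.get_name()}: {device}")

n_dpu = sum(1 for sg in subgraphs
            if sg.has_attr("device") and sg.get_attr("device").upper() == "DPU")
print(f"DPU subgraphs: {n_dpu}")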
Thanks in advance
Hi @Guriido

Regarding your question 2: not exactly. The Vitis AI Library also supports models with more than one DPU subgraph, such as the sp_net model. You can find the implementation code in the links below:
https://github.com/Xilinx/Vitis-AI/blob/ffdfd826394dd24c6f828733f5cfa99ab4153940/tools/Vitis-AI-Library/posedetect/src/posedetect_imp.cpp#L50
https://github.com/Xilinx/Vitis-AI/blob/master/tools/Vitis-AI-Library/platenum/src/platenum_imp.cpp#L126
However, we recommend compiling the model into a single DPU subgraph whenever possible and running the whole model on the DPU; that way you get the best performance from the model.
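As a rough sketch (not the library code itself), running a model with two DPU subgraphs from the VART Python API looks like the following. The model path, tensor shapes, and the cpu_glue function standing in for the ops left on the CPU are all placeholders.

import numpy as np
import vart
import xir

graph = xir.Graph.deserialize("model.xmodel")  # placeholder path
dpu_sgs = [sg for sg in graph.get_root_subgraph().toposort_child_subgraph()
           if sg.has_attr("device") and sg.get_attr("device").upper() == "DPU"]
runners = [vart.Runner.create_runner(sg, "run") for sg in dpu_sgs]

def cpu_glue(t):
    # Placeholder for the ops the compiler left on the CPU; in a real
    # application these must be implemented by hand, as in the
    # posedetect/platenum examples linked above.
    return t

def run_one(runner, data):
    # Allocate an output buffer matching the runner's output tensor,
    # then run one batch synchronously.
    out_tensor = runner.get_output_tensors()[0]
    out = np.empty(tuple(out_tensor.dims), dtype=np.float32, order="C")
    job_id = runner.execute_async([data], [out])
    runner.wait(job_id)
    return out

x = np.zeros(tuple(runners[0].get_input_tensors()[0].dims), dtype=np.float32)
x = run_one(runners[0], x)   # first DPU subgraph
x = cpu_glue(x)              # CPU section between the two DPU subgraphs
y = run_one(runners[1], x)   # second DPU subgraph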
Regarding your question 3: thank you for your advice. So far, the unsupported layers or OPs are only shown when you compile the model, as in the following output. However, there is no tool or script to check a model against an architecture configuration; maybe we need some kind of tool like that. I'm not sure.
[UNILOG][WARNING] xir::Op{name = PMGPMG_MaxPool2d_max2812, type = pool-fix} has been assigned to CPU: ["kernel_height(14) is not in DPU supported range [1, 2]].
[UNILOG][WARNING] xir::Op{name = PMGPMG_MaxPool2d_max3823, type = pool-fix} has been assigned to CPU: ["kernel_height(14) is not in DPU supported range [1, 2]].
[UNILOG][WARNING] xir::Op{name = PMGPMG_MaxPool2d_max4834, type = pool-fix} has been assigned to CPU: ["kernel_height(14) is not in DPU supported range [1, 2]].
[UNILOG][WARNING] xir::Op{name = PMGPMG_MaxPool2d_max1801, type = pool-fix} has been assigned to CPU: ["kernel_height(28) is not in DPU supported range [1, 2]].
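For reference, an op like the ones in the warnings above typically traces back to an ordinary pooling layer in the original model; a hypothetical PyTorch equivalent would be:

import torch
import torch.nn as nn

# Hypothetical layer behind a "kernel_height(14)" warning: a 14x14 max
# pool, whose kernel falls outside the range the DPU supports, so the
# compiler assigns the op to the CPU.
pool = nn.MaxPool2d(kernel_size=14)
y = pool(torch.randn(1, 64, 28, 28))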
@qianglin-xlnx
Thank you for the sample! It seems to require hand-crafting the missing operators with CPU functions, so as you said it is best to compile the whole model into one DPU subgraph.

You wrote: "So far, the unsupported layers or OPs are only shown when you compile the model, as in the following output. [UNILOG][WARNING] xir::Op{name = PMGPMG_MaxPool2d_max2812, type = pool-fix} has been assigned to CPU: ["kernel_height(14) is not in DPU supported range [1, 2]]."

This is exactly what I was looking for! But as you can see in the full log I shared in my first message, even though the number of DPU subgraphs after compilation is far greater than 1, no such warning appeared. How can I enable those messages when using vai_c_xir?
@Guriido That's very strange. Which docker image do you use, and could you share quantize_result/EfficientDet_int.xmodel for further debugging? Thank you very much.
@qianglin-xlnx I used a docker image built from source with the script on the master branch: https://github.com/Xilinx/Vitis-AI/blob/master/setup/docker/docker_build_gpu.sh

I just pulled the latest CPU version (docker pull xilinx/vitis-ai-cpu:latest) and tested the compilation in this environment, getting an identical result (multiple DPU subgraphs and no warnings).

I hosted my .xmodel file before compilation at this link.
@qianglin-xlnx After checking my PyTorch model, I noticed some convolutions were too big to fit on the DPU (due to the channel_parallel and bank_depth constraints), and fixing these reduced the number of generated DPU subgraphs from 20 to 4. I tried to check the other layer types as well, but could not find any other discrepancies with the spec of the ZU3 DPU.
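For anyone hitting the same problem, this is roughly how I went through the convolutions. It is only a sketch, and the limits below are placeholders that have to be replaced with the actual values derived from your DPU configuration (channel_parallel, bank_depth, etc.):

import torch.nn as nn

MAX_KERNEL = 8          # placeholder, not the real ZU3 limit
MAX_IN_CHANNELS = 2048  # placeholder, depends on channel_parallel/bank_depth

def flag_suspect_convs(model: nn.Module) -> None:
    # Walk the model and print Conv2d layers that exceed the assumed limits.
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            kh, kw = module.kernel_size
            if kh > MAX_KERNEL or kw > MAX_KERNEL or module.in_channels > MAX_IN_CHANNELS:
                print(f"{name}: kernel={module.kernel_size}, "
                      f"in_channels={module.in_channels}")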
I manually inspected the computation graph of the compiled model and found some subgraphs mapped to the CPU (I guess that in the output of the xir svg <model> <.svg> command the CPU parts are the ones in red?) that have no equivalent in the original implementation and seem to correspond to a dummy operation. It is composed of fix2float -> add (with some scalar constant) -> float2fix; how could I get rid of it?
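For reference, I located these CPU subgraphs with a small script along these lines (a sketch: I am assuming the Python bindings expose Subgraph.get_ops() under the same name as the C++ xir API):

import xir

graph = xir.Graph.deserialize("compiled_model.xmodel")  # example path
for sg in graph.get_root_subgraph().toposort_child_subgraph():
    if sg.has_attr("device") and sg.get_attr("device").upper() == "CPU":
        # get_ops() mirrors xir::Subgraph::get_ops() from the C++ API;
        # treat its availability in Python as an assumption.
        op_types = [op.get_type() for op in sg.get_ops()]
        print(sg.get_name(), op_types)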
I searched the issues of this repository and the forums, but could not find an answer.
Thank you in advance for your help
Hi @Guriido I've already reported this internally. What we have found so far is that the information about unsupported ops is not printed at compile time because these ops are not quantized. We will look into this issue further.
@qianglin-xlnx Thank you for looking into it! I don't know if this information is of any use, but during quantization (calib) and quantization (test)/deploy there were no errors or warnings either.
Hi all,
I have the same issue as @Guriido. I checked the output graph after compiling: two subgraphs are not executed by the DPU, as shown below, so my graph is divided into 4 DPU subgraphs. Writing an application for that is not easy for me.
fix2float -> resize -> float2fix -> other subgraph
                ^
                |
    const --> stack
I am looking forward to a solution.
Thank you!
@quyetvk I am curious about your graph: did you have an Interpolate or similar function at this place in your PyTorch model? Have you checked the shapes of the input and output of the subgraph you showed (at the fix2float and float2fix operators)?

If the answer to the first question is yes, it may be that your resize function does not comply with the restrictions for DPU conversion (check the docs here, at the bottom of page 101).
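If that is indeed the cause, the usual fix is to rewrite the call so it falls inside the supported configurations; a hypothetical before/after in PyTorch (the actual supported modes and scale factors are in the table from the docs, so treat the exact values here as placeholders):

import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 50, 50)

# Before: a resize the compiler may refuse (placeholder example with an
# arbitrary output size and bilinear mode).
y_before = F.interpolate(x, size=(100, 75), mode="bilinear", align_corners=True)

# After: an integer scale factor and nearest-neighbour mode. These values
# are placeholders; check the supported-operator table for your DPU
# before changing the model.
y_after = F.interpolate(x, scale_factor=2, mode="nearest")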
@Guriido Yes, we have an Interpolate. I put my subgraph here:

I think you are right. Maybe the resize function is not compliant with the restrictions. Thank you for your reference.
Hi @quyetvk Has this issue been solved?
Hi @quyetvk Since we haven't received your reply for a long time, we assume you have solved this issue, so I'm going to close it. If you still have any questions, please feel free to reopen it. Thank you very much.
Hello, I'm trying to use the Vitis-AI software to run a PyTorch model on a DPU. I successfully generated the quantized model, deployed it to a xxx_int.xmodel file, and then tried to compile it with the vai_c_xir command as specified in the documentation. The compilation finished without any error message or warning, and the three files (md5sum.txt, meta.json, compiled_model.xmodel) were generated, but the number of DPU subgraphs was 20, so I cannot execute the model directly on my DPU. The output of the compilation is the following:
I tried to compile for the ZCU102/104 architecture and for the Ultra96 architecture, with the same number of DPU subgraphs as the result. (The development environment is the latest vitis-ai docker image, using Vitis AI 1.3.)
I have several questions concerning this:
1) Is the reason the compiled model has so many DPU subgraphs that it contains operations not supported by the DPU?
2) Is there a simple way, using the VART Python API, to run inference of the whole model from the input image (without having to engineer the input and output of each subgraph by hand)? I am thinking of something along the lines of https://github.com/Xilinx/Vitis-AI/blob/master/demo/VART/resnet50_mt_py/resnet50.py
3) (Supposing the answer to 1) is yes) Is there a way to know why the layers were not supported by the DPU during compilation, something like a verbose mode or a compilation log? There is a mention of [UNILOG][INFO] The compiler log will be dumped at ..., but I checked at that location and the supposed log is just an empty folder.

Thank you for your consideration