Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

Custom model has several outputs, but VAI_C_XIR compiler does not generate all of them #809

Closed daniperfer closed 2 years ago

daniperfer commented 2 years ago

Hi:

I'm running a model through the Vitis-AI flow. I need to postprocess the outputs of 4 intermediate layers of the model as well as the final outputs. My PyTorch model is designed this way, and the quantized model provides all of these outputs (the intermediate and the final ones). However, when I compile the model with vai_c_xir, only the final outputs appear as outputs in the compiled graph.

vai_c_xir --xmodel ./quantization_results/RetinaNet_mod_0_int.xmodel --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json --net_name RetinaNet_mod_resnet18_zcu102 --output_dir compiled_model


* VITIS_AI Compilation - Xilinx Inc.
**************************************************
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B4096_MAX_BG2
[UNILOG][INFO] Graph name: RetinaNet_mod_0, with op num: 589
[UNILOG][INFO] Begin to compile...
[UNILOG][WARNING] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__10981, type = concat-fix} has been assigned to CPU: [Input xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__10588, type = reshape-fix} is not in DPU subgraph. And output dimension is not 4.].
[UNILOG][WARNING] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__10458, type = concat-fix} has been assigned to CPU: [Input xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__10065, type = reshape-fix} is not in DPU subgraph. And output dimension is not 4.].
[UNILOG][WARNING] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__10981, type = concat-fix} has been assigned to CPU: [Input xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__10588, type = reshape-fix} is not in DPU subgraph. And output dimension is not 4.].
[UNILOG][WARNING] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__10458, type = concat-fix} has been assigned to CPU: [Input xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__10065, type = reshape-fix} is not in DPU subgraph. And output dimension is not 4.].
[UNILOG][INFO] Total device subgraph number 4, DPU subgraph number 1
[UNILOG][INFO] Compile done.
[UNILOG][INFO] The meta json is saved to "/workspace/first-steps/torchvision/compiled_model/meta.json"
[UNILOG][INFO] The compiled xmodel is saved to "/workspace/first-steps/torchvision/compiled_model/RetinaNet_mod_resnet18_zcu102.xmodel"
[UNILOG][INFO] The compiled xmodel's md5sum is e9e55cd7d611d4bdb83abdd379405cfa, and has been saved to "/workspace/first-steps/torchvision/compiled_model/md5sum.txt"

I've noticed that vai_c_xir accepts an output_ops option (passed through --options) that can be used to specify the outputs of the compiled model. So I ran vai_c_xir again on the quantized model, this time listing the intermediate outputs, as well as the final outputs, as outputs of the compiled model.

vai_c_xir --xmodel ./quantization_results/RetinaNet_mod_0_int.xmodel --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json --net_name RetinaNet_mod_resnet18_zcu102 --output_dir compiled_model --options '{"output_ops": "RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__LastLevelMaxPool_extra_blocks__input_86,RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_2__input_77,RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_0__input_59,RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_1__input_68,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10817,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10294,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10947,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10424,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10687,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10164,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10557,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10034"}'

But if I do that, I get several warnings ("assigning it as an output op will change the graph structure which may degrade performance") and the DPU subgraph number increases from 1 to 9, which is not optimal for deploying the compiled model.


* VITIS_AI Compilation - Xilinx Inc.
**************************************************
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10034, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10557, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10294, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10817, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_1__input_68, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10164, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10687, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_0__input_59, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_2__input_77, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__LastLevelMaxPool_extra_blocks__input_86, type = maxpool2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10424, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10947, type = conv2d} has been assigned to be an output op.
[UNILOG][WARNING] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_1__input_68, type = conv2d} is not a "fix" op, and assigning it as an output op will change the graph structure which may degrade performance.
[UNILOG][WARNING] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_0__input_59, type = conv2d} is not a "fix" op, and assigning it as an output op will change the graph structure which may degrade performance.
[UNILOG][WARNING] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_2__input_77, type = conv2d} is not a "fix" op, and assigning it as an output op will change the graph structure which may degrade performance.
[UNILOG][WARNING] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__LastLevelMaxPool_extra_blocks__input_86, type = maxpool2d} is not a "fix" op, and assigning it as an output op will change the graph structure which may degrade performance.
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B4096_MAX_BG2
[UNILOG][INFO] Graph name: RetinaNet_mod_0, with op num: 477
[UNILOG][INFO] Begin to compile...
[UNILOG][INFO] Total device subgraph number 21, DPU subgraph number 9
[UNILOG][INFO] Compile done.
[UNILOG][INFO] The meta json is saved to "/workspace/andante/first-steps/torchvision/compiled_model/meta.json"
[UNILOG][INFO] The compiled xmodel is saved to "/workspace/andante/first-steps/torchvision/compiled_model/RetinaNet_mod_resnet18_zcu102.xmodel"
[UNILOG][INFO] The compiled xmodel's md5sum is 9ae732c11ab3a6020c68611959d797d6, and has been saved to "/workspace/andante/first-steps/torchvision/compiled_model/md5sum.txt"

Do you know if there is some way of getting intermediate layers as outputs without getting these warnings and without increasing the DPU subgraph number?
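
For anyone comparing runs like the ones above, here is a minimal sketch that pulls the subgraph counts and the ops flagged as not being "fix" ops out of a captured vai_c_xir log. It assumes the log has been saved as text and that its lines have the same wording as the output pasted in this thread; the function name summarize_compile_log and the short op name in the usage note are hypothetical.

```python
import re

# Patterns copied from the wording of the vai_c_xir output shown in this thread.
SUBGRAPH_RE = re.compile(
    r"Total device subgraph number (\d+), DPU subgraph number (\d+)"
)
NOT_FIX_RE = re.compile(
    r'\[UNILOG\]\[WARNING\] xir::Op\{name = ([^,\s]+), type = ([^}\s]+)\} '
    r'is not a "fix" op'
)


def summarize_compile_log(log_text):
    """Return the subgraph counts and the (name, type) pairs of ops that
    triggered the 'is not a "fix" op' warning. Hypothetical helper."""
    match = SUBGRAPH_RE.search(log_text)
    device = int(match.group(1)) if match else None
    dpu = int(match.group(2)) if match else None
    return {
        "device_subgraphs": device,
        "dpu_subgraphs": dpu,
        "non_fix_outputs": NOT_FIX_RE.findall(log_text),
    }
```

Calling summarize_compile_log on the second log above should report 21 device subgraphs, 9 DPU subgraphs, and four flagged ops (three conv2d and one maxpool2d), which makes it easy to tell whether a change to the output list split the DPU subgraph.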

daniperfer commented 2 years ago

UPDATE: I finally appended the _fix suffix to the names of the intermediate layers that I want to use as outputs. With that change, the output of the vai_c_xir command no longer shows any warnings, and the DPU subgraph number is 1:

vai_c_xir --xmodel ./quantization_results/RetinaNet_mod_0_int.xmodel --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json --net_name RetinaNet_mod_resnet18_zcu102 --output_dir compiled_model --options '{"output_ops": "RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__LastLevelMaxPool_extra_blocks__input_86_fix,RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_2__input_77_fix,RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_0__input_59_fix,RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_1__input_68_fix,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10817,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10294,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10947,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10424,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10687,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10164,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10557,RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10034"}'


* VITIS_AI Compilation - Xilinx Inc.
**************************************************
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10034, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10557, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10294, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10817, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_1__input_68_fix, type = fix} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10164, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10687, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_0__input_59_fix, type = fix} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__Conv2d_layer_blocks__ModuleList_2__input_77_fix, type = fix} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_BackboneWithFPN_backbone__FeaturePyramidNetwork_fpn__LastLevelMaxPool_extra_blocks__input_86_fix, type = fix} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetClassificationHead_mod_classification_head__Conv2d_cls_logits__10424, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] xir::Op{name = RetinaNet_mod__RetinaNet_mod_RetinaNetHead_mod_head__RetinaNetRegressionHead_mod_regression_head__Conv2d_bbox_reg__10947, type = conv2d} has been assigned to be an output op.
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B4096_MAX_BG2
[UNILOG][INFO] Graph name: RetinaNet_mod_0, with op num: 477
[UNILOG][INFO] Begin to compile...
[UNILOG][INFO] Total device subgraph number 14, DPU subgraph number 1
[UNILOG][INFO] Compile done.
[UNILOG][INFO] The meta json is saved to "/workspace/first-steps/torchvision/compiled_model/meta.json"
[UNILOG][INFO] The compiled xmodel is saved to "/workspace/first-steps/torchvision/compiled_model/RetinaNet_mod_resnet18_zcu102.xmodel"
[UNILOG][INFO] The compiled xmodel's md5sum is 2106c40ee342b278e56f5820c324cfe7, and has been saved to "/workspace/first-steps/torchvision/compiled_model/md5sum.txt"
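
Since the output_ops string is long and easy to mistype, here is a minimal sketch of how the --options value used above could be generated instead of written by hand. It only uses the flags that appear in the commands in this thread (--xmodel, --arch, --net_name, --output_dir, --options); the helper names and the short op names in the usage note are hypothetical. The sketch assumes that a matching op with the _fix suffix actually exists in the quantized graph for every intermediate layer, which it does not verify.

```python
import json
import shlex


def build_output_ops_option(final_ops, intermediate_ops, suffix="_fix"):
    """Build the JSON string passed to --options (hypothetical helper).

    Intermediate op names get the "_fix" suffix appended, as described in
    the update above; final output names are passed through unchanged.
    """
    names = [
        name if name.endswith(suffix) else name + suffix
        for name in intermediate_ops
    ]
    names.extend(final_ops)
    return json.dumps({"output_ops": ",".join(names)})


def build_vai_c_xir_command(xmodel, arch, net_name, output_dir, options_json):
    """Return a shell-quoted vai_c_xir command line as a single string."""
    return shlex.join([
        "vai_c_xir",
        "--xmodel", xmodel,
        "--arch", arch,
        "--net_name", net_name,
        "--output_dir", output_dir,
        "--options", options_json,
    ])
```

For example, build_output_ops_option(["head__10817"], ["fpn__input_86"]) yields {"output_ops": "fpn__input_86_fix,head__10817"}, and build_vai_c_xir_command wraps that value in single quotes so it can be pasted into a shell. After compiling, it is still worth checking the log for warnings and for the DPU subgraph number.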