facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0
30k stars 7.41k forks

Panoptic Segmentation Can Not be Exported to ONNX #4354

Open ghost opened 2 years ago

ghost commented 2 years ago

Instructions To Reproduce the 🐛 Bug:

  1. Full runnable code or full changes you made:

    I use the original repo and do not change anything.
  2. What exact command do you run:

    
    export DETECTRON2_DATASETS=/data/datasets/

    python3 export_model.py \
        --config-file ../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml \
        --output /data/output/ \
        --export-method caffe2_tracing \
        --format onnx \
        MODEL.WEIGHTS /data/model/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/model_final_dbfeb4.pkl \
        MODEL.DEVICE cpu


I have already prepared the COCO dataset in `/data/datasets/`.

3. __Full logs__ or other relevant observations:
[06/23 06:22:33 detectron2]: Command line arguments: Namespace(format='onnx', export_method='caffe2_tracing', config_file='../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml', sample_image=None, run_eval=False, output='/data/output/', opts=['MODEL.WEIGHTS', '/data/model/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/model_final_dbfeb4.pkl', 'MODEL.DEVICE', 'cpu'])
[W init.cpp:759] Warning: Use _jit_set_fusion_strategy, bailout depth is deprecated. Setting to (STATIC, 1) (function operator())
[06/23 06:22:35 d2.data.datasets.coco]: Loaded 5000 images in COCO format from /data/datasets/coco/annotations/instances_val2017.json
[06/23 06:22:35 d2.data.datasets.coco]: Loaded 5000 images with semantic segmentation from /data/datasets/coco/val2017
[06/23 06:22:35 d2.data.build]: Distribution of instances among all 80 categories:
category #instances category #instances category #instances
person 10777 bicycle 314 car 1918
motorcycle 367 airplane 143 bus 283
train 190 truck 414 boat 424
traffic light 634 fire hydrant 101 stop sign 75
parking meter 60 bench 411 bird 427
cat 202 dog 218 horse 272
sheep 354 cow 372 elephant 252
bear 71 zebra 266 giraffe 232
backpack 371 umbrella 407 handbag 540
tie 252 suitcase 299 frisbee 115
skis 241 snowboard 69 sports ball 260
kite 327 baseball bat 145 baseball gl.. 148
skateboard 179 surfboard 267 tennis racket 225
bottle 1013 wine glass 341 cup 895
fork 215 knife 325 spoon 253
bowl 623 banana 370 apple 236
sandwich 177 orange 285 broccoli 312
carrot 365 hot dog 125 pizza 284
donut 328 cake 310 chair 1771
couch 261 potted plant 342 bed 163
dining table 695 toilet 179 tv 288
laptop 231 mouse 106 remote 283
keyboard 153 cell phone 262 microwave 55
oven 143 toaster 9 sink 225
refrigerator 126 book 1129 clock 267
vase 274 scissors 36 teddy bear 190
hair drier 11 toothbrush 57
total 36335

[06/23 06:22:35 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[06/23 06:22:35 d2.data.common]: Serializing 5000 elements to byte tensors and concatenating them all ...
[06/23 06:22:35 d2.data.common]: Serialized dataset takes 19.52 MiB
Traceback (most recent call last):
  File "/data/detectron2/tools/deploy/export_model.py", line 217, in <module>
    exported_model = export_caffe2_tracing(cfg, torch_model, sample_inputs)
  File "/data/detectron2/tools/deploy/export_model.py", line 36, in export_caffe2_tracing
    from detectron2.export import Caffe2Tracer
ImportError: cannot import name 'Caffe2Tracer' from 'detectron2.export' (/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/export/__init__.py)


4. please simplify the steps as much as possible so they do not require additional resources to
   run, such as a private dataset.

## Expected behavior:

I expected that the official code [export_model.py](https://github.com/facebookresearch/detectron2/blob/main/tools/deploy/export_model.py) would let me export an ONNX model.

## Environment:

Provide your environment information using the following command:

sys.platform            linux
Python                  3.10.4 (main, Apr 2 2022, 09:04:19) [GCC 11.2.0]
numpy                   1.22.4
detectron2              0.6 @/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2
Compiler                GCC 11.2
CUDA compiler           not available
DETECTRON2_ENV_MODULE
PyTorch                 1.11.0+cpu @/usr/local/lib/python3.10/dist-packages/torch
PyTorch debug build     False
GPU available           No: torch.cuda.is_available() == False
Pillow                  9.1.1
torchvision             0.12.0+cpu @/usr/local/lib/python3.10/dist-packages/torchvision
fvcore                  0.1.5.post20220512
iopath                  0.1.9
cv2                     4.6.0


PyTorch built with:

If your issue looks like an installation issue / environment issue, please first try to solve it yourself with the instructions in https://detectron2.readthedocs.io/tutorials/install.html#common-installation-issues

Finally, thank you for your efforts in contributing such an excellent repo!

ghost commented 2 years ago

I have also tried the tracing method, after the caffe2_tracing method above.

Instructions To Reproduce the 🐛 Bug:

  1. Full runnable code or full changes you made:

    I use the original repo and do not change anything.
  2. What exact command do you run:

    
    export DETECTRON2_DATASETS=/data/datasets/

    python3 export_model.py \
        --config-file ../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml \
        --output /data/output/ \
        --export-method tracing \
        --format onnx \
        MODEL.WEIGHTS /data/model/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/model_final_dbfeb4.pkl \
        MODEL.DEVICE cpu


I have already prepared the COCO dataset in `/data/datasets/`.

4. __Full logs__ or other relevant observations:
[06/23 06:47:50 detectron2]: Command line arguments: Namespace(format='onnx', export_method='tracing', config_file='../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml', sample_image=None, run_eval=False, output='/data/output/', opts=['MODEL.WEIGHTS', '/data/model/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/model_final_dbfeb4.pkl', 'MODEL.DEVICE', 'cpu'])
[W init.cpp:759] Warning: Use _jit_set_fusion_strategy, bailout depth is deprecated. Setting to (STATIC, 1) (function operator())
[06/23 06:47:52 d2.data.datasets.coco]: Loaded 5000 images in COCO format from /data/datasets/coco/annotations/instances_val2017.json
[06/23 06:47:52 d2.data.datasets.coco]: Loaded 5000 images with semantic segmentation from /data/datasets/coco/val2017
[06/23 06:47:52 d2.data.build]: Distribution of instances among all 80 categories:
category #instances category #instances category #instances
person 10777 bicycle 314 car 1918
motorcycle 367 airplane 143 bus 283
train 190 truck 414 boat 424
traffic light 634 fire hydrant 101 stop sign 75
parking meter 60 bench 411 bird 427
cat 202 dog 218 horse 272
sheep 354 cow 372 elephant 252
bear 71 zebra 266 giraffe 232
backpack 371 umbrella 407 handbag 540
tie 252 suitcase 299 frisbee 115
skis 241 snowboard 69 sports ball 260
kite 327 baseball bat 145 baseball gl.. 148
skateboard 179 surfboard 267 tennis racket 225
bottle 1013 wine glass 341 cup 895
fork 215 knife 325 spoon 253
bowl 623 banana 370 apple 236
sandwich 177 orange 285 broccoli 312
carrot 365 hot dog 125 pizza 284
donut 328 cake 310 chair 1771
couch 261 potted plant 342 bed 163
dining table 695 toilet 179 tv 288
laptop 231 mouse 106 remote 283
keyboard 153 cell phone 262 microwave 55
oven 143 toaster 9 sink 225
refrigerator 126 book 1129 clock 267
vase 274 scissors 36 teddy bear 190
hair drier 11 toothbrush 57
total 36335

[06/23 06:47:52 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')] [06/23 06:47:52 d2.data.common]: Serializing 5000 elements to byte tensors and concatenating them all ... [06/23 06:47:53 d2.data.common]: Serialized dataset takes 19.52 MiB /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/image_list.py:85: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert t.shape[:-2] == tensors[0].shape[:-2], t.shape /usr/local/lib/python3.10/dist-packages/torch/nn/functional.py:2498: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(input.size()[2:])) /usr/local/lib/python3.10/dist-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2228.) return _VF.meshgrid(tensors, kwargs) # type: ignore[attr-defined] /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/boxes.py:155: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. 
This means that the trace might not generalize to other inputs! assert tensor.dim() == 2 and tensor.size(-1) == 4, tensor.size() /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/proposal_generator/proposal_utils.py:106: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! if not valid_mask.all(): /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/boxes.py:191: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert torch.isfinite(self.tensor).all(), "Box tensor contains infinite or NaN!" /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/boxes.py:192: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results). h, w = box_size /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/layers/nms.py:15: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert boxes.shape[-1] == 4 /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/instances.py:74: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. 
Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results. data_len = len(value) /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/poolers.py:231: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert len(box_lists) == x[0].size( /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/layers/roi_align.py:55: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert rois.dim() == 2 and rois.size(1) == 5 /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/roi_heads/fast_rcnn.py:138: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! if not valid_mask.all(): /usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/roi_heads/fast_rcnn.py:143: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). 
  num_bbox_reg_classes = boxes.shape[1] // 4
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/roi_heads/fast_rcnn.py:155: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if num_bbox_reg_classes == 1:
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/roi_heads/mask_head.py:139: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if cls_agnostic_mask:
Traceback (most recent call last):
  File "/data/detectron2/tools/deploy/export_model.py", line 221, in <module>
    exported_model = export_tracing(torch_model, sample_inputs)
  File "/data/detectron2/tools/deploy/export_model.py", line 127, in export_tracing
    torch.onnx.export(traceable_model, (image,), f, opset_version=11)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/__init__.py", line 305, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 118, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 719, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 503, in _model_to_graph
    graph = _optimize_graph(graph, operator_export_type,
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 232, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/__init__.py", line 354, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 1061, in _run_symbolic_function
    return symbolic_fn(g, *inputs, **attrs)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/symbolic_opset9.py", line 2065, in to
    return g.op("Cast", self, to_i=sym_help.cast_pytorch_to_onnx[dtype])
KeyError: 'UNKNOWN_SCALAR'


5. please simplify the steps as much as possible so they do not require additional resources to
   run, such as a private dataset.

## Expected behavior:

I expected that the official code [export_model.py](https://github.com/facebookresearch/detectron2/blob/main/tools/deploy/export_model.py) would let me export an ONNX model.

## Environment:

Provide your environment information using the following command:

(same as the above)



If your issue looks like an installation issue / environment issue,

please first try to solve it yourself with the instructions in

https://detectron2.readthedocs.io/tutorials/install.html#common-installation-issues
shining-love commented 2 years ago

I met the same error! Have you solved it?

FrancescoMandru commented 2 years ago

There are some pull requests that will be merged in the next few days to solve these issues.

ghost commented 2 years ago

> I met the same error! have you solved it?

No, it has not been solved.

ghost commented 2 years ago

> There are some pull requests that will be merged in the next few days to solve these issues.

Thanks a lot!

I will close this issue after it is solved, and I am focusing on this problem.

azhurkevich commented 2 years ago

We have a couple of projects of our own that are broken because of this. It would be nice to know when we can pip install 'git+https://github.com/facebookresearch/detectron2.git' and not have this issue.

azhurkevich commented 2 years ago

Not a solution. Steps:

pip install torch==1.12.0 torchvision --extra-index-url https://download.pytorch.org/whl/cu116 \
&& pip install 'git+https://github.com/facebookresearch/fvcore' \
&& pip install git+https://github.com/facebookresearch/detectron2.git 

Still getting:

ImportError: cannot import name 'Caffe2Tracer' from 'detectron2.export' (/usr/local/lib/python3.8/dist-packages/detectron2/export/__init__.py)
VladMVLX commented 2 years ago

I'm having the same problem with the PointRend model. Any solution?

azhurkevich commented 2 years ago

Maybe @ppwwyyxx can give us some tips. This also doesn't work for the Detectron2 Mask R-CNN R50-FPN 3x model in the TensorRT ONNX export.

anshudaur commented 2 years ago

I am having the same problem with Detic:

python3.9/site-packages/torch/onnx/symbolic_opset9.py", line 2065, in to
    return g.op("Cast", self, to_i=sym_help.cast_pytorch_to_onnx[dtype])
KeyError: 'UNKNOWN_SCALAR'

I tried the following: STABLE_ONNX_OPSET_VERSION = 11, 12, 13, 14

These are the warnings that I get when exporting the Detic model:

[08/02 12:46:11 detectron2]: Command line arguments: Namespace(format='onnx', export_method='tracing', config_file='configs/Detic_LbaseI_CLIP_SwinB_896b32_4x_ft4x_max-size.yaml', sample_image=None, run_eval=False, output='models/', opts=['MODEL.WEIGHTS', 'models/Detic_LbaseI_CLIP_R5021k_640b64_4x_ft4x_max-size.pkl', 'MODEL.DEVICE', 'cpu'])
[W init.cpp:759] Warning: Use _jit_set_fusion_strategy, bailout depth is deprecated. Setting to (STATIC, 1) (function operator())
WARNING [08/02 12:46:14 d2.checkpoint.c2_model_loading]: Shape of backbone.fpn_lateral3.weight in checkpoint is torch.Size([256, 512, 1, 1]), while shape of backbone.fpn_lateral3.weight in model is torch.Size([256, 256, 1, 1]).
WARNING [08/02 12:46:14 d2.checkpoint.c2_model_loading]: backbone.fpn_lateral3.weight will not be loaded. Please double check and see if this is desired.
WARNING [08/02 12:46:14 d2.checkpoint.c2_model_loading]: Shape of backbone.fpn_lateral4.weight in checkpoint is torch.Size([256, 1024, 1, 1]), while shape of backbone.fpn_lateral4.weight in model is torch.Size([256, 512, 1, 1]).
WARNING [08/02 12:46:14 d2.checkpoint.c2_model_loading]: backbone.fpn_lateral4.weight will not be loaded. Please double check and see if this is desired.
WARNING [08/02 12:46:14 d2.checkpoint.c2_model_loading]: Shape of backbone.fpn_lateral5.weight in checkpoint is torch.Size([256, 2048, 1, 1]), while shape of backbone.fpn_lateral5.weight in model is torch.Size([256, 1024, 1, 1]).
WARNING [08/02 12:46:14 d2.checkpoint.c2_model_loading]: backbone.fpn_lateral5.weight will not be loaded. Please double check and see if this is desired.

frankvp11 commented 2 years ago

any updates on this? I'm still facing this problem and I am wondering when this will be fixed?

ghost commented 2 years ago

> any updates on this? I'm still facing this problem and I am wondering when this will be fixed?

No, it has not been fixed.

shining-love commented 2 years ago

> any updates on this? I'm still facing this problem and I am wondering when this will be fixed?

It will never be fixed! Obviously, the repo is going to die! No one will provide any help!

frankvp11 commented 2 years ago

I've tried using an older commit of detectron2, but that doesn't seem to work either. Does anyone have any ideas on how to fix this?

suhanshaikh33 commented 2 years ago

@ppwwyyxx, could you please help with this? The export to ONNX is failing. I tried modifying it to use a dummy input of shape 1,3,224,224, but it still errors.

frankvp11 commented 2 years ago

Any updates yet? How's the progress coming on those PRs, @FrancescoMandru?

FrancescoMandru commented 2 years ago

> Any updates yet? How's the progress coming on those PRs, @FrancescoMandru?

I'm not a Meta employee. I contributed to a general ONNX pull request for detectron2 in https://github.com/facebookresearch/detectron2/pull/4291

frankvp11 commented 2 years ago

Oh, I didn't know that. I asked because I'm still dealing with bugs and whatnot. I'm now downgrading to the v0.6 release instead of the newest version to see how that goes.

vincedupuis commented 2 years ago

Hello,

I've compiled and installed PyTorch 1.12.1 from source with the BUILD_CAFFE2=1 flag. I also needed to install version 1.8 of onnx rather than the latest one, with pip install onnx==1.8. After that, the error "ImportError: cannot import name 'Caffe2Tracer'" is gone, but I got another error:

Traceback (most recent call last):
  File "/home/ubuntu/detectron2/tools/deploy/./export_model.py", line 222, in <module>
    exported_model = export_caffe2_tracing(cfg, torch_model, sample_inputs)
  File "/home/ubuntu/detectron2/tools/deploy/./export_model.py", line 43, in export_caffe2_tracing
    tracer = Caffe2Tracer(cfg, torch_model, inputs)
  File "/home/ubuntu/detectron2/detectron2/export/api.py", line 60, in __init__
    C2MetaArch = META_ARCH_CAFFE2_EXPORT_TYPE_MAP[cfg.MODEL.META_ARCHITECTURE]
KeyError: 'PanopticFPN'

Any ideas?

ghost commented 2 years ago

I will stay here and keep following this until the problem has been solved.

thiagocrepaldi commented 2 years ago

Will look into this one today

thiagocrepaldi commented 2 years ago

Summary

After looking into this issue, I've managed to export the model to ONNX using both 1) standard ONNX tracing (aka --export-method tracing) without any code change and 2) Caffe2 ONNX with a one-line change. I did not inspect the Caffe2 ONNX graph to check whether it looks sound (I would guess not, given its size compared to the standard ONNX graph, but I am not familiar with Caffe2). Any volunteers for that part?

Official standard ONNX

Install pytorch 1.12.1 (no need to build from source as Caffe2 is not used), torchvision 0.13 and the latest detectron2 master branch. Next, cd into tools/deploy and run:

    python3 export_model.py --config-file ../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml --output onnx_panoptic_fpn_R_50_1x/ --export-method tracing --format onnx

The exported standard ONNX using tracing is below: model

Workaround for Caffe2 ONNX

Edit https://github.com/facebookresearch/detectron2/blob/main/detectron2/export/caffe2_modeling.py#L416-L419 and change from

META_ARCH_CAFFE2_EXPORT_TYPE_MAP = {
    "GeneralizedRCNN": Caffe2GeneralizedRCNN,
    "RetinaNet": Caffe2RetinaNet,
}

to

META_ARCH_CAFFE2_EXPORT_TYPE_MAP = {
    "PanopticFPN": Caffe2GeneralizedRCNN,
    "GeneralizedRCNN": Caffe2GeneralizedRCNN,
    "RetinaNet": Caffe2RetinaNet,
}

I have done that because PanopticFPN inherits from GeneralizedRCNN, so I assumed Caffe2GeneralizedRCNN could work - and it did, apparently.
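To make the reasoning behind this workaround concrete, here is a minimal, self-contained sketch of why a name-keyed map raises `KeyError: 'PanopticFPN'` and how a lookup that follows inheritance would land on the same wrapper. All classes below are empty stand-ins, and `lookup_export_type` is a hypothetical helper written for illustration; it is not part of detectron2.

```python
# Empty stand-ins for the real detectron2 classes (assumption: only the
# inheritance relationship PanopticFPN -> GeneralizedRCNN matters here).
class GeneralizedRCNN: ...
class PanopticFPN(GeneralizedRCNN): ...
class RetinaNet: ...
class Caffe2GeneralizedRCNN: ...
class Caffe2RetinaNet: ...

# Same shape as the original map: keyed by meta-architecture name.
EXPORT_TYPE_MAP = {
    "GeneralizedRCNN": Caffe2GeneralizedRCNN,
    "RetinaNet": Caffe2RetinaNet,
}


def lookup_export_type(model_cls, type_map):
    """Hypothetical helper: return the wrapper registered for the nearest
    ancestor of model_cls (walking the MRO), or raise KeyError."""
    for klass in model_cls.__mro__:
        wrapper = type_map.get(klass.__name__)
        if wrapper is not None:
            return wrapper
    raise KeyError(model_cls.__name__)


# A plain name lookup fails, which is the KeyError reported in this thread.
try:
    EXPORT_TYPE_MAP["PanopticFPN"]
except KeyError as exc:
    print("plain lookup failed:", exc)

# Following inheritance resolves to the GeneralizedRCNN wrapper.
print(lookup_export_type(PanopticFPN, EXPORT_TYPE_MAP).__name__)
```

Adding the `"PanopticFPN"` key by hand, as in the edit above, has the same effect as this lookup for that one class. Either way, the wrapper only knows about the parent architecture.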

Build pytorch master from source using something like BUILD_CAFFE2=1 python setup.py install, install torchvision nightly and build the latest detectron2 master branch with the change above. Next, cd into tools/deploy and run:

    python3 export_model.py --config-file ../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml --output onnx_panoptic_fpn_R_50_1x/ --export-method caffe2_tracing --format onnx

The exported Caffe2 ONNX using tracing is below: model

thiagocrepaldi commented 2 years ago

If you are not using libcaffe2 to run the ONNX model, do not export using caffe2_tracing, as it adds non-ONNX nodes that only caffe2 can handle
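One way to check an exported file for such nodes is to list every node whose operator domain is not the default ONNX one. The sketch below is an illustration under assumptions: it relies only on the `graph.node`, `op_type` and `domain` fields that ONNX model objects expose, and it uses a duck-typed stand-in model with a made-up domain string (`example.custom`) so that it runs without the `onnx` package installed.

```python
from types import SimpleNamespace

# The default ONNX operator domain is the empty string; "ai.onnx" is its alias.
DEFAULT_DOMAINS = {"", "ai.onnx"}


def non_default_domain_nodes(model):
    """Return (op_type, domain) for every node outside the default ONNX domain."""
    return [
        (node.op_type, node.domain)
        for node in model.graph.node
        if node.domain not in DEFAULT_DOMAINS
    ]


# Stand-in for a loaded model; the custom domain name here is invented.
fake_model = SimpleNamespace(
    graph=SimpleNamespace(
        node=[
            SimpleNamespace(op_type="Conv", domain=""),
            SimpleNamespace(op_type="GenerateProposals", domain="example.custom"),
            SimpleNamespace(op_type="Relu", domain="ai.onnx"),
        ]
    )
)

print(non_default_domain_nodes(fake_model))
```

With the `onnx` package available, the same function should work on the result of `onnx.load("model.onnx")` (the file name is a placeholder). An empty list suggests that a generic ONNX runtime has a chance of loading the graph.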

Also, I could not repro the KeyError: 'UNKNOWN_SCALAR'. That is probably because I used the latest pytorch versions (master and 1.12.1), to which the Microsoft ONNX converter team recently added lots of fixes.

For recent pytorch versions, you may need to patch fairscale. In my_conda_env/lib/python3.9/site-packages/fairscale-0.4.8-py3.9.egg/fairscale/nn/pipe/rpc.py, change `from torch.distributed.distributed_c10d import _get_global_rank` to `from torch.distributed.distributed_c10d import get_global_rank`.
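Instead of editing an installed package by hand, code that has to run against both older and newer library versions can try the new name first and fall back to the old one. The helper below is a hypothetical sketch of that pattern (`import_first_available` is not an existing API); it is demonstrated on the standard library so that it runs without torch installed.

```python
import importlib


def import_first_available(module_name, candidate_names):
    """Return the first attribute from candidate_names that exists in the module.

    Hypothetical helper: useful when a symbol was renamed between releases.
    """
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        f"none of {list(candidate_names)} found in module {module_name!r}"
    )


# Demonstration with the standard library: the first name does not exist,
# so the helper falls back to the second one.
sqrt = import_first_available("math", ["no_such_name", "sqrt"])
print(sqrt(9.0))
```

For the rename mentioned above, the call would presumably look like `import_first_available("torch.distributed.distributed_c10d", ["get_global_rank", "_get_global_rank"])`, assuming at least one of the two names exists in the installed torch.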

thiagocrepaldi commented 2 years ago

I will push a PR that handles the ImportError: cannot import name 'Caffe2Tracer' from 'detectron2.export' (/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/export/__init__.py) in a more informative way, letting the user know that torch needs to be built from source with BUILD_CAFFE2=1 set.
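A rough sketch of what such a more informative failure could look like is shown here. It is only an illustration, not the content of the PR: `load_export_symbol` and the wording of the hint are assumptions, and the function is written generically so that it can be exercised without detectron2 installed.

```python
import importlib

# Hint text is an assumption based on the requirement described in this thread.
CAFFE2_HINT = (
    "This export method needs Caffe2 support; "
    "build PyTorch from source with BUILD_CAFFE2=1."
)


def load_export_symbol(module_name="detectron2.export", attr="Caffe2Tracer"):
    """Hypothetical helper: import attr from module_name, re-raising any
    failure as an ImportError that carries the build hint."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except (ImportError, AttributeError) as exc:
        raise ImportError(
            f"cannot import {attr!r} from {module_name!r}. {CAFFE2_HINT}"
        ) from exc


# Demonstration on a module that certainly lacks the symbol.
try:
    load_export_symbol("math", "Caffe2Tracer")
except ImportError as exc:
    print(exc)
```

Chaining with `from exc` keeps the original traceback visible, so the extra hint does not hide the underlying cause.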

Maybe deployment.md needs some updates as well. Any suggestions on this last part?

ppwwyyxx commented 2 years ago

Thanks @thiagocrepaldi for looking at this. To add some corrections:

  • For tracing mode, these lines should be different because PanopticFPN.inference returns a different format (see its docstring for details). But other than that I think there are no fundamental blockers to make it work.
  • For the deprecated caffe2_tracing mode, it may run successfully but it won't export a correct model. It will only export the "Mask R-CNN subpart" of a PanopticFPN.

(As open source doesn't feel like a big priority for this project any more, if anyone's passionate about it, it might be easier to start a separate repo to host deploy-specific parts (basically a more powerful version of tools/deploy/) based on detectron2. I can certainly provide some consultancy if needed. )

thiagocrepaldi commented 2 years ago

> Thanks @thiagocrepaldi for looking at this. To add some corrections:
>
>   • For tracing mode, these lines should be different because PanopticFPN.inference returns a different format (see its docstring for details). But other than that I think there are no fundamental blockers to make it work.
>   • For the deprecated caffe2_tracing mode, it may run successfully but it won't export a correct model. It will only export the "Mask R-CNN subpart" of a PanopticFPN.
>
> (As open source doesn't feel like a big priority for this project any more, if anyone's passionate about it, it might be easier to start a separate repo to host deploy-specific parts (basically a more powerful version of tools/deploy/) based on detectron2. I can certainly provide some consultancy if needed. )

Thank you, @ppwwyyxx, will look into the docstring and propose something to get these guys going

There is some discussion about Meta proposing that Microsoft help support the ONNX export part of detectron2 (I think we briefly discussed that some time ago). Let's see what comes of it. If there is no deal, we can certainly start such a detectron2 ONNX zoo (another idea within Microsoft, but for any model, not just detectron2).

thiagocrepaldi commented 2 years ago

https://github.com/facebookresearch/detectron2/pull/4520 can be used to experiment with standard onnx export

vincedupuis commented 2 years ago

Hi, I got this log when I tried to export to onnx with caffe2_tracing:

python export_model.py --sample-image tm-onnx.png --config-file ~/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --export-method caffe2_tracing --format onnx --output ./output MODEL.WEIGHTS model_final_f10217.pkl MODEL.DEVICE cuda
/home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
  warn(f"Failed to load image Python extension: {e}")
WARNING:root:Pytorch pre-release version 1.12.0a0+git664058f - assuming intent to test it
WARNING:root:Pytorch pre-release version 1.12.0a0+git664058f - assuming intent to test it
[09/02 00:17:03 detectron2]: Command line arguments: Namespace(format='onnx', export_method='caffe2_tracing', config_file='/home/ubuntu/detectron2/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml', sample_image='tm-onnx.png', run_eval=False, output='./output', opts=['MODEL.WEIGHTS', 'model_final_f10217.pkl', 'MODEL.DEVICE', 'cuda'])
[W init.cpp:753] Warning: Use _jit_set_fusion_strategy, bailout depth is deprecated. Setting to (STATIC, 1) (function operator())
/home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/torch/onnx/utils.py:374: UserWarning: `add_node_names' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `add_node_names` argument will be ignored.
  warnings.warn(
/home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/torch/onnx/utils.py:374: UserWarning: `do_constant_folding' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `do_constant_folding` argument will be ignored.
  warnings.warn(
/home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:32: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect.
We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert tensor.dim() == 2 and tensor.size(-1) in [4, 5, 6], tensor.size() /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:377: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert roi_feat_shuffled.numel() > 0 and rois_idx_restore_int32.numel() > 0, ( /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:409: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! if num_classes + 1 == class_logits.shape[1]: /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:418: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert box_regression.shape[1] % box_dim == 0 /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:419: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). 
cls_agnostic_bbox_reg = box_regression.shape[1] // box_dim == 1 /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:425: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! if input_tensor_mode: /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:486: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results). for i, b in enumerate(int(x.item()) for x in roi_batch_splits_nms) /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:486: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! for i, b in enumerate(int(x.item()) for x in roi_batch_splits_nms) /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:104: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results. return len(self.indices) /home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/c10.py:87: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! 
assert ( WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::ResizeNearest type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::ResizeNearest type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::ResizeNearest type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CollectRpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BatchPermutation type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BBoxTransform type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BBoxTransform type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. 
Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BatchPermutation type is missing, so it may result in wrong shape inference for the exported graph. 
Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::ResizeNearest type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::ResizeNearest type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::ResizeNearest type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::GenerateProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CollectRpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BatchPermutation type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BBoxTransform type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BBoxTransform type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. 
Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BoxWithNMSLimit type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyGPUToCPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. 
WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::DistributeFpnProposals type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::CopyCPUToGPU type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::RoIAlign type is missing, so it may result in wrong shape inference for the exported graph. 
Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::BatchPermutation type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. WARNING: The shape inference of _caffe2::AliasWithName type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
Traceback (most recent call last):
  File "/home/ubuntu/TensorRT/samples/python/detectron2/export_model.py", line 222, in <module>
    exported_model = export_caffe2_tracing(cfg, torch_model, sample_inputs)
  File "/home/ubuntu/TensorRT/samples/python/detectron2/export_model.py", line 53, in export_caffe2_tracing
    onnx_model = tracer.export_onnx()
  File "/home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/api.py", line 94, in export_onnx
    return export_onnx_model_impl(self.traceable_model, (self.traceable_inputs,))
  File "/home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/detectron2/export/caffe2_export.py", line 65, in export_onnx_model
    onnx_model = onnx.load_from_string(f.getvalue())
  File "/home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/onnx/__init__.py", line 156, in load_model_from_string
    return _deserialize(s, ModelProto())
  File "/home/ubuntu/miniconda3/envs/tensorrt/lib/python3.10/site-packages/onnx/__init__.py", line 97, in _deserialize
    decoded = cast(Optional[int], proto.ParseFromString(s))
google.protobuf.message.DecodeError: Error parsing message with type 'onnx.ModelProto'

Any ideas?

solarflarefx commented 1 year ago

@vincedupuis I seem to be getting this consistently: ImportError: cannot import name 'STABLE_ONNX_OPSET_VERSION' from 'detectron2.export'

Did you run across this error? Any idea what it could be?

thiagocrepaldi commented 1 year ago

STABLE_ONNX_OPSET_VERSION has been defined at https://github.com/facebookresearch/detectron2/blob/main/detectron2/export/__init__.py#L19 since PR #4291 (July 15th).

Make sure you are using the main branch and you should be good to go.
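A quick way to tell whether the installed detectron2 is recent enough is to look the constant up defensively instead of importing it directly. The sketch below is only an illustration: the helper name `get_stable_opset` and the fallback value of 11 (the opset that appears in the export traceback in this thread) are assumptions, not part of detectron2.

```python
import importlib


def get_stable_opset(default=11):
    """Return detectron2's STABLE_ONNX_OPSET_VERSION, or `default` if unavailable.

    `default=11` is an assumed fallback taken from the opset used in this
    thread's logs; it is not an official recommendation.
    """
    try:
        export_module = importlib.import_module("detectron2.export")
    except ImportError:
        # detectron2 (or one of its dependencies) is not importable at all.
        return default
    # Older detectron2 builds import fine but do not define the constant.
    return getattr(export_module, "STABLE_ONNX_OPSET_VERSION", default)


if __name__ == "__main__":
    print("opset to use:", get_stable_opset())
```

If this prints the fallback even though detectron2 is installed, the installed build most likely predates the PR mentioned above.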

jonas-doevenspeck commented 1 year ago

Summary

After looking into this issue, I managed to export the model to ONNX in two ways: 1) standard ONNX tracing (--export-method tracing) without any code change, and 2) Caffe2 ONNX with a one-line change. I did not inspect the Caffe2 ONNX graph to check whether it is sound. Judging by its size compared to the standard ONNX graph, I would guess it is not, but I am not familiar with Caffe2. Any volunteer for that part?

Official standard ONNX

Install pytorch 1.12.1 (no need to build from source as Caffe2 is not used), torchvision 0.13 and latest detectron2 master branch. Next, cd into tools/deploy and run:

python3 export_model.py --config-file ../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml --output onnx_panoptic_fpn_R_50_1x/ --export-method tracing --format onnx
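Since the result depends on the installed versions, it can help to confirm them before exporting. This is a minimal sketch using only the standard library; the package list and the helper name `installed_versions` are assumptions chosen for this thread.

```python
from importlib import metadata


def installed_versions(packages=("torch", "torchvision", "detectron2", "onnx")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into a bug report also makes it easier for others to reproduce the export.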

The exported standard ONNX using tracing is below: model

Workaround for Caffe2 ONNX

Edit https://github.com/facebookresearch/detectron2/blob/main/detectron2/export/caffe2_modeling.py#L416-L419 and change from

META_ARCH_CAFFE2_EXPORT_TYPE_MAP = {
    "GeneralizedRCNN": Caffe2GeneralizedRCNN,
    "RetinaNet": Caffe2RetinaNet,
}

to

META_ARCH_CAFFE2_EXPORT_TYPE_MAP = {
    "PanopticFPN": Caffe2GeneralizedRCNN,
    "GeneralizedRCNN": Caffe2GeneralizedRCNN,
    "RetinaNet": Caffe2RetinaNet,
}

I have done that because PanopticFPN inherits from GeneralizedRCNN, so I assumed Caffe2GeneralizedRCNN could work - and it did, apparently.
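The same one-line change can be applied at runtime instead of editing the installed file. The sketch below is a hedged illustration: `register_meta_arch_alias` and `patch_detectron2_for_panoptic_fpn` are hypothetical helper names, and the patch only takes effect if it runs in the same Python process as the export and on a Caffe2-enabled PyTorch build. As noted elsewhere in this thread, this workaround covers only the Mask R-CNN part of PanopticFPN.

```python
def register_meta_arch_alias(type_map, new_name, existing_name):
    """Make `new_name` use the export wrapper already registered for `existing_name`."""
    if existing_name not in type_map:
        raise KeyError(f"{existing_name!r} is not registered in the export type map")
    # setdefault keeps any mapping that a newer detectron2 may already provide.
    type_map.setdefault(new_name, type_map[existing_name])
    return type_map


def patch_detectron2_for_panoptic_fpn():
    """Apply the alias to detectron2's real map (needs a Caffe2-enabled build)."""
    from detectron2.export import caffe2_modeling  # import fails without Caffe2

    return register_meta_arch_alias(
        caffe2_modeling.META_ARCH_CAFFE2_EXPORT_TYPE_MAP,
        "PanopticFPN",
        "GeneralizedRCNN",
    )


if __name__ == "__main__":
    # Stand-in classes so the demo runs without detectron2 installed.
    class Caffe2GeneralizedRCNN: ...
    class Caffe2RetinaNet: ...

    demo_map = {"GeneralizedRCNN": Caffe2GeneralizedRCNN, "RetinaNet": Caffe2RetinaNet}
    register_meta_arch_alias(demo_map, "PanopticFPN", "GeneralizedRCNN")
    print(sorted(demo_map))
```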

Build pytorch master from source using something like BUILD_CAFFE2=1 python setup.py install, install torchvision nightly and build latest detectron2 master branch with the change above. Next, cd into tools/deploy and run:

python3 export_model.py --config-file ../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml --output onnx_panoptic_fpn_R_50_1x/ --export-method caffe2_tracing --format onnx

The exported Caffe2 ONNX using tracing is below: model

@thiagocrepaldi thanks for the instructions. Question, do you know where the If operators in the ONNX graph come from? I can't find the corresponding source code in detectron2. image image

thiagocrepaldi commented 1 year ago

As @ppwwyyxx mentioned before, this hack for Caffe2 only exports the "Mask R-CNN" part of the PanopticFPN, and #4520 raises an exception to let users know about this limitation.

Regarding "where the If came from": append verbose=True to the torch.onnx.export call and re-export. Open the model in Netron and click on the If (or any other node whose origin you want to see). A description field with the trace will show which file:line that operator came from.
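As a rough sketch of that tip, the helper below exports a module to an in-memory ONNX file with verbose=True. It assumes the TorchScript-based exporter of the PyTorch 1.12-era releases discussed in this thread; the helper name `export_with_source_trace` and the default opset of 11 are illustrative assumptions.

```python
import io


def export_with_source_trace(model, example_input, opset_version=11):
    """Export `model` to ONNX bytes with verbose=True.

    With verbose=True the exporter prints the graph and records source
    locations in each node's doc_string, which Netron shows as a description.
    """
    import torch  # imported lazily so the helper can be defined without PyTorch

    buffer = io.BytesIO()
    torch.onnx.export(
        model,
        (example_input,),
        buffer,
        opset_version=opset_version,
        verbose=True,
    )
    return buffer.getvalue()
```

Writing the returned bytes to a `.onnx` file and opening it in Netron should then show the trace for each node, as described above.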

jonas-doevenspeck commented 1 year ago

@thiagocrepaldi thanks for your tip regarding verbose=True. Does https://github.com/facebookresearch/detectron2/pull/4520 aim to enable onnx export of the 'full' PanopticFPN?

thiagocrepaldi commented 1 year ago

It aims at full export for the non-Caffe2 mode.

jonas-doevenspeck commented 1 year ago

@thiagocrepaldi thanks, I managed to export the PanopticFPN using your branch. Regarding the If operator that follows the first NMS in the ONNX graph:

image

It points to this line https://github.com/pytorch/vision/blob/07ae61bf9c21ddd1d5f65d326aa9636849b383ca/torchvision/ops/boxes.py#L89

After disabling this line in the source code (for debugging purposes), the If operator is still there but pointing to another line:

image

https://github.com/pytorch/vision/blob/07ae61bf9c21ddd1d5f65d326aa9636849b383ca/torchvision/ops/boxes.py#L41

Curiously, when I export the NMS in a standalone network (a simple, pure PyTorch network), there is no If operator in the ONNX graph. Do you have an idea where it originates from?

In the network, the If op always occurs (10 times) in the same subgraph, so I wonder whether this subgraph gets expanded from a single line of PyTorch/Python code? image

thiagocrepaldi commented 1 year ago

I am glad the PR worked out for you, but I am not sure I understood what you meant by "After disabling this line in the source code (for debugging purposes), the If operator is still there but pointing to another line". If you comment out this if, you change the graph, so If_1610 in one run does not necessarily match If_1610 in the other. The numbering is a kind of sequential counter and is not actually connected to the source code.
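A toy model of that numbering shows why node names cannot be compared across two exports. This is not the exporter's actual naming code; `name_nodes` is a hypothetical stand-in that assigns names from one running counter in graph order.

```python
from itertools import count


def name_nodes(op_types):
    """Name each op as '<type>_<n>' using a single running counter."""
    counter = count()
    return [f"{op}_{next(counter)}" for op in op_types]


# The same toy graph before and after removing the first If.
original = ["Conv", "NonMaxSuppression", "If", "Gather", "If"]
edited = ["Conv", "NonMaxSuppression", "Gather", "If"]

if __name__ == "__main__":
    print(name_nodes(original))
    print(name_nodes(edited))
```

The remaining If is the same operator in both graphs, yet its name changes because every node after the removed one shifts by one. Comparing nodes by their recorded source location is therefore more reliable than comparing them by name.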

htlbayytq commented 1 year ago

Meanwhile, I have also tried the tracing method, after the caffe2_tracing method above.

Instructions To Reproduce the 🐛 Bug:

1. Full runnable code or full changes you made:
I use the original repo and do not change anything.
2. What exact command do you run:
export DETECTRON2_DATASETS=/data/datasets/

python3 export_model.py \
     --config-file ../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml \
     --output /data/output/ \
     --export-method tracing \
     --format onnx \
     MODEL.WEIGHTS /data/model/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/model_final_dbfeb4.pkl \
     MODEL.DEVICE cpu

I have already prepared the COCO dataset in /data/datasets/.

4. **Full logs** or other relevant observations:
[06/23 06:47:50 detectron2]: Command line arguments: Namespace(format='onnx', export_method='tracing', config_file='../../configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml', sample_image=None, run_eval=False, output='/data/output/', opts=['MODEL.WEIGHTS', '/data/model/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/model_final_dbfeb4.pkl', 'MODEL.DEVICE', 'cpu'])
[W init.cpp:759] Warning: Use _jit_set_fusion_strategy, bailout depth is deprecated. Setting to (STATIC, 1) (function operator())
[06/23 06:47:52 d2.data.datasets.coco]: Loaded 5000 images in COCO format from /data/datasets/coco/annotations/instances_val2017.json
[06/23 06:47:52 d2.data.datasets.coco]: Loaded 5000 images with semantic segmentation from /data/datasets/coco/val2017
[06/23 06:47:52 d2.data.build]: Distribution of instances among all 80 categories:
|   category    | #instances   |   category   | #instances   |   category    | #instances   |
|:-------------:|:-------------|:------------:|:-------------|:-------------:|:-------------|
|    person     | 10777        |   bicycle    | 314          |      car      | 1918         |
|  motorcycle   | 367          |   airplane   | 143          |      bus      | 283          |
|     train     | 190          |    truck     | 414          |     boat      | 424          |
| traffic light | 634          | fire hydrant | 101          |   stop sign   | 75           |
| parking meter | 60           |    bench     | 411          |     bird      | 427          |
|      cat      | 202          |     dog      | 218          |     horse     | 272          |
|     sheep     | 354          |     cow      | 372          |   elephant    | 252          |
|     bear      | 71           |    zebra     | 266          |    giraffe    | 232          |
|   backpack    | 371          |   umbrella   | 407          |    handbag    | 540          |
|      tie      | 252          |   suitcase   | 299          |    frisbee    | 115          |
|     skis      | 241          |  snowboard   | 69           |  sports ball  | 260          |
|     kite      | 327          | baseball bat | 145          | baseball gl.. | 148          |
|  skateboard   | 179          |  surfboard   | 267          | tennis racket | 225          |
|    bottle     | 1013         |  wine glass  | 341          |      cup      | 895          |
|     fork      | 215          |    knife     | 325          |     spoon     | 253          |
|     bowl      | 623          |    banana    | 370          |     apple     | 236          |
|   sandwich    | 177          |    orange    | 285          |   broccoli    | 312          |
|    carrot     | 365          |   hot dog    | 125          |     pizza     | 284          |
|     donut     | 328          |     cake     | 310          |     chair     | 1771         |
|     couch     | 261          | potted plant | 342          |      bed      | 163          |
| dining table  | 695          |    toilet    | 179          |      tv       | 288          |
|    laptop     | 231          |    mouse     | 106          |    remote     | 283          |
|   keyboard    | 153          |  cell phone  | 262          |   microwave   | 55           |
|     oven      | 143          |   toaster    | 9            |     sink      | 225          |
| refrigerator  | 126          |     book     | 1129         |     clock     | 267          |
|     vase      | 274          |   scissors   | 36           |  teddy bear   | 190          |
|  hair drier   | 11           |  toothbrush  | 57           |               |              |
|     total     | 36335        |              |              |               |              |
[06/23 06:47:52 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[06/23 06:47:52 d2.data.common]: Serializing 5000 elements to byte tensors and concatenating them all ...
[06/23 06:47:53 d2.data.common]: Serialized dataset takes 19.52 MiB
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/image_list.py:85: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert t.shape[:-2] == tensors[0].shape[:-2], t.shape
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py:2498: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(input.size()[2:]))
/usr/local/lib/python3.10/dist-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/boxes.py:155: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert tensor.dim() == 2 and tensor.size(-1) == 4, tensor.size()
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/proposal_generator/proposal_utils.py:106: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if not valid_mask.all():
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/boxes.py:191: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert torch.isfinite(self.tensor).all(), "Box tensor contains infinite or NaN!"
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/boxes.py:192: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  h, w = box_size
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/layers/nms.py:15: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert boxes.shape[-1] == 4
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/structures/instances.py:74: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  data_len = len(value)
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/poolers.py:231: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert len(box_lists) == x[0].size(
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/layers/roi_align.py:55: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert rois.dim() == 2 and rois.size(1) == 5
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/roi_heads/fast_rcnn.py:138: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if not valid_mask.all():
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/roi_heads/fast_rcnn.py:143: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  num_bbox_reg_classes = boxes.shape[1] // 4
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/roi_heads/fast_rcnn.py:155: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if num_bbox_reg_classes == 1:
/usr/local/lib/python3.10/dist-packages/detectron2-0.6-py3.10-linux-x86_64.egg/detectron2/modeling/roi_heads/mask_head.py:139: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if cls_agnostic_mask:
Traceback (most recent call last):
  File "/data/detectron2/tools/deploy/export_model.py", line 221, in <module>
    exported_model = export_tracing(torch_model, sample_inputs)
  File "/data/detectron2/tools/deploy/export_model.py", line 127, in export_tracing
    torch.onnx.export(traceable_model, (image,), f, opset_version=11)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/__init__.py", line 305, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 118, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 719, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 503, in _model_to_graph
    graph = _optimize_graph(graph, operator_export_type,
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 232, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/__init__.py", line 354, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py", line 1061, in _run_symbolic_function
    return symbolic_fn(g, *inputs, **attrs)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/symbolic_opset9.py", line 2065, in to
    return g.op("Cast", self, to_i=sym_help.cast_pytorch_to_onnx[dtype])
KeyError: 'UNKNOWN_SCALAR'
5. please simplify the steps as much as possible so they do not require additional resources to
   run, such as a private dataset.

Expected behavior:

I expected the official export_model.py script to export the ONNX model.

Environment:

Provide your environment information using the following command:

(same as the above)

If your issue looks like an installation issue / environment issue,

please first try to solve it yourself with the instructions in

https://detectron2.readthedocs.io/tutorials/install.html#common-installation-issues

Hi ghost, have you already solved the problem? I hit the same error: KeyError: 'UNKNOWN_SCALAR'

I have tried many things over a long time, but still have not found a way to solve it.

Zalways commented 6 months ago

I hit a similar issue when I use the ONNX model exported on CUDA to run inference on CUDA: onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running TopK node. Name:'/model/TopK' Status Message: CUDA error cudaErrorInvalidConfiguration:invalid configuration argument

Can you help me with this problem? @ppwwyyxx @thiagocrepaldi