brantPTS closed this issue 2 years ago
Before opset 13, Clip does not support tensor(int64) (see https://github.com/onnx/onnx/blob/main/docs/Operators.md#Clip). Can you try a newer opset version?
@wangyems, thank you for your prompt reply. Upgrading the Detectron2 python script to use opset 13 does solve the load problem.
My new problem is that the detections appear to be totally scrambled - although the number of objects is reasonable for the input image, the classes and locations are invalid. Is there an easy way to inspect the feature maps of an onnx session?
Thank you again for your prompt help.
Best, Brant
Since an ONNX model is a static graph, you need to do some extra work to inspect intermediate values. There are generally two ways:
or
More to check: Do the ONNX model's raw inputs and outputs match those of the original framework? Does the CPU EP generate the same results as the CUDA EP?
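One way to act on the checks above is to run the same input twice (for example once with the CPU execution provider and once with CUDA, or once in PyTorch and once in ORT) and diff the outputs by name. The sketch below is a minimal, library-free illustration, not an ORT feature: `max_abs_diff`, `compare_runs`, and the sample values are hypothetical, and it assumes each output has already been flattened to a plain list of numbers (for instance from the arrays returned by `InferenceSession.run`).

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def compare_runs(reference, candidate, atol=1e-4):
    """Compare two {output_name: flat list of numbers} dicts.

    Returns {output_name: (max_abs_diff, within_tolerance)}.
    """
    report = {}
    for name, ref_values in reference.items():
        diff = max_abs_diff(ref_values, candidate[name])
        report[name] = (diff, diff <= atol)
    return report


# Hypothetical flattened outputs from two runs on the same image.
cpu_run = {"scores": [0.98, 0.91, 0.42], "pred_classes": [0, 17, 2]}
cuda_run = {"scores": [0.98, 0.91, 0.42], "pred_classes": [0, 56, 2]}

report = compare_runs(cpu_run, cuda_run)
for name, (diff, ok) in report.items():
    print(f"{name}: max abs diff = {diff:g} -> {'match' if ok else 'MISMATCH'}")
```

If the inputs already differ between the two pipelines (resize, channel order, normalization), the outputs will differ no matter which execution provider is used, so it is worth running the same comparison on the preprocessed input tensor first.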
@wangyems, thank you, that is very helpful guidance. I will certainly scrutinize the input / output data. If that does not work, I'll use the ORT debug_node mode - that sounds extremely useful. Closing this issue.
Load issue was resolved
@brantPTS Could you please share details of this? Any suggestion/knowledge sharing would be great! Thank you in advance.
@AidenFather, I do not have any updates - the next step would be to scrutinize layer outputs to see where the PyTorch + python inference diverges from the onnx runtime inference. It's a shame that the D2 onnx model does not work on ORT.
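The layer-by-layer check described above can be sketched as a walk over the layers in execution order that stops at the first one whose activations disagree. This is only an illustration under assumptions: `first_divergence`, the layer names, and the numbers are all hypothetical, and the activations are assumed to have been captured from both frameworks and flattened to plain lists beforehand.

```python
def first_divergence(layer_order, ref_acts, test_acts, atol=1e-3):
    """Return (layer_name, max_abs_diff) for the first layer that differs.

    layer_order: layer names in execution order.
    ref_acts / test_acts: {layer_name: flat list of numbers}.
    Returns None when every layer agrees within atol.
    """
    for name in layer_order:
        ref, test = ref_acts[name], test_acts[name]
        if len(ref) != len(test):
            # A size mismatch is treated as an unbounded difference.
            return name, float("inf")
        diff = max((abs(r - t) for r, t in zip(ref, test)), default=0.0)
        if diff > atol:
            return name, diff
    return None


# Hypothetical layer names and activations, for illustration only.
layer_order = ["backbone", "rpn", "roi_align", "box_head"]
pytorch_acts = {
    "backbone": [0.5, 1.25],
    "rpn": [0.1, 0.9],
    "roi_align": [2.0, 3.0],
    "box_head": [0.7, 0.3],
}
ort_acts = {
    "backbone": [0.5, 1.25],
    "rpn": [0.1, 0.9],
    "roi_align": [2.0, 3.5],
    "box_head": [0.1, 0.9],
}

result = first_divergence(layer_order, pytorch_acts, ort_acts)
print(result)
```

Only the first diverging layer is reported because every layer after it consumes already-wrong values, so later mismatches say little about where the problem starts.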
Detectron2 is Facebook AI's featured object detection model and it supports ONNX export, but session load fails with the CUDA execution provider.
See below for steps to reproduce. Thank you.
System info:
Edition: Windows 10 Pro
Version: 21H1
OS build: 19043.1645
Processor: Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz 3.00 GHz
Installed RAM: 32.0 GB
System type: 64-bit operating system, x64-based processor
GPU: Nvidia GTX 1080ti Nvidia Graphics 466.27
Steps to reproduce on Windows 10:
Install Detectron2 on Windows:
Create python environment and activate:
[python install path]Python39\python.exe -m venv d:\local\Envs\Det2 --copies
d:\local\Envs\Det2\scripts\activate.bat
Install PyTorch and clone Detectron2:
pip3 install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio===0.11.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
git clone https://github.com/facebookresearch/detectron2.git
Ensure CMake is installed (I have 3.22.0 rc2)
Ensure Nvidia GPU Computing toolkit is installed (I have 11.4)
Ensure matching CuDNN .dll, .lib, and .h files are copied into the GPU Computing Toolkit bin, lib, and include folders. For example, I copied:
D:\Local\CuDNN\cudnn-11.4-windows-x64-v8.2.2.26\cuda\bin\cudnn_ops_train64_8.dll (and all other dlls)
to: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin\cudnn_ops_train64_8.dll
and copied:
D:\Local\CuDNN\cudnn-11.4-windows-x64-v8.2.2.26\cuda\lib\x64\cudnn_ops_train.lib
to: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\lib\x64\cudnn_ops_train.lib
and copied:
D:\Local\CuDNN\cudnn-11.4-windows-x64-v8.2.2.26\cuda\include\cudnn_ops_train.h
to: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\include\cudnn_ops_train.h
Ensure OpenCV is installed and set an environment variable, such as OpenCV_DIR = d:\Local\opencv\build
In a Visual Studio 2019 command prompt:
set DISTUTILS_USE_SDK=1
d:
cd d:\Local\Detectron
d:\local\Envs\Det2\scripts\activate.bat
python -m pip install -e detectron2
Ensure you have the COCO data (a LARGE amount of data, used by export_model.py for some reason):
D:\Local\Detectron\detectron2\datasets\coco\annotations
D:\Local\Detectron\detectron2\datasets\coco\val2017
In a normal command prompt, generate the onnx model:
d:
cd d:\Local\Detectron
d:\local\Envs\Det2\scripts\activate.bat
D:\Local\Detectron\detectron2>python .\tools\deploy\export_model.py --config-file ./configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml --output ./outputOnnx --export-method tracing --format onnx MODEL.WEIGHTS detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl MODEL.DEVICE cuda
Output log should look like:
[03/22 05:37:49 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[03/22 05:37:49 d2.data.common]: Serializing 5000 elements to byte tensors and concatenating them all ...
[03/22 05:37:49 d2.data.common]: Serialized dataset takes 19.10 MiB
d:\local\detectron\detectron2\detectron2\structures\image_list.py:79: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert t.shape[:-2] == tensors[0].shape[:-2], t.shape
d:\local\Envs\Det2\lib\site-packages\torch\functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
d:\local\detectron\detectron2\detectron2\structures\boxes.py:148: TracerWarning: torch.as_tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
d:\local\detectron\detectron2\detectron2\structures\boxes.py:153: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert tensor.dim() == 2 and tensor.size(-1) == 4, tensor.size()
d:\local\detectron\detectron2\detectron2\modeling\proposal_generator\proposal_utils.py:97: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if not valid_mask.all():
d:\local\detectron\detectron2\detectron2\structures\boxes.py:189: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert torch.isfinite(self.tensor).all(), "Box tensor contains infinite or NaN!"
d:\local\detectron\detectron2\detectron2\structures\boxes.py:190: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  h, w = box_size
d:\local\detectron\detectron2\detectron2\layers\nms.py:15: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert boxes.shape[-1] == 4
d:\local\detectron\detectron2\detectron2\structures\instances.py:74: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  data_len = len(value)
d:\local\detectron\detectron2\detectron2\modeling\poolers.py:211: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert len(box_lists) == x[0].size(
d:\local\detectron\detectron2\detectron2\layers\roi_align.py:55: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert rois.dim() == 2 and rois.size(1) == 5
d:\local\detectron\detectron2\detectron2\modeling\roi_heads\fast_rcnn.py:137: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if not valid_mask.all():
d:\local\detectron\detectron2\detectron2\modeling\roi_heads\fast_rcnn.py:142: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  num_bbox_reg_classes = boxes.shape[1] // 4
d:\local\detectron\detectron2\detectron2\modeling\roi_heads\fast_rcnn.py:154: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if num_bbox_reg_classes == 1:
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
[repeated warning]
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
d:\local\Envs\Det2\lib\site-packages\torch\onnx\symbolic_opset9.py:2905: UserWarning: Exporting aten::index operator of advanced indexing in opset 11 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
  warnings.warn("Exporting aten::index operator of advanced indexing in opset " +
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
[repeated warning]
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
d:\local\Envs\Det2\lib\site-packages\torchvision\ops_register_onnx_ops.py:31: UserWarning: ROIAlign with aligned=True is not supported in ONNX, but will be supported in opset 16. The workaround is that the user need apply the patch https://github.com/microsoft/onnxruntime/pull/8564 and build ONNXRuntime from source.
  warnings.warn(
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
[repeated warning]
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
[03/22 05:38:04 detectron2]: Inputs schema: TupleSchema(schemas=[ListSchema(schemas=[DictSchema(schemas=[IdentitySchema()], sizes=[1], keys=['image'])], sizes=[1])], sizes=[1])
[03/22 05:38:04 detectron2]: Outputs schema: ListSchema(schemas=[DictSchema(schemas=[InstancesSchema(schemas=[TensorWrapSchema(class_name='detectron2.structures.Boxes'), IdentitySchema(), IdentitySchema()], sizes=[1, 1, 1], keys=['pred_boxes', 'pred_classes', 'scores'])], sizes=[4], keys=['instances'])], sizes=[4])
(Det2) D:\Local\Detectron\detectron2>
In a C# console application that references ORT version 1.10.0, try to create an onnx session:
Session = new InferenceSession(modelPath, SessionOptions.MakeSessionOptionWithCudaProvider(gpuIndex));