vladb99 opened 10 months ago
Hi @vladb99 What SG version do you have?
[12/07/2023-13:59:05] [W] [TRT] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
This most likely corresponds to some arange operator that outputs numbers as long (int64) when converting box regression offsets to absolute pixel units. I believe this should have been addressed recently.
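For context on why the warning appears: on 64-bit Linux the default integer dtype of an arange is int64, which ONNX records as INT64 weights and TensorRT then has to cast down. A minimal numpy illustration of the dtype behavior (the same default applies to torch.arange at export time):

```python
import numpy as np

# Default integer arange: the dtype is platform-dependent,
# int64 on 64-bit Linux -- exactly what ONNX records as INT64.
idx = np.arange(5)
print(idx.dtype)

# Requesting a narrower dtype explicitly (float32 or int32) keeps
# the exported graph free of INT64 tensors TensorRT must cast down.
idx32 = np.arange(5, dtype=np.float32)
print(idx32.dtype)
```

Note that the TensorRT warning itself is harmless; the cast to INT32 is lossless for index-sized values.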
- Can't parse 'pt1'. Sequence item with index 0 has a wrong type
- Can't parse 'pt1'. Sequence item with index 0 has a wrong type
- Can't parse 'rec'. Expected sequence length 4, got 2
- Can't parse 'rec'. Expected sequence length 4, got 2
These errors indicate that pt1 and pt2 have the wrong dtype. OpenCV can draw rectangles only when the coordinates are given as (int, int), so you may want to cast them explicitly to int. You are probably using an old release of SG where PoseVisualization.draw_poses hasn't been updated to do this casting for you.
So my suggestion is to take the latest build of SG and try with it.
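The cast described above can be sketched like this (the coordinate values are made up for illustration, and the cv2.rectangle call is shown commented out since it needs an image to draw on):

```python
import numpy as np

def to_int_point(p):
    """Cast an (x, y) pair to plain Python ints, as cv2.rectangle requires."""
    return int(round(float(p[0]))), int(round(float(p[1])))

# Float coordinates straight out of the model -- passing these to
# cv2.rectangle raises "Can't parse 'pt1'. Sequence item with index 0
# has a wrong type".
pt1_f = np.array([12.7, 30.2], dtype=np.float32)
pt2_f = np.array([98.1, 120.9], dtype=np.float32)

pt1, pt2 = to_int_point(pt1_f), to_int_point(pt2_f)
print(pt1, pt2)  # plain int tuples, safe to pass to OpenCV
# cv2.rectangle(image, pt1, pt2, color=(0, 255, 0), thickness=2)
```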
Anyway, we are going to make an official Colab demo showing how to run our models using TRT. Stay tuned for that.
> What SG version do you have?
super-gradients 3.5.0
> These errors indicate that pt1 and pt2 have the wrong dtype. OpenCV can draw rectangles when coordinates are given as (int, int). So you may want to cast them explicitly to int. Probably you are using an old release of SG where PoseVisualization.draw_poses hasn't been updated to do this casting for you.
I don't think this is the case. I exported an .onnx model using your tutorial and then ran inference with ONNXRuntime and OpenVINO. Their outputs were identical. When doing inference with the TensorRT engine I built from the exported .onnx model, I get something totally different. I would've expected it to be similar.
> Anyway, we are going to make an official Colab demo showing how to run our models using TRT. Stay tuned for that.
That would be great! Please don't just show running inference with the TRT engine, but also use PoseVisualization.draw_poses to draw the poses. I ask for this because I'm able to run the engine as well, but the output is just wrong, and PoseVisualization.draw_poses fails for that very reason.
💡 Your Question
I've successfully exported YOLO-NAS-Pose-N to an .onnx model and then built a .trt engine. The .onnx was exported with FP32 quantization. I'm now trying to run inference on an image with the .trt engine. The code I'm testing is below; it is based on the code from issue #1451. After running inference, I pass the predictions to PoseVisualization.draw_poses. When I pass predictions from another backend like OpenVINO, PoseVisualization.draw_poses just draws them. However, when I use the TensorRT backend, this method fails. The error is also below. Also, when I print the num_predictions from the inference, I get [[1056964608]], which is obviously wrong. Why does the inference return such wrong values?

The export code
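One observation that may explain the garbage value (my speculation, not a confirmed diagnosis): 1056964608 is exactly the raw IEEE-754 bit pattern of the float32 value 0.5, so the num_predictions buffer may be written as float32 by the engine but read back with an int dtype (or vice versa):

```python
import struct

import numpy as np

# Reinterpret the mystery integer's bytes as a float32.
as_float = struct.unpack('<f', struct.pack('<i', 1056964608))[0]
print(as_float)  # 0.5

# The same reinterpretation using numpy's zero-copy view:
buf = np.array([1056964608], dtype=np.int32)
print(buf.view(np.float32))  # [0.5]
```

If that is what is happening, allocating each host buffer with the dtype the engine actually reports for that binding (rather than hard-coding one) should fix the readback; the exact API for querying binding dtypes depends on the TensorRT version.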
Converting .onnx to .trt
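For reference, a typical conversion command looks like the following (a generic trtexec invocation with assumed file names, not necessarily the exact command used here; trtexec builds in FP32 by default unless --fp16 or --int8 is passed):

```bash
trtexec --onnx=yolo_nas_pose_n.onnx --saveEngine=yolo_nas_pose_n.trt
```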
When I convert to a TensorRT engine I get:
I thought when I export my model, the weights would be in FP32 format? I am confused.
The inference code:
Versions
Collecting environment information...
PyTorch version: 2.1.1+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Enterprise
GCC version: (x86_64-posix-seh, Built by strawberryperl.com project) 8.3.0
Clang version: Could not collect
CMake version: version 3.25.1
Libc version: N/A

Python version: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: True
CUDA runtime version: 11.8.89
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: Quadro P2000
Nvidia driver version: 537.42
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\cudnn_ops_train64_8.dll
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2592
DeviceID=CPU0
Family=198
L2CacheSize=1536
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2592
Name=Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.23.0
[pip3] onnx==1.13.0
[pip3] onnx-graphsurgeon==0.3.27
[pip3] onnx-simplifier==0.4.35
[pip3] onnxruntime==1.13.1
[pip3] torch==2.1.1+cu118
[pip3] torchaudio==2.1.1+cu118
[pip3] torchmetrics==0.8.0
[pip3] torchvision==0.16.1+cu118
[conda] Could not collect