Open DaZhUUU opened 2 days ago
I tested with the ONNX model fragment, and bilinear is indeed slower: approximately 200 ms, versus approximately 75 ms for nearest.
The tested model comes from the `multi_scale_deformable_attn_pytorch` function in the following code: https://github.com/facebookresearch/sapiens/blob/3e829ac27476e4a70b6a01f85e487492afe02df1/cv/mmcv/ops/multi_scale_deform_attn.py#L114
Obviously, the algorithmic complexity of nearest is lower than that of bilinear.
I know, but the gap is too large, and that is abnormal.
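To put a rough number on the complexity difference discussed above, here is a minimal pure-Python sketch that samples one point with each mode and counts how many input pixels are read. The function names `sample_nearest` and `sample_bilinear` are made up for illustration, coordinates are assumed to be already in pixel units, and zero padding is assumed. It only counts reads and says nothing about how TensorRT actually implements either mode.

```python
import math

def sample_nearest(img, x, y):
    """Nearest sampling at pixel coords (x, y); returns (value, reads)."""
    h, w = len(img), len(img[0])
    ix, iy = int(round(x)), int(round(y))
    if 0 <= ix < w and 0 <= iy < h:
        return float(img[iy][ix]), 1
    return 0.0, 0  # out of bounds: zero padding, nothing read

def sample_bilinear(img, x, y):
    """Bilinear sampling at pixel coords (x, y); returns (value, reads)."""
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    total, reads = 0.0, 0
    # Visit the 2x2 neighbourhood and weight each pixel by its overlap.
    for dy, wy in ((0, 1.0 - fy), (1, fy)):
        for dx, wx in ((0, 1.0 - fx), (1, fx)):
            xi, yi = x0 + dx, y0 + dy
            if 0 <= xi < w and 0 <= yi < h:
                total += wx * wy * img[yi][xi]
                reads += 1
    return total, reads

img = [[0, 1],
       [2, 3]]
nearest_value, nearest_reads = sample_nearest(img, 0.25, 0.75)
bilinear_value, bilinear_reads = sample_bilinear(img, 0.25, 0.75)
print(nearest_reads, bilinear_reads)
```

For an interior point, bilinear reads four pixels and nearest reads one, so a naive count suggests up to four times the memory traffic per output element. How that translates into kernel time on Orin depends on the actual implementation.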
Can you upload the two ONNX subgraphs (grid_sample with bilinear and grid_sample with nearest)?
I don't know why my image uploads always fail. It could be that there are some issues with my internet connection.
The model simply mirrors the code: some reshape ops and some gather ops. The value_spatial_shapes is [[116, 200], [58, 100], [29, 50], [15, 25]]
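For anyone trying to reproduce the subgraph without the uploaded files, the shapes above determine how the flattened value tensor is split across levels. The sketch below only does that arithmetic. The names `level_sizes`, `level_start_index` and `num_value` are chosen here for illustration, and it assumes, as in the linked function, that each level contributes H * W positions and gets its own grid_sample call.

```python
from itertools import accumulate

# Spatial shapes (H, W) of the four feature levels, as given in the post.
value_spatial_shapes = [[116, 200], [58, 100], [29, 50], [15, 25]]

# Number of flattened positions per level.
level_sizes = [h * w for h, w in value_spatial_shapes]

# Offset of each level inside the flattened value tensor.
level_start_index = [0] + list(accumulate(level_sizes))[:-1]

# Total length of the flattened value dimension.
num_value = sum(level_sizes)

print(level_sizes)        # [23200, 5800, 1450, 375]
print(level_start_index)  # [0, 23200, 29000, 30450]
print(num_value)          # 30825
```

The first level holds about three quarters of all positions, so if the number of sampling points per level is similar, per-level timings may be worth comparing separately.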
Environment
TensorRT Version:8.6.2
NVIDIA GPU:Orin
NVIDIA Driver Version:
CUDA Version:12.2
CUDNN Version: 8904
Description
I have an ONNX model that contains several gridsample operators. I use the /usr/src/tensorrt/bin/trtexec tool to build the model and test its performance on Orin. The command is like this:
Here is my problem:
1) When I set the 'mode' attribute of gridsample to 'bilinear', the latency is 1900+ ms. (I know there must be something wrong.)
2) When I set the 'mode' attribute of gridsample to 0, the latency is 680 ms, which is a normal latency. I know I shouldn't set mode to 0, but in the following code, if 'mode' is not 'bilinear', 'nearest' or 'bicubic', 'interpolationMode' will be 'kNEAREST'.
https://github.com/onnx/onnx-tensorrt/blob/7583da4c62475e84b7be31f4b8fb0c101873d434/builtin_op_importers.cpp#L4386
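The fallback described above can be paraphrased as a small lookup. This is a sketch of the behaviour as I read it, not a copy of the importer code: `resolve_interpolation_mode` is a hypothetical helper, and the enum values are written as plain strings named after TensorRT's InterpolationMode members.

```python
def resolve_interpolation_mode(mode):
    """Map a gridsample 'mode' attribute to an interpolation mode name.

    Hypothetical mirror of the behaviour described in the post: any value
    that is not one of the three known strings ends up as kNEAREST.
    """
    known = {
        "bilinear": "kLINEAR",
        "nearest": "kNEAREST",
        "bicubic": "kCUBIC",
    }
    return known.get(mode, "kNEAREST")

print(resolve_interpolation_mode("bilinear"))  # kLINEAR
print(resolve_interpolation_mode(0))           # kNEAREST
```

Under this reading, setting mode to 0 is effectively the same as setting it to 'nearest', which is why the two scenarios compare bilinear against nearest sampling.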
### So why is the latency in these two scenarios so different?
Also, when I try to profile the model with nsight-compute, an error occurs halfway through execution.
I downloaded newer versions of nsight-compute, 2023.3.0 and 2024.3.2, from the following website, but errors also occur and profiling cannot start. https://developer.nvidia.com/tools-downloads#?dn=nsight-compute-2024-3-2