[Bug] tensorrt output boxes have uniform horizontal offsets at certain input aspect ratios.

Checklist

[ ] 1. I have searched related issues but cannot get the expected help.
[ ] 2. I have read the FAQ documentation but cannot get the expected help.
[ ] 3. The bug has not been fixed in the latest version.

Describe the bug

Hi, I converted the model to TensorRT using the following command:

python ../mmdeploy/tools/deploy.py \
    ../mmdeploy/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
    ../mmdetection/configs/dino/dino-5scale_swin-l_8xb2-36e_coco.py \
    /root/autodl-tmp/pretrained/dino-5scale_swin-l_8xb2-36e_coco-5486e051.pth \
    ../mmdetection/demo/demo.jpg \
    --work-dir mmdeploy_model/dino \
    --device cuda \
    --dump-info

Then I ran inference with TensorRT. Input images resized to (1200, 800), (300, 200), (1125, 750), (1800, 1200), or (1620, 1080), all of which have a 3:2 aspect ratio, give correct results. But at (1422, 800), (1333, 750), or (1200, 750), whose aspect ratios differ from 3:2, all output boxes are shifted horizontally by a uniform offset.
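As a quick sanity check on the sizes above, the sketch below tabulates each aspect ratio and the size the image would have after a keep-ratio resize. It assumes the test pipeline rescales the image to fit within (1333, 800) while keeping the aspect ratio, and the rounding rule is also an assumption. `keep_ratio_size` is a hypothetical helper written for illustration, not an mmdeploy API.

```python
# Hypothetical helper: approximates a keep-ratio resize that fits an image
# inside (1333, 800). The target scale and the rounding rule are assumptions
# made for illustration; they are not taken from the mmdeploy source.
def keep_ratio_size(w, h, scale=(1333, 800)):
    long_edge, short_edge = max(scale), min(scale)
    factor = min(long_edge / max(w, h), short_edge / min(w, h))
    return int(w * factor + 0.5), int(h * factor + 0.5)


# (width, height) pairs reported above
GOOD = [(1200, 800), (300, 200), (1125, 750), (1800, 1200), (1620, 1080)]
BAD = [(1422, 800), (1333, 750), (1200, 750)]


def summarize(sizes):
    """Return (w, h, aspect ratio, resized size) for each input size."""
    return [(w, h, round(w / h, 3), keep_ratio_size(w, h)) for w, h in sizes]


if __name__ == "__main__":
    for name, sizes in (("correct", GOOD), ("offset", BAD)):
        for w, h, ratio, resized in summarize(sizes):
            print(f"{name:8s} {w}x{h}  ratio={ratio}  resized={resized}")
```

Under these assumptions, every size that gives correct results ends up at 1200x800 after resizing, while the failing sizes end up at 1333x750 or 1280x800. That may hint that something shape-dependent is involved, but I have not confirmed the cause.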

Reproduction

from mmdeploy_runtime import Detector
import cv2
import numpy as np

img = cv2.imread('/root/autodl-tmp/coco/val2017/000000000632.jpg')
# Sizes tried, as (width, height): (1200, 800), (1920, 1080), (1422, 800),
# (300, 200), (1500, 1000), (1800, 1200), (1620, 1080)
img = cv2.resize(img, (1200, 800))

# create a detector
detector = Detector(model_path='mmdeploy_model/dino2/', device_name='cuda', device_id=0)
# run the inference
#import time
#t0 = time.time()
#for i in range(200):
#print(img.shape, np.sum(img[...,0]), np.sum(img[...,1]), np.sum(img[...,2]))  # (1080, 1920, 3) 93819078 86470701 98883550
bboxes, labels, _ = detector(img)
#t1 = time.time()
#print('avg time during 200 iters', (t1-t0)/200)
# Draw every box whose score reaches the threshold
for bbox in bboxes:
    # each bbox is [left, top, right, bottom, score]
    left, top, right, bottom = (int(v) for v in bbox[:4])
    score = bbox[4]
    if score < 0.3:
        continue
    cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0))

cv2.imwrite('dino_output.png', img)
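To put a number on the shift rather than judging it by eye, something like the following could be used. It is a minimal sketch that assumes a set of reference boxes for the same image is available (for example from the PyTorch model), which is not part of the script above. `mean_box_offset` is a hypothetical helper, and matching boxes by nearest center is a simplification.

```python
import numpy as np


def mean_box_offset(ref_boxes, test_boxes):
    """Match each reference box to the test box with the nearest center
    and return the mean (dx, dy) of test centers minus reference centers.

    Both inputs are arrays of [left, top, right, bottom, ...] rows; any
    extra columns such as the score are ignored.
    """
    ref = np.asarray(ref_boxes, dtype=float)[:, :4]
    test = np.asarray(test_boxes, dtype=float)[:, :4]
    ref_c = np.stack([(ref[:, 0] + ref[:, 2]) / 2, (ref[:, 1] + ref[:, 3]) / 2], axis=1)
    test_c = np.stack([(test[:, 0] + test[:, 2]) / 2, (test[:, 1] + test[:, 3]) / 2], axis=1)
    # pairwise center distances, shape (num_ref, num_test)
    dists = np.linalg.norm(ref_c[:, None, :] - test_c[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    return (test_c[nearest] - ref_c).mean(axis=0)


if __name__ == "__main__":
    # Synthetic example: the test boxes are the reference boxes shifted 7 px right.
    ref = np.array([[10, 10, 50, 50, 0.9], [100, 100, 200, 220, 0.8]])
    shifted = ref.copy()
    shifted[:, [0, 2]] += 7
    print(mean_box_offset(ref, shifted))
```

A dx that is clearly non-zero while dy stays near zero would match the uniform horizontal offset described above.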

Environment

12/14 13:14:34 - mmengine - INFO - 

12/14 13:14:34 - mmengine - INFO - **********Environmental information**********
12/14 13:14:35 - mmengine - INFO - sys.platform: linux
12/14 13:14:35 - mmengine - INFO - Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
12/14 13:14:35 - mmengine - INFO - CUDA available: True
12/14 13:14:35 - mmengine - INFO - numpy_random_seed: 2147483648
12/14 13:14:35 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 3090
12/14 13:14:35 - mmengine - INFO - CUDA_HOME: /usr/local/cuda
12/14 13:14:35 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.6, V11.6.124
12/14 13:14:35 - mmengine - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
12/14 13:14:35 - mmengine - INFO - PyTorch: 1.13.1+cu116
12/14 13:14:35 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.6
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.3.2  (built against CUDA 11.5)
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

12/14 13:14:35 - mmengine - INFO - TorchVision: 0.14.1+cu116
12/14 13:14:35 - mmengine - INFO - OpenCV: 4.8.1
12/14 13:14:35 - mmengine - INFO - MMEngine: 0.10.1
12/14 13:14:35 - mmengine - INFO - MMCV: 2.0.1
12/14 13:14:35 - mmengine - INFO - MMCV Compiler: GCC 9.3
12/14 13:14:35 - mmengine - INFO - MMCV CUDA Compiler: 11.6
12/14 13:14:35 - mmengine - INFO - MMDeploy: 1.3.0+660af62
12/14 13:14:35 - mmengine - INFO - 

12/14 13:14:35 - mmengine - INFO - **********Backend information**********
12/14 13:14:35 - mmengine - INFO - tensorrt:    8.6.1
12/14 13:14:35 - mmengine - INFO - tensorrt custom ops: Available
12/14 13:14:35 - mmengine - INFO - ONNXRuntime: 1.8.1
12/14 13:14:35 - mmengine - INFO - ONNXRuntime-gpu:     None
12/14 13:14:35 - mmengine - INFO - ONNXRuntime custom ops:      Available
12/14 13:14:35 - mmengine - INFO - pplnn:       None
12/14 13:14:35 - mmengine - INFO - ncnn:        None
12/14 13:14:35 - mmengine - INFO - snpe:        None
12/14 13:14:35 - mmengine - INFO - openvino:    None
12/14 13:14:35 - mmengine - INFO - torchscript: 1.13.1+cu116
12/14 13:14:35 - mmengine - INFO - torchscript custom ops:      NotAvailable
12/14 13:14:35 - mmengine - INFO - rknn-toolkit:        None
12/14 13:14:35 - mmengine - INFO - rknn-toolkit2:       None
12/14 13:14:35 - mmengine - INFO - ascend:      None
12/14 13:14:35 - mmengine - INFO - coreml:      None
12/14 13:14:35 - mmengine - INFO - tvm: None
12/14 13:14:35 - mmengine - INFO - vacc:        None
12/14 13:14:35 - mmengine - INFO - 

12/14 13:14:35 - mmengine - INFO - **********Codebase information**********
12/14 13:14:35 - mmengine - INFO - mmdet:       3.0.0
12/14 13:14:35 - mmengine - INFO - mmseg:       None
12/14 13:14:35 - mmengine - INFO - mmpretrain:  None
12/14 13:14:35 - mmengine - INFO - mmocr:       None
12/14 13:14:35 - mmengine - INFO - mmagic:      None
12/14 13:14:35 - mmengine - INFO - mmdet3d:     None
12/14 13:14:35 - mmengine - INFO - mmpose:      None
12/14 13:14:35 - mmengine - INFO - mmrotate:    None
12/14 13:14:35 - mmengine - INFO - mmaction:    None
12/14 13:14:35 - mmengine - INFO - mmrazor:     None
12/14 13:14:35 - mmengine - INFO - mmyolo:      None

Error traceback

No response

open-mmlab / mmdeploy

[Bug] tensorrt output boxes have uniform horizontal offsets at certain input aspect ratios. #2602
