open-mmlab / mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.
https://mmsegmentation.readthedocs.io/en/main/
Apache License 2.0
8.26k stars 2.62k forks

Converted to TorchScript model outputs mask with wrong shape #2924

Open olegsij opened 1 year ago

olegsij commented 1 year ago

Describe the bug

The converted TorchScript model outputs masks of size (128, 128) instead of (512, 512).

Reproduction

  1. What command or script did you run?

    python tools/deployment/pytorch2torchscript.py PATH_TO_CONFIG --checkpoint PATH_TO_CHECKPOINT --output-file PATH_TO_OUT --shape 512 --verify
  2. Did you make any modifications to the code or config? Did you understand what you modified? Model config:

  model = dict(
      type='CascadeEncoderDecoder',
      data_preprocessor=dict(
          type='SegDataPreProcessor',
          mean=[123.675, 116.28, 103.53],
          std=[58.395, 57.12, 57.375],
          bgr_to_rgb=True,
          pad_val=0,
          seg_pad_val=255,
          size=(512, 512)),
      num_stages=2,
      pretrained='open-mmlab://msra/hrnetv2_w18',
      backbone=dict(
          type='HRNet',
          norm_cfg=dict(type='SyncBN', requires_grad=True),
          norm_eval=False,
          extra=dict(
              stage1=dict(
                  num_modules=1,
                  num_branches=1,
                  block='BOTTLENECK',
                  num_blocks=(4, ),
                  num_channels=(64, )),
              stage2=dict(
                  num_modules=1,
                  num_branches=2,
                  block='BASIC',
                  num_blocks=(4, 4),
                  num_channels=(18, 36)),
              stage3=dict(
                  num_modules=4,
                  num_branches=3,
                  block='BASIC',
                  num_blocks=(4, 4, 4),
                  num_channels=(18, 36, 72)),
              stage4=dict(
                  num_modules=3,
                  num_branches=4,
                  block='BASIC',
                  num_blocks=(4, 4, 4, 4),
                  num_channels=(18, 36, 72, 144)))),
      decode_head=[
          dict(
              type='FCNHead',
              in_channels=[18, 36, 72, 144],
              channels=270,
              in_index=(0, 1, 2, 3),
              input_transform='resize_concat',
              kernel_size=1,
              num_convs=1,
              concat_input=False,
              dropout_ratio=-1,
              num_classes=13,
              norm_cfg=dict(type='SyncBN', requires_grad=True),
              align_corners=False,
              loss_decode=dict(
                  type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
          dict(
              type='OCRHead',
              in_channels=[18, 36, 72, 144],
              in_index=(0, 1, 2, 3),
              input_transform='resize_concat',
              channels=512,
              ocr_channels=256,
              dropout_ratio=-1,
              num_classes=13,
              norm_cfg=dict(type='SyncBN', requires_grad=True),
              align_corners=False,
              loss_decode=dict(
                  type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0))
      ],
      train_cfg=dict(),
      test_cfg=None)
  3. What dataset did you use? I trained on a custom dataset. (In mmseg < 1.0 everything, including converting to TorchScript, worked fine.)
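For what it's worth, 128 is exactly 512 / 4, which matches HRNet's stride-4 output resolution, so the exported graph may be returning the raw decode-head logits without the final resize back to the input size. Below is a minimal workaround sketch, not the mmseg API: `ResizeWrapper` and the `Stub` stand-in are illustrative names, and the real fix would wrap the actual loaded model before tracing.

```python
import torch
import torch.nn.functional as F

class ResizeWrapper(torch.nn.Module):
    """Wraps a model so its logits are upsampled to the input's spatial size."""
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        out = self.model(x)
        # Resize the stride-4 logits back to the input resolution.
        return F.interpolate(out, size=(x.shape[2], x.shape[3]),
                             mode='bilinear', align_corners=False)

# Stand-in model that mimics the observed behavior (512x512 in, 128x128 out).
class Stub(torch.nn.Module):
    def forward(self, x):
        return x[:, :, ::4, ::4]

example = torch.zeros(1, 3, 512, 512)
wrapped = torch.jit.trace(ResizeWrapper(Stub()), example)
print(wrapped(example).shape)  # torch.Size([1, 3, 512, 512])
```

With the actual model, the same wrapper would be traced in place of `Stub` before saving the TorchScript file; note that `x.shape[2]`/`x.shape[3]` are baked in as constants during tracing, which is fine for a fixed `--shape 512` export.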

Environment

sys.platform: linux
Python: 3.8.10 (default, Mar 13 2023, 10:26:41) [GCC 9.4.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce GTX 1650
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.3, V11.3.58
GCC: x86_64-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.10.0+cu113
PyTorch compiling details: PyTorch built with:

TorchVision: 0.11.0+cu113
OpenCV: 4.7.0
MMEngine: 0.7.2
MMSegmentation: 1.0.0+098c306

tholzmann commented 1 year ago

I have the same issue. Did you find a solution to the problem?

ymutairi commented 6 months ago

I have the same issue. Have you found a solution to this?