open-mmlab / mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.
https://mmsegmentation.readthedocs.io/en/main/
Apache License 2.0
7.69k stars 2.53k forks source link

Weight of mmpretrain used for backbone is error #3665

Open Winnie202 opened 1 month ago

Winnie202 commented 1 month ago

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. The bug has not been fixed in the latest version.

Describe the bug I trained MAE using mmpretrain and then loaded the weights as backbone and got an error 2024-05-10 13-41-25屏幕截图

Reproduction

  1. What command or script did you run? python tools/train.py configs/mae/mae-base_upernet_8xb2-amp-160k.py

  2. Did you make any modifications on the code or config? Did you understand what you have modified? backbone=dict( _delete=True,
    type='mmpretrain.MAE', arch='base',
    patch_size=16, in_chans=3, embed_dim=768, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4., init_values=1.0, drop_path_rate=0.1, out_indices=[3, 5, 7, 11], #修改 out_indices init_cfg=dict( type='Pretrained', checkpoint=pretrained, prefix='backbone.')),

  3. What dataset did you use? ADE20K Environment

  4. Please run python mmseg/utils/collect_env.py to collect necessary environment information and paste it here. sys.platform: linux Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0] CUDA available: True numpy_random_seed: 2147483648 GPU 0: NVIDIA TITAN X (Pascal) CUDA_HOME: /usr/local/cuda-10.2 NVCC: Cuda compilation tools, release 10.2, V10.2.8 GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 PyTorch: 1.12.1+cu102 PyTorch compiling details: PyTorch built with:

    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • LAPACK is enabled (usually provided by MKL)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • CUDA Runtime 10.2
    • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
    • CuDNN 7.6.5
    • Magma 2.5.2
    • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.13.1+cu102 OpenCV: 4.8.1 MMEngine: 0.8.4 MMSegmentation: 1.0.0+e64548f

  1. You may add addition that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback

If applicable, paste the error trackback here.

A placeholder for trackback.

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!