RuntimeError: Error(s) in loading state_dict for DualResNet: ...

Background

A faster salient object detection (SOD) model is required:

10 FPS on CPU
30 FPS on 1080Ti GPU

No public detector was found that satisfies these requirements
Recall 1904.09569 (arxiv.org)
State-of-the-art architectures for SOD follow pixel-wise prediction
In fact, most of them were originally proposed for segmentation tasks
U-Net is one of them, and it showed impressive results on segmentation tasks
PoolNet employed it as a backbone
So real-time segmentation models may be usable for SOD
Refer to #6 from swoook/ucnet (github) for more details

Issue description

The error below is raised when executing ${REPO_ROOT}/main.py:

RuntimeError: Error(s) in loading state_dict for DualResNet:

Code example

Configuration in launch.json:

        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/site-packages/torch/distributed/launch.py", //"${file}",
            "console": "integratedTerminal",
            "args": [
                "--nproc_per_node", "1",
                "${workspaceRoot}/main.py",
                "--mode", "train",
                "--cfg_path", "./experiments/duts/ddrnet23_slim_poolnet_train_scheme.yaml",
                ]
        },

Error messages and stack traces:

Traceback (most recent call last):
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.6.944021595/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 3293, in <module>
    main()
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.6.944021595/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 3286, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.6.944021595/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 2360, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.6.944021595/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 2367, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.6.944021595/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydev_imps/_pydev_execfile.py", line 25, in execfile
    exec(compile(contents + "\n", file, 'exec'), glob, loc)
  File "/data/swook/repos/chenjun2hao/ddrnet/main.py", line 115, in <module>
    main(args)
  File "/data/swook/repos/chenjun2hao/ddrnet/main.py", line 98, in main
    solver = Solver(train_loader, None, config, args)
  File "/data/swook/repos/chenjun2hao/ddrnet/sod/solver.py", line 26, in __init__
    self.build_model()
  File "/data/swook/repos/chenjun2hao/ddrnet/sod/solver.py", line 74, in build_model
    self.net.load_state_dict(model_dict, strict=False)
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DualResNet:
        size mismatch for seghead_extra.conv2.weight: copying a param with shape torch.Size([19, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 64, 1, 1]).
        size mismatch for seghead_extra.conv2.bias: copying a param with shape torch.Size([19]) from checkpoint, the shape in current model is torch.Size([1]).
        size mismatch for final_layer.conv2.weight: copying a param with shape torch.Size([19, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([1, 64, 1, 1]).
        size mismatch for final_layer.conv2.bias: copying a param with shape torch.Size([19]) from checkpoint, the shape in current model is torch.Size([1]).
Killing subprocess 53003
Traceback (most recent call last):
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.6.944021595/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.6.944021595/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.6.944021595/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/site-packages/torch/distributed/launch.py", line 340, in <module>
    main()
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/site-packages/torch/distributed/launch.py", line 326, in main
    sigkill_handler(signal.SIGTERM, None)  # not coming back
  File "/data/swook/miniconda3/envs/torch18csnet/lib/python3.8/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/data/swook/miniconda3/envs/torch18csnet/bin/python', '-u', '/data/swook/repos/chenjun2hao/ddrnet/main.py', '--local_rank=0', '--mode', 'train', '--cfg_path', './experiments/duts/ddrnet23_slim_poolnet_train_scheme.yaml']' returned non-zero exit status 1.
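
Note on the trace above: in PyTorch, `strict=False` only tolerates missing and unexpected keys. A parameter that exists in both the checkpoint and the model but differs in shape still raises this RuntimeError. Here the checkpoint heads (`seghead_extra.conv2`, `final_layer.conv2`) have 19 output channels while the current model has 1. One possible workaround, sketched below under assumptions, is to drop the mismatched entries before loading so that those layers keep their freshly initialized weights. The helper name `filter_matching_shapes` is hypothetical and not part of the repository, and the usage shown in the comments only guesses at how `build_model()` in `sod/solver.py` could call it.

```python
from collections import OrderedDict
from typing import Any, List, Mapping, Tuple


def filter_matching_shapes(
    checkpoint_state: Mapping[str, Any],
    model_state: Mapping[str, Any],
) -> Tuple["OrderedDict[str, Any]", List[str]]:
    """Keep only checkpoint entries whose name and shape both match the model.

    Works on any values exposing a ``.shape`` attribute (e.g. torch tensors).
    Returns the filtered state dict and the names of the skipped entries.
    """
    kept = OrderedDict()
    skipped = []
    for name, param in checkpoint_state.items():
        target = model_state.get(name)
        # Skip keys the model does not have and keys whose shapes differ.
        if target is not None and tuple(param.shape) == tuple(target.shape):
            kept[name] = param
        else:
            skipped.append(name)
    return kept, skipped


# Hypothetical usage inside build_model() (an assumption, not repository code):
#   kept, skipped = filter_matching_shapes(model_dict, self.net.state_dict())
#   self.net.load_state_dict(kept, strict=False)
#   print("Skipped, left at initial values:", skipped)
```

With the checkpoint from the trace, the four `conv2` weight and bias entries listed in the error would be skipped, so both prediction heads would have to be trained from scratch for the single-channel SOD output.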

System Info

PyTorch version: 1.8.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 16.04.3 LTS (x86_64)
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
Clang version: Could not collect
CMake version: version 3.5.1

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti

Nvidia driver version: 440.33.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] torch==1.8.1
[pip3] torchvision==0.9.1
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               10.2.89              hfd86e86_1  
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] mkl                       2021.2.0           h06a4308_296  
[conda] mkl-service               2.3.0            py38h27cfd23_1  
[conda] mkl_fft                   1.3.0            py38h42c9631_2  
[conda] mkl_random                1.2.1            py38ha9443f7_2  
[conda] numpy                     1.20.2           py38h2d18471_0  
[conda] numpy-base                1.20.2           py38hfae3a4d_0  
[conda] pytorch                   1.8.1           py3.8_cuda10.2_cudnn7.6.5_0    pytorch
[conda] torchvision               0.9.1                py38_cu102    pytorch

swoook / ddrnet

RuntimeError: Error(s) in loading state_dict for DualResNet: ... #6
