LARC-CMU-SMU / FoodSeg103-Benchmark-v1

MM'21 Main-Track paper
Apache License 2.0

TypeError: __init__() got an unexpected keyword argument 'model_name' when trying to train #7

Open kanesoban opened 3 years ago

kanesoban commented 3 years ago

Dear Foodseg Teams,

Thank you for providing the resources for this great food segmentation tool. I am trying out training on my system, but I ran into a problem and would like to ask for assistance.

This is the exact command I am using:

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port=12901 tools/train.py --config configs/foodnet/SETR_Naive_768x768_80k_base.py --work-dir checkpoints_dir/SETR_Naive --launcher pytorch

This is what I get:

/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmseg/models/builder.py:42: UserWarning: train_cfg and test_cfg is deprecated, please specify them in model
  'please specify them in model', UserWarning)
Traceback (most recent call last):
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'model_name'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmseg/models/segmentors/encoder_decoder.py", line 35, in __init__
    self.backbone = builder.build_backbone(backbone)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmseg/models/builder.py", line 19, in build_backbone
    return BACKBONES.build(cfg)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 210, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 26, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: VisionTransformer: __init__() got an unexpected keyword argument 'model_name'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/train.py", line 167, in <module>
    main()
  File "tools/train.py", line 136, in main
    test_cfg=cfg.get('test_cfg'))
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmseg/models/builder.py", line 48, in build_segmentor
    cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 210, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 26, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: EncoderDecoder: VisionTransformer: __init__() got an unexpected keyword argument 'model_name'
Traceback (most recent call last):
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/home/cszsolnai/anaconda3/envs/open-mmlab2/bin/python', '-u', 'tools/train.py', '--local_rank=0', '--config', 'configs/foodnet/SETR_Naive_768x768_80k_base.py', '--work-dir', 'checkpoints_dir/SETR_Naive', '--launcher', 'pytorch']' returned non-zero exit status 1.

It looks like some of the packages may have incompatible versions, but I am not sure.
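For readers unfamiliar with the builder pattern in the traceback, here is a minimal sketch of how this kind of error can arise: a config dict is unpacked into a constructor that has no matching parameter, and the builder re-raises the error with the class name prepended. This is a simplified stand-in, not the mmcv or mmseg source; the class body, the `REGISTRY` dict, the `try_build` helper, and the `"vit_base"` value are all hypothetical.

```python
# Simplified stand-in for the registry pattern visible in the traceback.
# Everything here is hypothetical and independent of mmcv/mmseg.


class VisionTransformer:
    """Stand-in backbone whose constructor does not accept `model_name`."""

    def __init__(self, img_size=768, embed_dim=1024):
        self.img_size = img_size
        self.embed_dim = embed_dim


# Maps the `type` string in a config to the class to instantiate.
REGISTRY = {"VisionTransformer": VisionTransformer}


def build_from_cfg(cfg):
    """Look up cfg['type'] and pass the remaining keys as keyword arguments."""
    args = dict(cfg)
    obj_cls = REGISTRY[args.pop("type")]
    try:
        return obj_cls(**args)
    except Exception as e:
        # Re-raise with the class name prepended, as in the traceback.
        raise type(e)(f"{obj_cls.__name__}: {e}")


def try_build(cfg):
    """Return the error message of a failed build, or None on success."""
    try:
        build_from_cfg(cfg)
    except TypeError as e:
        return str(e)
    return None


# A config written for a constructor that takes `model_name`,
# built against a class that does not.
backbone_cfg = dict(type="VisionTransformer", model_name="vit_base", img_size=768)
message = try_build(backbone_cfg)
print(message)
```

The point of the sketch is that the config and the class being instantiated disagree about the constructor signature, which is consistent with a version mismatch between the config files and the installed packages.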

Here are details of my platform:

Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0: NVIDIA GeForce GTX 1080
CUDA_HOME: /usr/local/cuda-10.1
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.6.0
PyTorch compiling details: PyTorch built with:

Package versions:

mmcv-full=1.3.10
cudatoolkit=10.1

XiongweiWu commented 3 years ago

@kanesoban Sorry for the late reply. The mmcv-full version I use is 1.2.6, and 1.3.10 also raises an error on my side. This is because the official mmcv repo has been updated since then; I will update the installation instructions. By the way, can you check whether the software installed successfully by running:

from mmseg.apis import inference_segmentor, init_segmentor

and that all dependencies in requirements.txt are installed? Since your example command runs without problems on my side, I suspect the package versions are the cause. The versions I use are:

pytorch: 1.6.0
mmcv-full: 1.2.6
cudatoolkit: 10.2
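To compare an environment against the versions listed above, something like the following sketch may help. It assumes the packages were installed so that their metadata is visible to `importlib.metadata` (the pip distribution name for PyTorch is `torch`), and it skips `cudatoolkit`, which is a conda package and is not expected to show up this way. The `EXPECTED` dict and the helper names are illustrative, not part of the repository.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Python 3.7 needs the `importlib_metadata` backport package.
    import importlib_metadata as metadata

# Versions reported as working in this thread (assumed pip names).
EXPECTED = {"torch": "1.6.0", "mmcv-full": "1.2.6"}


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each mismatching package to a (wanted, found) tuple."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Ignore local build suffixes such as "+cu101" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    for package, (wanted, found) in problems.items():
        print(f"{package}: expected {wanted}, found {found}")
    if not problems:
        print("All checked packages match the expected versions.")
```

The `lookup` parameter exists only so the comparison can be exercised without touching the real environment, for example `find_mismatches({"mmcv-full": "1.2.6"}, lookup=lambda name: "1.3.10")`.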