SwinTransformer / Swin-Transformer-Object-Detection

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.
https://arxiv.org/abs/2103.14030
Apache License 2.0

AttributeError: module 'torch.distributed' has no attribute '_all_gather_base' #201

Open xc012 opened 1 year ago

xc012 commented 1 year ago

Problem: Training fails and prints "apex is not installed", even though I have tried installing apex in several ways, including a direct "pip install apex" and downloading the source code and compiling it.

Command: python tools/train.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py

Error:
apex is not installed
Traceback (most recent call last):
  File "tools/train.py", line 15, in <module>
    from mmdet.apis import set_random_seed, train_detector
  File "/root/Swin-Transformer-Object-Detection-master/Swin-Transformer-Object-Detection/mmdet/apis/__init__.py", line 1, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "/root/Swin-Transformer-Object-Detection-master/Swin-Transformer-Object-Detection/mmdet/apis/inference.py", line 11, in <module>
    from mmdet.datasets import replace_ImageToTensor
  File "/root/Swin-Transformer-Object-Detection-master/Swin-Transformer-Object-Detection/mmdet/datasets/__init__.py", line 10, in <module>
    from .utils import (NumClassCheckHook, get_loading_pipeline,
  File "/root/Swin-Transformer-Object-Detection-master/Swin-Transformer-Object-Detection/mmdet/datasets/utils.py", line 9, in <module>
    from mmdet.models.dense_heads import GARPNHead, RPNHead
  File "/root/Swin-Transformer-Object-Detection-master/Swin-Transformer-Object-Detection/mmdet/models/__init__.py", line 1, in <module>
    from .backbones import *  # noqa: F401,F403
  File "/root/Swin-Transformer-Object-Detection-master/Swin-Transformer-Object-Detection/mmdet/models/backbones/__init__.py", line 13, in <module>
    from .swin_transformer import SwinTransformer
  File "/root/Swin-Transformer-Object-Detection-master/Swin-Transformer-Object-Detection/mmdet/models/backbones/swin_transformer.py", line 13, in <module>
    from timm.models.layers import DropPath, to_2tuple, trunc_normal_
  File "/root/.local/lib/python3.7/site-packages/timm/__init__.py", line 2, in <module>
    from .models import create_model, list_models, is_model, list_modules, model_entrypoint, \
  File "/root/.local/lib/python3.7/site-packages/timm/models/__init__.py", line 1, in <module>
    from .beit import *
  File "/root/.local/lib/python3.7/site-packages/timm/models/beit.py", line 49, in <module>
    from timm.data import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
  File "/root/.local/lib/python3.7/site-packages/timm/data/__init__.py", line 5, in <module>
    from .dataset import ImageDataset, IterableImageDataset, AugMixDataset
  File "/root/.local/lib/python3.7/site-packages/timm/data/dataset.py", line 12, in <module>
    from .parsers import create_parser
  File "/root/.local/lib/python3.7/site-packages/timm/data/parsers/__init__.py", line 1, in <module>
    from .parser_factory import create_parser
  File "/root/.local/lib/python3.7/site-packages/timm/data/parsers/parser_factory.py", line 3, in <module>
    from .parser_image_folder import ParserImageFolder
  File "/root/.local/lib/python3.7/site-packages/timm/data/parsers/parser_image_folder.py", line 11, in <module>
    from timm.utils.misc import natural_key
  File "/root/.local/lib/python3.7/site-packages/timm/utils/__init__.py", line 4, in <module>
    from .cuda import ApexScaler, NativeScaler
  File "/root/.local/lib/python3.7/site-packages/timm/utils/cuda.py", line 8, in <module>
    from apex import amp
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "/opt/conda/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/__init__.py", line 27, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "/opt/conda/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/transformer/__init__.py", line 4, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "/opt/conda/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/transformer/pipeline_parallel/__init__.py", line 1, in <module>
    # -*- coding: utf-8 -*-
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "/opt/conda/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/transformer/pipeline_parallel/schedules/__init__.py", line 3, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "/opt/conda/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/transformer/pipeline_parallel/schedules/fwd_bwd_no_pipelining.py", line 10, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "/opt/conda/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/transformer/pipeline_parallel/schedules/common.py", line 9, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "/opt/conda/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/transformer/pipeline_parallel/p2p_communication.py", line 25, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "/opt/conda/lib/python3.7/site-packages/apex-0.1-py3.7.egg/apex/transformer/utils.py", line 11, in <module>
AttributeError: module 'torch.distributed' has no attribute '_all_gather_base'
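A note on why the traceback passes straight through `from apex import amp` in timm/utils/cuda.py: if that import is wrapped in a guard that only catches ImportError (an assumption here, since the traceback does not show the guard), an AttributeError raised while apex initialises would not be caught and would abort the whole import chain. The sketch below illustrates the difference; `import_optional` is a hypothetical helper, not part of timm, apex, or mmdet.

```python
import importlib


def import_optional(name, catch=(ImportError,)):
    """Import `name`; return None if importing raises one of `catch`.

    Hypothetical helper for illustration only. Any exception type that is
    not listed in `catch` still propagates to the caller.
    """
    try:
        return importlib.import_module(name)
    except catch as exc:
        print(f"optional dependency {name!r} unavailable: "
              f"{type(exc).__name__}: {exc}")
        return None


if __name__ == "__main__":
    # Treats a missing apex and an apex that fails with AttributeError
    # during import (as in the traceback above) the same way.
    apex = import_optional("apex", catch=(ImportError, AttributeError))
    print("apex usable:", apex is not None)
```

Widening the guard like this only hides the broken apex install; it does not make apex work with the installed torch.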

Environment:
Python 3.7.6
torch 1.7.1+cu110
torchaudio 0.7.2
torchvision 0.8.2+cu110
apex 0.1
mmcv-full 1.2.4
mmdet 2.11.0 /root/Swin-Transformer-Object-Detection-master/Swin-Transformer-Object-Detection
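The last frame of the traceback fails on `torch.distributed._all_gather_base`, which suggests that the installed apex expects a torch build that provides this attribute, while the torch 1.7.1 listed above does not. A minimal sketch for confirming this in the same environment follows; `describe_attr` is a hypothetical helper name, and only the module and attribute names are taken from the error message.

```python
import importlib


def describe_attr(module_name, attr):
    """Report whether `module_name` can be imported and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # report any import failure instead of crashing
        return f"cannot import {module_name}: {type(exc).__name__}: {exc}"
    status = "present" if hasattr(module, attr) else "MISSING"
    return f"{module_name}.{attr} {status}"


if __name__ == "__main__":
    # Names copied from the AttributeError in the report above.
    print(describe_attr("torch.distributed", "_all_gather_base"))
```

If the attribute is reported as MISSING, the apex and torch versions in this environment are likely mismatched, and reinstalling apex alone would not be expected to change the result.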

Installation method:

Apex installation method:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir ./

MMCV installation method:
pip install mmcv-full==1.2.4 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7/index.html

mmdet installation method:
git clone https://github.com/SwinTransformer/Swin-Transformer-Object-Detection.git
cd Swin-Transformer-Object-Detection
python setup.py develop

f2367976412 commented 1 year ago

I also encountered the same problem. Please let me know if you solved it.

JankinHou commented 1 year ago

I ran into the same problem too. Did you manage to solve it in the end?