zhangzjn / ADer

ADer (https://arxiv.org/abs/2406.03262) is an open source visual anomaly detection toolbox based on PyTorch, which supports multiple popular AD datasets and approaches.

AttributeError: 'DistributedDataParallel' object has no attribute 'destseg' #21

Closed XiaobinWu1998 closed 2 months ago

XiaobinWu1998 commented 2 months ago

There is an error when I run the following command with multiple GPUs:

python -m torch.distributed.launch --nproc_per_node=$nproc_per_node --nnodes=$nnodes --node_rank=$node_rank --master_port=$master_port --use_env run.py -c configs/benchmark/destseg/destseg_256_300e.py -m train

Traceback (most recent call last):
  File "run.py", line 31, in <module>
    main()
  File "run.py", line 26, in main
    trainer = get_trainer(cfg)
  File "/workspace/mycode/04-anomaly-detection/ADer/trainer/__init__.py", line 13, in get_trainer
    return TRAINER.get_module(cfg.trainer.name)(cfg)
  File "/workspace/mycode/04-anomaly-detection/ADer/trainer/destseg_trainer.py", line 41, in __init__
    self.optim.de_st = get_optim(cfg.optim.de_st.kwargs, self.net.destseg.student_net,
  File "/workspace/mysoftware/miniconda3/envs/ader/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1695, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'DistributedDataParallel' object has no attribute 'destseg'

So, what should I do?

zhangzjn commented 2 months ago

Current AD models typically need only a few hours of single-GPU training, so there is no strong demand for DDP, and we classify DDP as optional support. Because we adapted the models from their original repositories to the ADer framework, DDP may be only partially effective for some methods due to certain operators. In the future, we aim to support all methods comprehensively.
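
For readers who hit the same traceback: `DistributedDataParallel` wraps the original model and exposes it as `.module`, so submodules such as `destseg` are not attributes of the wrapper itself. A possible workaround, not confirmed by the maintainers, is to unwrap the model before accessing its submodules. The sketch below is a minimal illustration under stated assumptions: `unwrap_model`, `DeSTSeg`, and `Wrapper` are hypothetical names that only mirror the attribute path in the traceback, and `DataParallel` stands in for `DistributedDataParallel` because it exposes the same `.module` attribute without requiring a process group.

```python
import torch.nn as nn
from torch.nn.parallel import DataParallel, DistributedDataParallel


def unwrap_model(net: nn.Module) -> nn.Module:
    """Return the wrapped model if `net` is a DP/DDP wrapper, else `net` itself."""
    if isinstance(net, (DataParallel, DistributedDataParallel)):
        return net.module
    return net


# Hypothetical stand-ins for the ADer model; only the attribute names
# (`destseg`, `student_net`) are taken from the traceback.
class DeSTSeg(nn.Module):
    def __init__(self):
        super().__init__()
        self.student_net = nn.Linear(4, 4)


class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.destseg = DeSTSeg()


# DataParallel is used here only so the example runs in a single process.
net = DataParallel(Wrapper())

# `net.destseg` would raise AttributeError; going through the unwrapped model works.
student = unwrap_model(net).destseg.student_net
```

Assuming `self.net` in `trainer/destseg_trainer.py` is already wrapped by DDP at line 41, the analogous change there would be to read `unwrap_model(self.net).destseg.student_net` (or `self.net.module.destseg.student_net`) when building the optimizer. Other places in the trainer that access submodules directly may need the same treatment.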