arthurdouillard / CVPR2021_PLOP

Official code of CVPR 2021's PLOP: Learning without Forgetting for Continual Semantic Segmentation
https://arxiv.org/abs/2011.11390
MIT License
140 stars, 23 forks

Hello, does this model support training on a single GPU #16

Closed Before-dawn-1 closed 2 years ago

Before-dawn-1 commented 2 years ago

Describe the bug
ERROR:torch.distributed.elastic.multiprocessing.api:failed

To Reproduce

Dataset: VOC2012
Setting: ...
Command used or script used: I tried two methods:

  1. python -m torch.distributed.launch --nproc_per_node=1 run.py --data_root /home/before_dawn/Code/Dataset/VOCtrainval_11-May-2012/VOCdevkit/VOC2012 --batch_size 12 --dataset voc --name PLOP --task 15-5s --overlap --step 1 --lr 0.001 --epochs 30 --method FT --pod local --pod_factor 0.01 --pod_logits --pseudo entropy --threshold 0.001 --classif_adaptive_factor --init_balanced --pod_options "{\"switch\": {\"after\": {\"extra_channels\": \"sum\", \"factor\": 0.0005, \"type\": \"local\"}}}"
  2. python run.py --data_root /home/before_dawn/Code/Dataset/VOCtrainval_11-May-2012/VOCdevkit/VOC2012 --batch_size 12 --dataset voc --name PLOP --task 15-5s --overlap --step 1 --lr 0.001 --epochs 30 --method FT --pod local --pod_factor 0.01 --pod_logits --pseudo entropy --threshold 0.001 --classif_adaptive_factor --init_balanced --pod_options "{\"switch\": {\"after\": {\"extra_channels\": \"sum\", \"factor\": 0.0005, \"type\": \"local\"}}}"

Additional context
Does this model support training on a single GPU?

arthurdouillard commented 2 years ago

It does work on a single GPU. Look at one of the provided scripts, such as https://github.com/arthurdouillard/CVPR2021_PLOP/blob/main/scripts/voc/plop_15-1.sh, and set GPU=0 and NB_GPU=1.
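
For readers unsure how those two settings reach the launcher, here is a minimal sketch. It assumes (this thread does not confirm it) that the script passes `GPU` to the `CUDA_VISIBLE_DEVICES` environment variable and `NB_GPU` to `--nproc_per_node`. The helper `build_launch_command` is hypothetical and not part of the PLOP repository; it only assembles a command of the same shape as the first one quoted in the issue and prints it without starting any training.

```python
import os
import shlex


def build_launch_command(gpu_ids, run_args):
    """Return (env, argv) for launching run.py on the given GPU ids.

    Hypothetical helper for illustration; not part of the PLOP repository.
    """
    # Plays the role of GPU in the script (assumed to feed CUDA_VISIBLE_DEVICES).
    gpu = ",".join(str(g) for g in gpu_ids)
    # Plays the role of NB_GPU (assumed to feed --nproc_per_node).
    nb_gpu = len(gpu_ids)

    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    argv = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={nb_gpu}",
        "run.py",
        *run_args,
    ]
    return env, argv


if __name__ == "__main__":
    # Single-GPU case: GPU=0 and NB_GPU=1.
    env, argv = build_launch_command(
        gpu_ids=[0],
        run_args=["--dataset", "voc", "--task", "15-5s", "--step", "1"],
    )
    # Dry run: print the command instead of executing it.
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], shlex.join(argv))
```

Keeping the process count equal to the number of visible devices is the point of the sketch: with one GPU listed, exactly one worker process is requested.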