fcdl94 / CoMFormer

Official implementation of "CoMFormer: Continual Learning in Semantic and Panoptic Segmentation"
https://arxiv.org/abs/2211.13999

How to optimize the CoMFormer on the base classes? #1

Closed: DongSky closed this issue 1 year ago

DongSky commented 1 year ago

Hi Fabio, thank you for publishing the training code. I have a question about it. Using the provided training scripts, I was able to run the training steps for the newly added classes. However, I am not sure which script trains the base classes. Could you provide a more detailed README covering the whole training procedure?

Best regards, Bowen

Besides, I noticed some comments about "offline training", but I think that script refers to standard training on all classes.

DongSky commented 1 year ago

Solved.

YananGu commented 1 year ago

Hi @DongSky, I have the same problem. Could you tell me your solution? Thanks.

fcdl94 commented 1 year ago

Hello, and sorry for the late reply. To train the base classes, it is enough to set CONT.TASK 0 in the config file (or in the scripts).

Please let me know whether you are able to replicate my results with it. Thank you.

YananGu commented 1 year ago

Hi @fcdl94, could you give me an example of base-class training? I trained the base classes by

but I got the following errors:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/mnt/project-ext/yy/anaconda3/envs/mask2former/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, args)
  File "/mnt/project-ext/yy/detectron2/detectron2/engine/launch.py", line 123, in _distributed_worker
    main_func(args)
  File "/mnt/project-ext/yy/CoMFormer/train_inc.py", line 721, in main
    ret = trainer.train()
  File "/mnt/project-ext/yy/CoMFormer/train_inc.py", line 219, in train
    super().train(self.start_iter, self.max_iter)
  File "/mnt/project-ext/yy/detectron2/detectron2/engine/train_loop.py", line 155, in train
    self.run_step()
  File "/mnt/project-ext/yy/CoMFormer/train_inc.py", line 230, in run_step
    self._trainer.run_step()
  File "/mnt/project-ext/yy/detectron2/detectron2/engine/train_loop.py", line 322, in run_step
    losses.backward()
AttributeError: 'int' object has no attribute 'backward'

fcdl94 commented 1 year ago

Unfortunately, you need to configure all the parameters. Here is an edited version of scripts/ade.sh:

#!/bin/bash

# Base config file and output root directory
cfg_file=configs/ade20k/semantic-segmentation/maskformer2_R101_bs16_90k.yaml
base=ade_ss

# Continual setting: 100 base classes, 50 incremental classes, overlap mode
cont_args="CONT.BASE_CLS 100 CONT.INC_CLS 50 CONT.MODE overlap SEED 42"
task=mya_100-50-ov

# Run name and method-specific options
name=MxF
meth_args="MODEL.MASK_FORMER.TEST.MASK_BG False MODEL.MASK_FORMER.PER_PIXEL False MODEL.MASK_FORMER.SOFTMASK True MODEL.MASK_FORMER.FOCAL True"

# Options shared by every step; CONT.TASK 0 selects the base-class step
comm_args="OUTPUT_DIR ${base} ${meth_args} ${cont_args}"
inc_args="CONT.TASK 0"

## Train base classes
python train_inc.py --num-gpus 4 --config-file ${cfg_file} ${comm_args} ${inc_args} NAME ${name}

## Train step 1
# Initialize from the step 0 checkpoint; train for 20k iterations at learning rate 5e-5
inc_args="CONT.TASK 1 CONT.WEIGHTS ${base}/${task}/${name}/step0/model_final.pth SOLVER.MAX_ITER 20000 SOLVER.BASE_LR 0.00005"
python train_inc.py --num-gpus 4 --config-file ${cfg_file} ${comm_args} ${inc_args} NAME ${name}_PSEUDO_T2_UKD1Rew CONT.DIST.PSEUDO True CONT.DIST.KD_WEIGHT 0.5 CONT.DIST.UKD True CONT.DIST.KD_REW True
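
Step 1 takes its initial weights from the step 0 output, so it is probably worth checking that the base-class checkpoint exists before launching it. The following is a minimal sketch of such a guard, assuming the base run writes model_final.pth to the same path that is passed to CONT.WEIGHTS above. The helper name require_checkpoint is hypothetical and not part of the repository, and the sketch only prints messages instead of starting training.

```shell
#!/bin/bash

# Hypothetical helper (not part of CoMFormer): succeed only if the
# checkpoint file exists and is non-empty.
require_checkpoint() {
    local ckpt="$1"
    if [ ! -s "${ckpt}" ]; then
        echo "Missing checkpoint: ${ckpt}" >&2
        return 1
    fi
    echo "Found checkpoint: ${ckpt}"
}

# Same values as in the script above.
base=ade_ss
task=mya_100-50-ov
name=MxF
step0_ckpt="${base}/${task}/${name}/step0/model_final.pth"

# Only proceed to step 1 when the base-class weights are in place.
if require_checkpoint "${step0_ckpt}"; then
    echo "Ready to launch step 1 with CONT.WEIGHTS ${step0_ckpt}"
else
    echo "Run the base-class step (CONT.TASK 0) first." >&2
fi
```

In a real script, the first echo in the final if block would be replaced by the step 1 python train_inc.py command.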