arthurdouillard / CVPR2021_PLOP

Official code of CVPR 2021's PLOP: Learning without Forgetting for Continual Semantic Segmentation
https://arxiv.org/abs/2011.11390
MIT License

ade 100-10 reproduce #35

Open zhaoedf opened 2 years ago

zhaoedf commented 2 years ago

I used the script you gave and only got a final mIoU of 24.xx. Could this be related to the specific seed?

By the way, because of GPU memory limits, I used 4 GPUs with a batch of 4 on each GPU, which was equal to a total batch size of 24, and the BN is iabc_sync. I assume using 4 GPUs is not an issue?

Any idea how I can reproduce your ADE 100-10 results?

zhaoedf commented 2 years ago

Any ideas? Also, I can't reproduce methods like EWC, PI and RWalk, and I noticed that their importance weights in argparser.py seem large.

arthurdouillard commented 2 years ago

Hey,

Sorry I couldn't respond earlier, I'm busy right now writing my thesis manuscript.

  1. I'm not sure what explains your results: when I gave you the script, I got the results from the paper. This paper, https://arxiv.org/pdf/2203.05402.pdf, also seems to have reproduced my results (and even beaten them). Normally the number of GPUs shouldn't affect the results, but in some situations it seems it does... Btw, if you have 4 images per GPU and 4 GPUs, that's a batch size of 16, not 24 like mine.

  2. I haven't changed the hyperparameters of EWC/PI/RWalk from Cermelli's code. But note that I didn't include them in my ADE experiments: Cermelli's MiB evaluates ADE on three random class orders, while I used only one class order because of time limitations.
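The batch-size arithmetic in point 1 can be checked with a short sketch. It assumes plain data-parallel training with no gradient accumulation, so the total batch is simply images per GPU times number of GPUs. The function names below are hypothetical helpers for illustration and are not part of the PLOP repo.

```python
def effective_batch_size(per_gpu: int, num_gpus: int) -> int:
    """Total images per optimizer step, assuming data-parallel training
    without gradient accumulation (an assumption, not read from the repo)."""
    if per_gpu <= 0 or num_gpus <= 0:
        raise ValueError("per_gpu and num_gpus must be positive")
    return per_gpu * num_gpus


def per_gpu_batch_for(target_total: int, num_gpus: int) -> int:
    """Images each GPU must hold so the total matches target_total."""
    if num_gpus <= 0 or target_total <= 0:
        raise ValueError("target_total and num_gpus must be positive")
    if target_total % num_gpus != 0:
        raise ValueError("target_total is not divisible by num_gpus")
    return target_total // num_gpus


if __name__ == "__main__":
    # 4 images per GPU on 4 GPUs
    print(effective_batch_size(4, 4))
    # per-GPU batch needed to reach a total of 24 on 4 GPUs
    print(per_gpu_batch_for(24, 4))
```

With 4 GPUs, the first call gives 16 and the second gives 6, so matching a total batch size of 24 would need 6 images per GPU rather than 4.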

zhaoedf commented 2 years ago

  1. Sorry, it's actually a batch of 6 per GPU, so the total batch size is still 24.
  2. Oh, I meant EWC/PI/RWalk on PASCAL: I got extremely low results compared to the paper.

So, can I say that you at least used the scripts and hyperparameters in the repo to reproduce your paper's results? If so, I'll look for the problem on my side. Thanks a lot!