grypesc / SEED

ICLR 2024 paper on Continual Learning
https://arxiv.org/abs/2401.10191
MIT License

About the avg accuracy on CIFAR-100 in Table 2 #4

Closed by yuzhang0403 3 months ago

yuzhang0403 commented 3 months ago

Hello, thank you for your amazing work and code. I ran SEED with the 10-step and 5-step settings, but I got significantly lower results than those reported in the paper.

I modified --num-tasks and --nc-first-task to run the experiments for T=6 (|C1|=50) and T=11 (|C1|=50) separately:

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl --num-tasks 6 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet32 --extra-aug fetril --momentum 0.9 --exp-name exp_50+5x10 --seed 0

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl --num-tasks 11 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet32 --extra-aug fetril --momentum 0.9 --exp-name exp_50+10x5 --seed 0

I get avg_acc = 67.2 for T=6 (|C1|=50) and avg_acc = 66.6 for T=11 (|C1|=50), which is much lower than the results in the paper. Could you please share more details about the setup you used in the paper and how I can reproduce the results?

Thanks.

grypesc commented 3 months ago

Hello, thanks for reaching out. For Table 2 we used ResNet18 instead of ResNet32. Please replace --network resnet32 with resnet18 and upscale the images (use --datasets cifar100_icarl_224). Let me know if that helps!

yuzhang0403 commented 3 months ago

Thank you so much for your reply and help! I ran the experiments as you suggested, but I still get lower results: avg_acc = 68.7 for T=6 (|C1|=50) and avg_acc = 62.8 for T=11 (|C1|=50). The commands are shown below:

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl_224 --num-tasks 6 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet18 --extra-aug fetril --momentum 0.9 --exp-name exp_cifar50+5x10 --seed 0

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl_224 --num-tasks 11 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet18 --extra-aug fetril --momentum 0.9 --exp-name exp_cifar50+10x5 --seed 0

Is there anything else that needs to be changed, like --lr or another hyperparameter? I look forward to your reply.

grypesc commented 3 months ago

Hey, in the new commit I have added the script for Table 2, called table2.sh. You need to set the weight decay to 1e-4 and ftepochs to 0. With this change I have managed to reproduce the results. I hope that resolves your problem. A big first task is tricky for SEED because the first expert is much better than the others. The top hyperparameters to search over in SEED are weight decay and tau. Good luck!
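
For example, your T=11 command above would become something like the following (I am assuming the fine-tuning epochs flag is spelled --ftepochs; please check table2.sh for the exact invocation):

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --ftepochs 0 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl_224 --num-tasks 11 --nc-first-task 50 --lr 0.05 --weight-decay 1e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet18 --extra-aug fetril --momentum 0.9 --exp-name exp_cifar50+10x5 --seed 0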

yuzhang0403 commented 3 months ago

Thank you very much for your detailed help. Best wishes to you!

yuzhang0403 commented 3 months ago

I'm sorry to bother you again. I noticed that the calculation of avg_acc in your code does not handle settings with a larger base (first) task correctly, since it is not weighted by the number of samples in each task. After amending the avg_acc calculation, the solution you gave still does not reach the results reported in the paper: I get avg_acc = 69.2 for T=6 (|C1|=50), avg_acc = 68.8 for T=11 (|C1|=50), and avg_acc = 60.9 for T=21 (|C1|=40).
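
To make the weighting I mean concrete, here is a minimal sketch (a hypothetical helper, not taken from your repo), where each task's accuracy is weighted by the number of test samples it contains:

def weighted_avg_acc(task_accs, task_sizes):
    # Weight each task's accuracy by its number of test samples, so the
    # large first task (e.g. 50 classes) counts proportionally more than
    # the smaller incremental tasks.
    total = sum(task_sizes)
    return sum(acc * n for acc, n in zip(task_accs, task_sizes)) / total

# e.g. T=6 with |C1|=50 on CIFAR-100 (100 test images per class):
# task_sizes = [5000, 1000, 1000, 1000, 1000, 1000]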

Could you please provide the full hyperparameters for these settings, or the model weights behind the results reported in the paper?

yuzhang0403 commented 3 months ago

Thanks to your help, I reproduced the results of Table 2 by adjusting the weight decay, tau, and learning rate.