Results of iCaRL - Githubissues

qsunyuan commented 2 years ago

Hi!

Nice work on continual learning.

Recently I reproduced iCaRL, but I cannot achieve the original results. I tried many things: adjusting the learning rate, training for more epochs, using a different weight decay...

And I found your results are quite similar to mine.

Could u pls give some insight about the original results?

In addition, in BiC, their results are about 50% (top-1 test acc at the 10th task). This is very surprising. Am I missing something?

Hope to get ur reply. This has bothered me for a long time.

From iCaRL https://arxiv.org/pdf/1611.07725.pdf

From yours https://arxiv.org/pdf/2010.15277.pdf

From BiC https://arxiv.org/pdf/1905.13260.pdf

qsunyuan commented 2 years ago

This question may be similar to this link.

https://github.com/mmasana/FACIL/issues/6

mmasana commented 2 years ago

Hi @qsunyuan,

indeed, the comments from issue #6 are part of the explanation. As far as I remember, another reason is that when iCaRL came out, there was not yet much discussion about how evaluation over the different outputs for each task should be done. Make sure to check whether some of the results are given under the task-IL setting (where the task-ID is available at inference time). Also, I think in their original implementation the heads were all created from the beginning and accessed by the optimizer, which technically is information that you should not know in advance. Further, hyperparameters and training regimes were chosen to obtain the best results on that setting at the end of all tasks. Quite a few of those details quickly pile up, and I've talked with some other people who have not managed to reproduce the results in the iCaRL paper for class-IL. I would look at iCaRL more to understand the ideas that they proposed and why, but at this point in time, almost every single incremental method has been able to surpass it when using more realistic continual learning scenarios. Finally, the results we provide in the survey (the figure that you posted above) are an average of 10 runs using the CHF (Continual Hyperparameter Framework), and thus they cannot be compared directly to the other graphs, since that setting is more challenging.
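To make the task-IL vs. class-IL distinction concrete, here is a minimal sketch of how the same per-head outputs can give very different accuracies depending on whether the task-ID is used at inference time. This is not FACIL code: the function names and the toy logits are hypothetical, and it assumes one output head per task with class labels numbered globally across heads.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def task_aware_prediction(head_logits, task_id):
    """Task-IL: the task-ID is given, so only that task's head is considered."""
    offset = sum(len(head) for head in head_logits[:task_id])
    return offset + argmax(head_logits[task_id])


def task_agnostic_prediction(head_logits):
    """Class-IL: no task-ID, so all heads are concatenated and compete."""
    flat = [value for head in head_logits for value in head]
    return argmax(flat)


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy data (made up): 2 tasks with 2 classes each, global labels 0..3.
# Each entry is (per-head logits, task-ID, global label).
samples = [
    ([[2.0, 0.1], [0.3, 0.2]], 0, 0),
    ([[0.5, 1.0], [3.0, 0.2]], 0, 1),  # another head has a larger logit
    ([[0.1, 0.2], [0.4, 1.5]], 1, 3),
    ([[0.9, 0.1], [0.2, 0.8]], 1, 3),  # another head has a larger logit
]

labels = [label for _, _, label in samples]
taw_preds = [task_aware_prediction(logits, t) for logits, t, _ in samples]
tag_preds = [task_agnostic_prediction(logits) for logits, _, _ in samples]

taw_acc = accuracy(taw_preds, labels)
tag_acc = accuracy(tag_preds, labels)
print(f"task-aware: {taw_acc:.2f}, task-agnostic: {tag_acc:.2f}")
# prints "task-aware: 1.00, task-agnostic: 0.50"
```

In this toy case every sample is classified correctly within its own head, yet half of them are lost once the heads compete, which is why a number reported under one setting cannot be compared with a number reported under the other.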

Regarding BiC, I contacted the author some time ago when we released the code and he confirmed that it looked correct. If I remember correctly, we managed to reproduce their results when using the training parameters and regimes from their paper. However, when using the CHF (more realistic), their results go down a bit (as they do for all methods).

Hope this helps!

qsunyuan commented 2 years ago

Thx for ur help, it really helps a lot.

Have a good day.

mmasana / FACIL

Results of iCaRL #12