GMvandeVen / class-incremental-learning

PyTorch implementation of a VAE-based generative classifier, as well as other class-incremental learning methods that do not store data (DGR, BI-R, EWC, SI, CWR, CWR+, AR1, the "labels trick", SLDA).
MIT License
71 stars, 14 forks

Core50 result #1

Open jiaolifengmi opened 1 year ago

jiaolifengmi commented 1 year ago

I ran the following command:

compare_all.py --experiment=CORe50 --n-seeds=10 --seed=11 --single-epochs --batch=1 --fc-layers=2 --z-dim=200 --fc-units=1024 --lr=0.0001 --c=10 --lambda=10 --omega-max=0.1 --ar1-c=1. --dg-prop=0. --bir-c=0.01 --si-dg-prop=0.6

The result I get on the CORe50 dataset with the code provided in this repository is inconsistent with the one reported in the article (BI-R, Table 2).

GMvandeVen commented 1 year ago

Hi jiaolifengmi, thank you for the feedback. Could you provide a bit more detail? Then I’ll look into it. For example, what results do you get when you run this command?

jiaolifengmi commented 1 year ago

I ran the following command:

compare_all.py --experiment=CORe50 --n-seeds=10 --seed=11 --single-epochs --batch=1 --fc-layers=2 --z-dim=200 --fc-units=1024 --lr=0.0001 --c=10 --lambda=10 --omega-max=0.1 --ar1-c=1. --dg-prop=0. --bir-c=0.01 --si-dg-prop=0.6

The result for BI-R is not 60.40 (±1.04), as reported in Table 2 of the article. My results over all seeds are [0.48647837 0.56542501 0.50891047 0.55057535 0.52782375 0.53963085 0.46688718 0.56751322 0.438259 0.52388038], with an average of 0.5175.

jiaolifengmi commented 1 year ago

Hi GMvandeVen, I'm sorry to bother you again, but I still want to ask what causes the inconsistent results on the CORe50 dataset. Is there a problem with how I ran it?

GMvandeVen commented 1 year ago

Hi jiaolifengmi, I'm sorry for the slow reply. I still need to look at this in more detail. I already checked whether perhaps I made a typo in the results table, but that doesn't seem to be the case. I'll let you know once I've figured it out.

jiaolifengmi commented 1 year ago

Hi GMvandeVen, thank you again for your reply. I look forward to your results.

GMvandeVen commented 1 year ago

EDIT: the issue described in the answer below is not what caused these inconsistencies. Please see a later answer (https://github.com/GMvandeVen/class-incremental-learning/issues/1#issuecomment-1324852995) for an explanation and fix of this issue.


Hi jiaolifengmi, sorry it took a while to figure it out, but I expect that the difference between your results for BI-R and the results reported in the paper is due to the use of a different version of a pre-trained ResNet18 to extract the CORe50 features. The pre-trained ResNet18 is selected in this line of code: https://github.com/GMvandeVen/class-incremental-learning/blob/21dd41d31dea1dfafb1e8d90d7f0a1be6b1c6e66/preprocess_core50.py#L217

With the pre-trained ResNet18 from the version of torchvision that I used for the experiments reported in the paper, I consistently get results similar to those reported in the paper. However, if I use the pre-trained ResNet18 from the latest version of torchvision, I indeed get results similar to the ones you mention. I have noticed before that the performance of generative replay of latent features is quite sensitive to the feature extractor that is used (presumably because it is easier to learn a generative model of some features than of others), although I am a bit surprised that the difference can be this large, especially as both models are versions of ResNet18 pre-trained on ImageNet. At least this variability in the performance of BI-R does not change the interpretation of the experiments reported in the paper.
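(As an aside: with recent versions of torchvision, the pre-trained weights can be pinned explicitly instead of relying on whatever `pretrained=True` happens to resolve to in the installed version. A minimal sketch of that idea, assuming torchvision >= 0.13; this is not the exact code from preprocess_core50.py:)

```python
# Minimal sketch: pin the ImageNet pre-trained ResNet18 used for feature
# extraction, so the extracted features do not silently change between
# torchvision versions. Assumes torchvision >= 0.13 (the `weights` API);
# older versions use `resnet18(pretrained=True)` instead.
import torch
from torchvision.models import resnet18, ResNet18_Weights

weights = ResNet18_Weights.IMAGENET1K_V1
model = resnet18(weights=weights)
model.eval()  # feature extraction only

# Drop the final classification layer to keep the 512-d pooled features.
feature_extractor = torch.nn.Sequential(*list(model.children())[:-1])

# Use the preprocessing transforms that belong to these weights.
preprocess = weights.transforms()

with torch.no_grad():
    img = torch.rand(3, 128, 128)  # stand-in for one CORe50 frame (C, H, W)
    batch = preprocess(img).unsqueeze(0)  # resize + normalize per the weights
    features = feature_extractor(batch).flatten(1)
print(features.shape)  # torch.Size([1, 512])
```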

Hope this helps!

jiaolifengmi commented 1 year ago

Hi GMvandeVen, thank you again for your reply, and sorry to bother you again. Have you noticed the performance of BI-R on the CIFAR-100 dataset? I ran it with the provided code, but I didn't get the results reported in the paper, even though the CIFAR-100 features are extracted with the provided pre-trained model. In addition, are two parameters in this command wrong, or are their settings reversed? (--bir-c / --si-dg-prop)

I ran the following command:

compare_all.py --experiment=CIFAR100 --pre-convE --hidden --iters=5000 --n-seeds=10 --seed=11 --c=1. --lambda=100. --omega-max=0.01 --ar1-c=100 --dg-prop=0.7 --bir-c=0.6 --si-dg-prop=100000000

The result for BI-R is not 21.51 (±0.25), as reported in Table 2 of the article. My results over all seeds are [0.1342 0.172 0.1502 0.1511 0.1454 0.1565 0.1451 0.128 0.1045 0.0892], with an average of 0.13765.

GMvandeVen commented 1 year ago

Yes, you are right, that is a mistake: the values of --bir-c and --si-dg-prop should be reversed. Sorry about that! I just changed it in the code; the corrected command is shown below. Does this fix it?
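For reference, with the two values swapped the command above becomes (everything else unchanged):

compare_all.py --experiment=CIFAR100 --pre-convE --hidden --iters=5000 --n-seeds=10 --seed=11 --c=1. --lambda=100. --omega-max=0.01 --ar1-c=100 --dg-prop=0.7 --bir-c=100000000 --si-dg-prop=0.6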

jiaolifengmi commented 1 year ago

However, these two parameters do not affect the BI-R results on CIFAR-100; they only affect the BI-R+SI results. Even with the two values swapped, the results of BI-R+SI on CIFAR-100 are not consistent with those in the paper. Is this also because of the pre-trained model?

GMvandeVen commented 1 year ago

Sorry, I didn't read your entire question correctly. Hmm, no, if you use the provided pre-trained convolutional layers, then those should be the same as the ones I used, so that can't explain such a difference. I'm starting to think that perhaps there is a mistake in the code for BI-R in this repository; maybe it got introduced when I cleaned up the code. I will try to investigate. Sorry about these issues!

GMvandeVen commented 1 year ago

In the meantime, for BI-R you could also use this repository: https://github.com/GMvandeVen/brain-inspired-replay. I realize it might be inconvenient, as there are some differences between the two repositories, but the code for BI-R there should be working correctly. (If not, please let me know!)

jiaolifengmi commented 1 year ago

I'm sorry, maybe I didn't make it clear. The problem is that when I run BI-R with the provided code, the results are inconsistent with those in the paper on both CORe50 and CIFAR-100. For the inconsistency on CORe50, your explanation was a different version of the pre-trained model. But since the CIFAR-100 features are extracted with the provided model, it is strange that there I also get results inconsistent with the paper. I hope you can verify the results of the BI-R algorithm on these two datasets again. Thank you again!

GMvandeVen commented 1 year ago

Based on the description of your results, I now suspect there is a mistake in the code for BI-R in this repository. Such a mistake might also explain the difference on the CORe50 dataset (it would indeed be surprising if a different version of the ImageNet pre-trained model made such a large difference). Using my own, uncleaned version of this code I get results similar to those reported in the paper, so perhaps a mistake got introduced when cleaning up the code. I will try to investigate.

The same mistake is therefore unlikely to be present in the implementation of BI-R in this repository: https://github.com/GMvandeVen/brain-inspired-replay, especially as that repository has already been more thoroughly used and tested by others.

GMvandeVen commented 1 year ago

Hi jiaolifengmi, my apologies that it took so long, but I have finally found the error that caused the replay methods to run incorrectly when using the compare_all.py script. The error was that in this script, all replay methods were erroneously combined with the method "CWR+", because the args.cwr_plus flag was never turned off. I have now fixed this by adding this line: https://github.com/GMvandeVen/class-incremental-learning/blob/cc51706a58ea8c671054e68fb4f0d08173603600/compare_all.py#L157
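For reference, a minimal sketch of what went wrong and what the fix does, using a stand-in args namespace (a hypothetical reconstruction; the actual one-line fix is at the link above):

```python
# Minimal sketch of the bug and its fix (hypothetical reconstruction;
# the real logic lives in compare_all.py, which runs the methods in turn).
from argparse import Namespace

args = Namespace(cwr_plus=False)

# Bug: running the CWR+ condition turned this flag on...
args.cwr_plus = True

# ...and it was never turned off again, so every replay method that ran
# afterwards was silently combined with CWR+.

# Fix: explicitly reset the flag before the replay methods are run.
args.cwr_plus = False
```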

This should fix the inconsistencies you encountered. I did some quick checks and it seems to work fine for me now. Please let me know if you still encounter any issues.

Many thanks for raising this issue!

jiaolifengmi commented 1 year ago

Hi GMvandeVen, sorry to bother you again. I ran the corrected code. On the CIFAR-100 dataset, the BI-R result is now consistent with the one in the paper, but the BI-R+SI result is not: adding SI does not improve performance, and the result stays at the BI-R level. The results over all seeds are [0.2226 0.2664 0.2435 0.2601 0.2198 0.2104 0.1694 0.1983 0.1963 0.204], with an average of 0.21908.


GMvandeVen commented 1 year ago

Hi jiaolifengmi, thank you for pointing that out. You are right that there still seems to be an issue with the BI-R and BI-R+SI results if you run this script. I really should have tested this script better, my apologies. I have tried to find what is wrong, but so far I haven't been able to; I must have changed something when cleaning up the code for this repository. If you instead use the code in the brain-inspired-replay repository, you can get results for BI-R and BI-R+SI on the class-incremental version of Split CIFAR-100 that are consistent with the results reported in the paper (if you use the provided pre-trained convolutional layers, which are the same ones as in this repository). You can do this with the following command: ./compare_CIFAR100.py --scenario=class --seed=11 --n-seeds=10

I will keep trying to find out what is wrong with the implementation of BI-R in this repository; I will let you know if I find it. For now, I added a note to the README to explain this issue. Sorry for the inconvenience!