moukamisama / F2M


ImageNet pretrained weights and CFRP model #2

Open zhao34731 opened 2 years ago

zhao34731 commented 2 years ago

Hello author, thanks for releasing the code for your paper "Overcoming Catastrophic Forgetting in Incremental Few-Shot Learning by Finding Flat Minima", accepted at NeurIPS 2021, which approaches the FSCIL problem from a specific angle. After reading and running code for classical methods such as iCaRL, LUCIR, and CEC, I have the following questions.

  1. Pretrained weights. When training on the CUB200-2011 dataset, TOPIC and CEC build the pipeline and initialize the ResNet-18 network with the ImageNet-pretrained weights provided by the official PyTorch library. F2M instead uses a specific set of pretrained weights, "./exp/ImageNet_bases1000/CFRPModel_res18_ImageNetall{1e-2}_SGD_001/models/best_net_latest.pth", which may give a stronger starting point for the later experiments. Could you release this pretrained model?
  2. Missing code. Some code appears to be missing, such as MT_Model, CFRPModel, and the incremental_training code in F2MModel. Could you release this code?
    Thanks.
moukamisama commented 2 years ago
  1. Yes, I will upload the pre-trained models for CUB200-2011 and the scripts for pre-training on ImageNet.
  2. I renamed these models in this version, so there may be some errors. I have not had time to check the code yet (a bit busy), so please wait a few days while I go through it again. Thanks.
zhao34731 commented 2 years ago

Thank you. Another question, about "exemplars". In your paper, you state that some exemplars are selected and used for training in the next session. As far as I know, some earlier FSCIL works use only the embeddings of old classes learned in earlier sessions for the current prediction (they take the mean of the training-data embeddings extracted by the encoder and classify the current validation samples by cosine or L2 distance). In your method, however, some exemplars are saved and combined with the current training data to update the network. I think this provides strong prior knowledge for the network at each update. The saved exemplars and the current few-shot data could also form a balanced training set.
However, in the FSCIL problem setting, the network may only have access to the current training data, and the historical data may not be obtainable. I think this is a strict restriction in practical situations. So, could you provide some further experiments on exemplar selection? And how many exemplars did you use in the paper? Thanks.
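
For reference, the embedding-only approach mentioned above (class means of encoder embeddings, classified by cosine or L2 distance) can be sketched roughly as follows. This is a minimal illustration, not code from F2M, TOPIC, or CEC: the function names `class_prototypes` and `predict` are hypothetical, and the 2-D vectors are toy stand-ins for real encoder outputs.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Return {label: mean embedding} for one session's training embeddings."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(query, prototypes, metric="cosine"):
    """Assign the query embedding to the class with the closest prototype."""
    if metric == "cosine":
        return max(prototypes, key=lambda c: cosine_similarity(query, prototypes[c]))
    if metric == "l2":
        return min(prototypes, key=lambda c: l2_distance(query, prototypes[c]))
    raise ValueError(f"unknown metric: {metric}")


# Earlier session: only the class means are stored, no images are kept.
prototypes = class_prototypes([[1.0, 0.0], [0.8, 0.2]], ["old_a", "old_a"])

# Current few-shot session: add prototypes for the new class only.
prototypes.update(class_prototypes([[0.0, 1.0], [0.2, 0.8]], ["new_b", "new_b"]))

print(predict([0.9, 0.2], prototypes, metric="cosine"))  # old_a
print(predict([0.1, 0.7], prototypes, metric="l2"))      # new_b
```

In this setup nothing from an old class is retained except its mean embedding, which is the contrast with saving exemplars and replaying them during network updates.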