Exemplar-free class-incremental learning (CIL) poses several challenges since it prohibits the rehearsal of data from previous tasks and thus suffers from catastrophic forgetting. Recent approaches to incrementally learning the classifier by freezing the feature extractor after the first task have gained much attention. In this paper, we explore prototypical networks for CIL, which generate new class prototypes using the frozen feature extractor and classify the features based on the Euclidean distance to the prototypes. In an analysis of the feature distributions of classes, we show that classification based on Euclidean metrics is successful for jointly trained features. However, when learning from non-stationary data, we observe that the Euclidean metric is suboptimal and that feature distributions are heterogeneous. To address this challenge, we revisit the anisotropic Mahalanobis distance for CIL. In addition, we empirically show that modeling the feature covariance relations is better than previous attempts at sampling features from normal distributions and training a linear classifier. Unlike existing methods, our approach generalizes to both many- and few-shot CIL settings, as well as to domain-incremental settings. Interestingly, without updating the backbone network, our method obtains state-of-the-art results on several standard continual learning benchmarks.
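As a minimal sketch of the idea (illustrative code only, not the repository implementation), classification with the frozen feature extractor reduces to comparing features against class prototypes; FeCAM replaces the Euclidean metric of the nearest-class-mean (NCM) classifier with a per-class Mahalanobis distance:

import torch

def ncm_predict(features, prototypes):
    # Nearest-class-mean: assign each feature to the closest prototype (Euclidean).
    # features: (N, D), prototypes: (C, D)
    return torch.cdist(features, prototypes).argmin(dim=1)

def mahalanobis_predict(features, prototypes, inv_covs):
    # FeCAM-style: the distance to each prototype is weighted by that class's
    # (regularized) inverse covariance matrix. inv_covs: (C, D, D)
    diffs = features.unsqueeze(1) - prototypes.unsqueeze(0)   # (N, C, D)
    d2 = torch.einsum('ncd,cde,nce->nc', diffs, inv_covs, diffs)
    return d2.argmin(dim=1)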
@inproceedings{goswami2023fecam,
title={FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning},
author={Dipam Goswami and Yuyang Liu and Bartłomiej Twardowski and Joost van de Weijer},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2023}
}
Refer to fecam.py for the FeCAM classifier code and fecam.json for setting the configurations. Refer to update_fecam.py for utils with additional settings to explore using the FeCAM classifier with a memory buffer, as well as the oracle setting (an upper bound in which the mean and covariance matrix are computed from all old data seen so far).
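A compact sketch of the oracle upper bound mentioned above (hypothetical function names, not the update_fecam.py API): per-class means and covariance matrices are re-estimated from the features of all tasks seen so far, rather than only from the task in which a class was first learned.

import torch

seen_features, seen_labels = [], []   # grows as new tasks arrive

def update_oracle_statistics(task_features, task_labels):
    seen_features.append(task_features)
    seen_labels.append(task_labels)
    feats, labels = torch.cat(seen_features), torch.cat(seen_labels)
    return {c: (feats[labels == c].mean(dim=0),       # class prototype
                torch.cov(feats[labels == c].T))      # class covariance (D, D)
            for c in labels.unique().tolist()}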
The framework for the many-shot CIL setting is taken from PyCIL.
We performed experiments on CIFAR100, ImageNet100, and TinyImageNet. When training on CIFAR100, this framework will automatically download it. When training on ImageNet100 or TinyImageNet, you should specify the folder of your dataset in utils/data.py.
def download_data(self):
    train_dir = '[DATA-PATH]/train/'
    test_dir = '[DATA-PATH]/val/'
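For example, if ImageNet100 has been extracted to /data/imagenet100 (a hypothetical path) with the standard train/ and val/ subfolders, the paths above become:

    train_dir = '/data/imagenet100/train/'
    test_dir = '/data/imagenet100/val/'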
To download the ImageNet-Subset dataset: Link
Run the following command for FeCAM:
python main.py --config=exps/FeCAM_{dataset}.json
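For example, to train on CIFAR100 (assuming the corresponding config file is named exps/FeCAM_cifar100.json):

python main.py --config=exps/FeCAM_cifar100.json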
ResNet18 is used for all the experiments. Other algorithm-specific hyperparameters can be modified in the corresponding json files. There are options to use the NCM classifier instead of FeCAM.
The trained weights for the first task used in our experiments can be found here.
Download the ImageNet-R and CoRe50 datasets, then run:
python FeCAM_vit_{dataset}.py
python NCM_vit_{dataset}.py
FeCAM can be used in combination with different few-shot learning approaches. In our paper, we use FeCAM with two recent works, ALICE and FACT.
The code can be used as a plug-in in different codebases by adding two components: the FeCAM classifier from models/base.py and a utils function that performs the transformations and computes the covariance matrices, as in utils/maha_utils.py.
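A rough sketch of such a utils function (assumed transformation and hyperparameter choices, not a copy of utils/maha_utils.py): features are first transformed towards a more Gaussian-like distribution, and the per-class covariance is regularized and normalized before being inverted for the Mahalanobis distance.

import torch

def tukey_transform(features, lam=0.5):
    # Tukey's ladder-of-powers transformation; lam=0.5 is an assumed default
    return features.clamp(min=1e-8).pow(lam)

def normalized_covariance(features, shrink=1.0):
    cov = torch.cov(features.T)                        # (D, D) sample covariance
    cov = cov + shrink * torch.eye(cov.shape[0])       # shrinkage for stability
    std = cov.diagonal().sqrt()
    return cov / torch.outer(std, std)                 # correlation normalization

def class_covariances(features, labels, lam=0.5, shrink=1.0):
    # Return {class_id: normalized covariance} computed from the transformed features
    feats = tukey_transform(features, lam)
    return {c: normalized_covariance(feats[labels == c], shrink)
            for c in labels.unique().tolist()}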