Closed AfricanxAdmiral closed 4 years ago
Hi,
CIFAR-100 and ImageNet-subset are trained without a pre-trained model. However, as several papers have reported, classification on CIFAR and ImageNet is difficult to train with metric learning alone, so we train a softmax-based ResNet-32 on the first 50 classes (the first task) as a warm-up.
To compare fairly with other methods, we always use seed 1993 for CIFAR-100 and ImageNet-subset, and seed 1 for the fine-grained datasets.
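To illustrate what the seed controls, here is a minimal sketch assuming the class order is drawn as a seeded NumPy permutation (a common pattern in class-incremental repos; the actual shuffling code in this repo may differ, and `class_order` is a hypothetical helper name):

```python
import numpy as np

def class_order(seed, num_classes=100):
    """Return a fixed permutation of class indices for the given seed.

    Hypothetical helper: any seeded permutation behaves this way, so the
    same seed always reproduces the same class order.
    """
    rng = np.random.RandomState(seed)
    return rng.permutation(num_classes).tolist()

# Seed 1993 (CIFAR-100 / ImageNet-subset) vs. seed 1 (fine-grained datasets):
order_1993 = class_order(1993)
order_1 = class_order(1)
```

Under this sketch, the first 50 entries of `order_1993` would define the warm-up task, which is why a warm-up model trained under one seed does not match a run under another.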
Sorry for not including these details in the paper. Thanks for pointing them out.
Sorry for bothering again,
I've pre-trained a model on the first 50 classes with the same class order (seed 1993); it reaches an accuracy of around 72%, which is almost the same starting point as Figure 7 in your paper. But I still cannot reproduce the result.
Are there any special restrictions or tricks when training the warm-up model for CIFAR-100 and ImageNet-subset?
Thanks
That is weird. Could you try using the pre-trained model I provided to see whether you can reproduce the result?
Sure. The pre-trained model you provided did reproduce the results from the paper.
But I'm trying to reproduce the whole training process, and with a different class order. I would be very grateful if you could provide more details on how to pre-train a suitable model for this method.
Another question: how much does expanding the feature space from 64 dimensions to 512 affect the results? Is it necessary for matching the state of the art?
Thanks
Hi, sure. The network is the same as the one used for metric learning (ResNet-32 for CIFAR and ResNet-18 for ImageNet). The learning rate is 1e-3, the number of training steps is 200, and the seed is 1993.
I didn't analyze how the feature dimension affects the results, but it would be interesting to explore.
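Collecting those warm-up settings in one place, a minimal sketch; the names `WARMUP_CFG` and `set_seed` are assumptions, not identifiers from the repo, and this pins down only the numbers stated above, not the actual training script:

```python
import random

# Warm-up hyperparameters as stated above; field names are assumptions.
WARMUP_CFG = {
    "arch": "resnet32",    # ResNet-18 for ImageNet-subset
    "num_classes": 50,     # first task only
    "lr": 1e-3,
    "train_steps": 200,
    "seed": 1993,
}

def set_seed(seed):
    """Seed Python's RNG; a real run would also seed numpy and torch."""
    random.seed(seed)

set_seed(WARMUP_CFG["seed"])
```

Seeding before the warm-up matters because the class order and initialization both depend on it; with a different seed the resulting checkpoint is not interchangeable.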
Hi, yulu
In the Implementation Details of the paper, you mention that CIFAR-100 is trained with ResNet-32 without pre-training.
But line 131 of train.py appears to load a pre-trained ResNet-32 model. Could you explain how this pre-trained model was trained?
Also, running with the pre-trained model under SEED=1 (the original value) and SEED=1993 (the value in the pre-trained model's filename) gives very different results. Is this supposed to happen?
Thanks