Closed AfricanxAdmiral closed 4 years ago
Hi,
CIFAR-100 and ImageNet-subset are trained without a pre-trained model. However, as several papers have reported, classification on CIFAR and ImageNet is difficult to train with metric learning alone, so we train a softmax-based ResNet-32 on the first 50 classes (the first task) as a warm-up.
To compare fairly with other methods, we always use seed 1993 for CIFAR-100 and ImageNet-subset, and seed 1 for the fine-grained datasets.
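To illustrate what the seed controls, here is a minimal sketch assuming the class order is drawn as a seeded NumPy permutation (a common pattern in class-incremental repos; the actual shuffling code in this repo may differ, and `class_order` is a hypothetical helper name):

```python
import numpy as np

def class_order(seed, num_classes=100):
    """Return a fixed permutation of class indices for the given seed.

    Hypothetical helper: any seeded permutation behaves this way, so the
    same seed always reproduces the same class order.
    """
    rng = np.random.RandomState(seed)
    return rng.permutation(num_classes).tolist()

# Seed 1993 (CIFAR-100 / ImageNet-subset) vs. seed 1 (fine-grained datasets):
order_1993 = class_order(1993)
order_1 = class_order(1)
```

Under this sketch, the first 50 entries of `order_1993` would define the warm-up task, which is why a warm-up model trained under one seed does not match a run under another.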
Sorry for not including these details in the paper. Thanks for pointing them out.
Sorry for bothering again,
I've pre-trained a model on the first 50 classes with the same class order (seed 1993); it reaches an accuracy of around 72%, which is almost the same starting point as Figure 7 in your paper. But I still cannot reproduce the result.
Are there any special restrictions or tricks when training the warm-up model for CIFAR-100 and ImageNet-subset?
Thanks
That is weird. Could you try using the pre-trained model I provided to see whether you can reproduce the result?
Sure. The pre-trained model you provided did reproduce the results from the paper.
But I'm trying to reproduce the whole training process, and with a different class order. I would be very grateful if you could provide more details on how to pre-train a suitable model for this method.
Another question: how much does expanding the feature space from 64 dimensions to 512 affect the results? Is it necessary for matching the state of the art?
Thanks
Hi, sure. The network is the same as the one used for metric learning (ResNet-32 for CIFAR and ResNet-18 for ImageNet). The learning rate is 1e-3, the number of training steps is 200, and the seed is 1993.
I didn't analyze how the feature dimension affects the results, but it would be interesting to explore.
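Collecting those warm-up settings in one place, a minimal sketch; the names `WARMUP_CFG` and `set_seed` are assumptions, not identifiers from the repo, and this pins down only the numbers stated above, not the actual training script:

```python
import random

# Warm-up hyperparameters as stated above; field names are assumptions.
WARMUP_CFG = {
    "arch": "resnet32",    # ResNet-18 for ImageNet-subset
    "num_classes": 50,     # first task only
    "lr": 1e-3,
    "train_steps": 200,
    "seed": 1993,
}

def set_seed(seed):
    """Seed Python's RNG; a real run would also seed numpy and torch."""
    random.seed(seed)

set_seed(WARMUP_CFG["seed"])
```

Seeding before the warm-up matters because the class order and initialization both depend on it; with a different seed the resulting checkpoint is not interchangeable.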
Hi, yulu
In the Implementation Details of the paper, you mention that CIFAR-100 is trained with ResNet-32 without pre-training.
But line 131 of train.py appears to load a pre-trained ResNet-32 model. Could you explain how this pre-trained model was trained?
Also, running with the pre-trained model under SEED=1 (the original value) and SEED=1993 (the value in the pre-trained model's filename) gives very different results. Is this supposed to happen?
Thanks