Open Akhil-Gurram opened 11 months ago
Hello @Akhil-Gurram,
Pay attention to the last fc layer of the model. If the current model has such a layer at its end, its size before the checkpoint weights are loaded ([512, 85742]) will differ from the size of the fc layer in the checkpoint ([512, 205990]). That is expected, since I assume you are using a different dataset with 85742 classes.
Skipping the weights of this fc layer when loading the parameters from the checkpoint would therefore probably help.
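A rough sketch of that idea, in case it helps: keep only the checkpoint entries whose shape matches the current model and skip the rest. The key names `backbone.weight` and `head.kernel` are made up for illustration, and the stand-in objects only carry a `.shape` so the snippet runs without PyTorch; I have not checked this against the actual AdaFace training script.

```python
from types import SimpleNamespace


def drop_mismatched(checkpoint_state, model_state):
    """Split checkpoint entries into loadable ones and ones to skip.

    An entry is kept only if the model has a parameter with the same
    name and the same shape; everything else is reported as skipped.
    """
    kept, skipped = {}, []
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(tensor.shape):
            kept[name] = tensor
        else:
            skipped.append(name)
    return kept, skipped


# Stand-ins for tensors: only .shape matters here.
# The key names are hypothetical, not taken from the real checkpoint.
checkpoint_state = {
    "backbone.weight": SimpleNamespace(shape=(512, 512)),
    "head.kernel": SimpleNamespace(shape=(512, 205990)),
}
model_state = {
    "backbone.weight": SimpleNamespace(shape=(512, 512)),
    "head.kernel": SimpleNamespace(shape=(512, 85742)),
}

kept, skipped = drop_mismatched(checkpoint_state, model_state)
print("skipped:", skipped)
```

With real PyTorch objects you would pass the checkpoint's state dict and `model.state_dict()` instead of the stand-ins, then call `model.load_state_dict(kept, strict=False)` so the skipped fc layer keeps its fresh initialization. Where the state dict sits inside the checkpoint file (for example under a `"state_dict"` key) is an assumption you would need to verify.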
P.S.: I want to do that too, but I still don't know exactly how to structure my own dataset for evaluation so that it works with AdaFace. Do you have any ideas? I have the same question for the training set, since I use my own. Do you use a training list?
Firstly, congrats on the paper and great results.
I run into an issue when I resume training from the pre-trained model (R50-MS1Mv2) you provided, whereas training a model from scratch with the script works without any issues.
Any suggestions on how to fix it? Thank you.