suinleelab / vit-shapley


About the training of the classifier #13

Open Smiling-Weeping-zhr opened 3 months ago

Smiling-Weeping-zhr commented 3 months ago

Hello authors, we have reproduced your code. When we loaded your ImageNette classifier, the accuracy reached 0.99. However, when we trained from scratch without loading any weights, the accuracy was only 0.73, and when we tried other efficient network architectures for classification, the accuracy was only 0.85. Could you tell us how you trained the classifier? Thank you very much.

chanwkimlab commented 2 months ago

Hi, thanks for your interest in our work! As the ImageNette dataset contains only 10 easily distinguishable classes, achieving an accuracy over 0.9 is typically easy. If your model's accuracy is only 0.73, there might be something wrong. For more context, you can check the ImageNette leaderboard here. It's difficult to debug the exact issues based on the information provided, but one possible explanation could be that you're using a version of the ImageNette dataset with label noise.
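If label noise is the suspect, here is a hypothetical sanity check (not part of the vit-shapley code). It assumes the fastai ImageNette release, which ships a `noisy_imagenette.csv` where `noisy_labels_0` is the clean label column and the other `noisy_labels_*` columns inject increasing amounts of label noise; the path is a placeholder.

```python
# Hypothetical check: how much label noise each label column in the fastai
# ImageNette release would introduce. Assumes noisy_imagenette.csv exists
# with a clean "noisy_labels_0" column plus noisy variants.
import pandas as pd

df = pd.read_csv("imagenette2/noisy_imagenette.csv")

for col in [c for c in df.columns if c.startswith("noisy_labels")]:
    # Fraction of rows whose label differs from the clean labels.
    noise = (df[col] != df["noisy_labels_0"]).mean()
    print(f"{col}: {noise:.1%} of labels differ from the clean labels")
```

If your training pipeline reads one of the noisy columns (or a noisy copy of the dataset), that alone could explain a large accuracy drop.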

iancovert commented 2 months ago

Another possibility is that your performance is worse due to training from a random initialization (I'm guessing this is what you mean by training "from scratch"). ViTs are more difficult to train due to less inductive bias (see this paper for example), so it's common to use pre-trained weights (e.g., from ImageNet classification) unless you have a lot of data.
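To illustrate the difference, here is a minimal fine-tuning sketch (not the authors' actual training script) using timm to load an ImageNet pre-trained ViT and train it on ImageNette. The model name, paths, and hyperparameters are illustrative assumptions; setting `pretrained=False` corresponds to the "from scratch" setting discussed above.

```python
# Minimal sketch: fine-tune a ViT on ImageNette from ImageNet pre-trained
# weights via timm. Paths and hyperparameters are placeholders.
import timm
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# pretrained=True loads ImageNet weights; pretrained=False trains from a
# random initialization, which is typically much weaker on small datasets.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=10)

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
])
train_set = datasets.ImageFolder("imagenette2/train", transform=transform)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for images, labels in train_loader:  # one epoch shown; repeat as needed
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```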

Smiling-Weeping-zhr commented 2 months ago

Thanks very much