Open Smiling-Weeping-zhr opened 3 months ago
Hi, thanks for your interest in our work! As the ImageNette dataset contains only 10 easily distinguishable classes, achieving an accuracy over 0.9 is typically easy. If your model's accuracy is only 0.73, there might be something wrong. For more context, you can check the ImageNette leaderboard here. It's difficult to debug the exact issues based on the information provided, but one possible explanation could be that you're using a version of the ImageNette dataset with label noise.
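To rule out label noise, one quick check is to compare the label column your data loader reads against the clean labels. The sketch below assumes the fastai ImageNette release, which as far as I recall ships a `noisy_imagenette.csv` with columns `path`, `noisy_labels_0` (clean), `noisy_labels_1`, `noisy_labels_5`, `noisy_labels_25`, `noisy_labels_50`, and `is_valid`. Those column names and the helper `label_noise_rate` are assumptions for illustration, so please verify them against your copy of the dataset.

```python
import csv


def label_noise_rate(csv_file, used_col, clean_col="noisy_labels_0", split="train"):
    """Return the fraction of rows in `split` whose label in `used_col`
    differs from the label in `clean_col`.

    `csv_file` is an open text file (or any iterable of CSV lines).
    Column names are assumed to follow the fastai ImageNette CSV layout.
    """
    reader = csv.DictReader(csv_file)
    total = 0
    mismatched = 0
    for row in reader:
        is_valid = row["is_valid"].strip().lower() == "true"
        # Keep training rows for split="train", validation rows otherwise.
        if (split == "train") == is_valid:
            continue
        total += 1
        if row[used_col] != row[clean_col]:
            mismatched += 1
    return mismatched / total if total else 0.0
```

For example, `label_noise_rate(open("noisy_imagenette.csv", newline=""), used_col="noisy_labels_50")` would report how many training labels differ from the clean ones. A value well above zero means the model is being trained on corrupted labels.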
Another possibility is that your performance is worse because you are training from a random initialization (I'm guessing this is what you mean by training "from scratch"). ViTs are harder to train because they have weaker inductive biases than CNNs (see this paper for example), so it's common to start from pre-trained weights (e.g., from ImageNet classification) unless you have a lot of data.
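As a rough illustration of starting from pre-trained weights, here is a minimal sketch using `timm` and PyTorch. It is not necessarily the exact training setup used in this repository. The model name `vit_small_patch16_224`, the learning rate, the epoch count, and the helper names `build_vit_classifier` and `fine_tune` are all assumptions chosen for the example. `loader` is assumed to be a `DataLoader` yielding `(images, labels)` batches of 224x224 ImageNette images.

```python
def build_vit_classifier(num_classes=10, pretrained=True,
                         model_name="vit_small_patch16_224"):
    """Create a ViT whose classification head has `num_classes` outputs.

    With pretrained=True, timm loads pre-trained backbone weights and
    re-initializes the head for the new number of classes.
    """
    # Imported here so timm is only required when a model is actually built.
    import timm
    return timm.create_model(model_name, pretrained=pretrained,
                             num_classes=num_classes)


def fine_tune(model, loader, epochs=5, lr=1e-4, device="cpu"):
    """Fine-tune all weights of `model` with AdamW and cross-entropy loss."""
    import torch

    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```

Calling `build_vit_classifier(pretrained=False)` gives the random-initialization baseline, so the two settings can be compared with the same training loop.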
Thanks very much
Hello authors, we have reproduced your code. When we load the provided ImageNette classifier, the accuracy reaches 0.99. However, when we train from scratch without loading any weights, the accuracy is only 0.73. We also tried other efficient network architectures for classification, and the accuracy was only 0.85. Could you tell us how you trained the classifier? Thank you very much.