Closed rainy1551 closed 1 month ago
Hi, thank you :D The code should be mostly the same. However, for smaller datasets (for example CIFAR), aggressive data augmentation is much less effective, and using RandAugment will probably hurt performance. Training for a long time is also quite important for getting the best results from distillation; if that is not feasible, you can use a better loss function than MSE. Something like https://github.com/roymiles/Simple-Recipe-Distillation/blob/main/deit/models.py#L242 will work, though we didn't use it here in favour of simplicity. SGD with a learning-rate schedule tends to give better results on CIFAR than Adam. These are just my empirical observations from working with smaller datasets.
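To make the "SGD with a learning-rate schedule" suggestion concrete, here is a minimal sketch of one common choice, linear warmup followed by cosine decay. The function names, the base learning rate of 0.1 and the warmup length are illustrative assumptions, not values taken from this repository.

```python
import math

BASE_LR = 0.1  # assumption: a typical starting point for SGD on CIFAR, not from the repo


def warmup_cosine_factor(step, total_steps, warmup_steps=0):
    """Return a multiplicative LR factor in [0, 1] for the given step.

    Linear warmup over `warmup_steps`, then cosine decay to 0 at `total_steps`.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, max(0.0, progress))  # clamp so steps past the end stay at 0
    return 0.5 * (1.0 + math.cos(math.pi * progress))


def lr_at(step, total_steps, warmup_steps=0, base_lr=BASE_LR):
    """Learning rate at `step` under the schedule above."""
    return base_lr * warmup_cosine_factor(step, total_steps, warmup_steps)


# With PyTorch, the same factor can drive a scheduler (hyperparameters are assumptions):
#   optimizer = torch.optim.SGD(model.parameters(), lr=BASE_LR, momentum=0.9, weight_decay=5e-4)
#   scheduler = torch.optim.lr_scheduler.LambdaLR(
#       optimizer, lr_lambda=lambda s: warmup_cosine_factor(s, total_steps, warmup_steps))
```

Call `scheduler.step()` once per optimizer step (or once per epoch, if `total_steps` counts epochs) so the schedule spans the whole run.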
As far as the characteristics of the dataset itself go, I'm not really sure. All the datasets I have looked at are class-balanced. If there is a class imbalance, using some long-tail tricks might help.
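As a rough way to check whether a dataset is class-balanced, the sketch below counts labels and derives inverse-frequency class weights. Loss re-weighting is only one of several long-tail tricks and is not something this repository implements; the function names are hypothetical.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the most frequent class count to the least frequent one (1.0 = balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def inverse_frequency_weights(labels):
    """Per-class weights n / (k * count_c), so a balanced dataset gets weight 1.0 everywhere."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}
```

If the ratio is well above 1, the weights can be passed to a weighted loss (for example the `weight` argument of `torch.nn.CrossEntropyLoss`, ordered by class index).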
In terms of compatibility with our code, we just use the HuggingFace datasets library: https://github.com/roymiles/vkd/blob/4b480506d10bad9bfaf27b144f5929ad4007472d/train.py#L84C22-L84C62
HuggingFace provides a tutorial for making a custom dataset that is compatible with this interface: https://huggingface.co/docs/datasets/en/create_dataset
Or you can use any of the existing publicly available datasets: https://huggingface.co/datasets
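For a custom image dataset, something along these lines may be enough. It is a sketch that assumes a `root/<class_name>/<image file>` folder layout and the column names `image` and `label`; check what `train.py` actually expects before relying on those names. The helper functions are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumption: extend as needed


def iter_examples(root):
    """Yield {"image": path, "label": index} for a root/<class_name>/<file> layout."""
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    for label, name in enumerate(class_names):
        for path in sorted((root / name).iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                yield {"image": str(path), "label": label}


def to_hf_dataset(root):
    """Build a HuggingFace Dataset from the folder; requires the `datasets` package."""
    from datasets import Dataset, Image  # imported lazily so the helper above has no dependencies

    ds = Dataset.from_generator(iter_examples, gen_kwargs={"root": str(root)})
    # Cast the path column so images are decoded on access.
    return ds.cast_column("image", Image())
```

For this folder layout, `load_dataset("imagefolder", data_dir=...)` from the same library is a shorter alternative.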
Thanks a lot for your detailed reply!
Hi,
Thanks for your outstanding work—it’s truly impressive! I’m interested in experimenting with your code using some smaller datasets. Would it be possible? If yes, could you please advise on the characteristics that these datasets should have to ensure compatibility with your code?
Thank you for your help!