mit-han-lab / once-for-all

[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
https://ofa.mit.edu/
MIT License

How is the accuracy of the teacher model (once-for-all model)? #14

Open guvcolie opened 4 years ago

guvcolie commented 4 years ago

Thank you for your excellent code! You use teacher-student distillation when training the sub-models. What is the accuracy of the teacher model, i.e. the full once-for-all network (kernel size 7, expansion ratio 6, and 4 layers in each unit)?