karanchahal / distiller

A large scale study of Knowledge Distillation.

Is it possible to train a smaller model (student)? #4

Closed JuanDavidG1997 closed 4 years ago

JuanDavidG1997 commented 4 years ago

For a project I am comparing pruning and distillation. I would like to train a model with about 15,000 parameters, but no ResNet/WRN in this repo has so few parameters. Is it possible to train this kind of model?

fruffy commented 4 years ago

Well, it is possible, but I do not know whether it will be effective. https://github.com/karanchahal/distiller/blob/master/models/model_factory.py#L9 contains a list of our models and their parameter counts; even the smallest still has around 80k parameters. At that size it is difficult to come up with an architecture that remains competitive on complex tasks, but that is the challenge with model compression and knowledge distillation.
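
As a rough illustration only (this is not a model from model_factory.py), a student at the ~15k-parameter scale for CIFAR-sized 32x32 inputs could look like the minimal PyTorch sketch below; the `TinyStudent` name, layer widths, and the 10-class assumption are all hypothetical.

```python
# Minimal sketch of a ~15k-parameter student for 32x32 RGB inputs (e.g. CIFAR-10).
# Hypothetical architecture, not part of the distiller repo.
import torch
import torch.nn as nn


class TinyStudent(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 16*3*9 + 16  =   448 params
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 32*16*9 + 32 = 4,640 params
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # 16x16 -> 8x8
            nn.Conv2d(32, 32, kernel_size=3, padding=1),  # 32*32*9 + 32 = 9,248 params
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),                      # 8x8 -> 1x1
        )
        self.classifier = nn.Linear(32, num_classes)      # 32*10 + 10   =   330 params

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))


if __name__ == "__main__":
    model = TinyStudent()
    n_params = sum(p.numel() for p in model.parameters())
    print(f"TinyStudent parameters: {n_params}")  # 14,666 total, roughly the 15k budget
```

Whether a network this small can still absorb useful signal from a much larger teacher is exactly the open question; the sketch only shows that hitting the parameter budget itself is straightforward.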