Closed JuanDavidG1997 closed 4 years ago
For a project I am comparing pruning and distillation. I would like to train a model with about 15,000 parameters, but no ResNet/WRN has so few parameters. Is it possible to train this type of model?

It is possible, but whether it will be effective I do not know. https://github.com/karanchahal/distiller/blob/master/models/model_factory.py#L9 contains a list of our models and their parameter counts. Even the smallest still has around 80k parameters. At that size it is already difficult to come up with an architecture that is competitive on complex tasks, and that is precisely the challenge in model compression and knowledge distillation.
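For reference, a back-of-the-envelope count shows that a tiny CNN can stay under a 15k-parameter budget. The architecture below is a hypothetical example (assuming a 32x32x3 CIFAR-10-style input), not one of the models in model_factory.py:

```python
# Back-of-the-envelope parameter count for a hypothetical tiny CNN.
# Assumed input: 32x32x3 (CIFAR-10-style); this is only an illustration
# that ~15k parameters is reachable, not a model from the repo.

def conv_params(c_in, c_out, k):
    # weights (c_in * c_out * k * k) plus one bias per output channel
    return c_in * c_out * k * k + c_out

def linear_params(f_in, f_out):
    # weights plus biases of a fully connected layer
    return f_in * f_out + f_out

# Three 3x3 conv layers, each followed by 2x2 pooling: 32 -> 16 -> 8 -> 4
total = (
    conv_params(3, 8, 3)             # 224
    + conv_params(8, 16, 3)          # 1,168
    + conv_params(16, 32, 3)         # 4,640
    + linear_params(32 * 4 * 4, 10)  # 5,130 (4x4x32 feature map -> 10 classes)
)

print(total)  # 11,162 parameters, comfortably under the 15k budget
```

Whether a student this small can absorb much from a teacher on a complex task is exactly the open question above, but defining and training such a model is straightforward.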