jizhihang opened this issue 5 years ago (status: Open)
Notebook 7 is meant to show how to do transfer learning.
The pre-trained ResNet model was trained on the ImageNet dataset, but we want to use it on our new dataset, classifying cats vs. dogs. The final layer of the ResNet model is a linear layer with 1000 output features, as ImageNet has 1000 classes. We only have 2 classes, so we need to replace that linear layer with a new one.
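As a minimal sketch (assuming torchvision's `resnet50`; the notebook may use a different ResNet variant), replacing the head looks like this:

```python
import torch.nn as nn
from torchvision import models

# Load a ResNet pre-trained on ImageNet; its final layer maps to 1000 classes.
model = models.resnet50(pretrained=True)

# Swap the 1000-class head for a freshly initialised 2-class one.
model.fc = nn.Linear(model.fc.in_features, 2)
```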
Now we want to train this final layer on our task, but we don't want to fine-tune the whole model. Why? Training the whole model would take a while, whereas training only the final layer is quick. The hypothesis is that the rest of the model has already learned to extract features from images, and we don't need to teach it to do that again. We make sure the pre-trained parameters don't change by setting `requires_grad = False` on them.
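Putting the two steps together, the order matters: freeze first, then replace the head, because a newly constructed layer has `requires_grad = True` by default. A sketch (the optimizer and learning rate here are illustrative, not necessarily the notebook's):

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.resnet50(pretrained=True)

# Freeze every pre-trained parameter so backprop leaves them untouched.
for param in model.parameters():
    param.requires_grad = False

# Replace the head *after* freezing; the new layer requires grad by default.
model.fc = nn.Linear(model.fc.in_features, 2)

# Hand only the trainable parameters to the optimizer.
optimizer = optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)
```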
Yes, you can fine-tune the whole model and you may get slightly better results, but it will take time. Given the accuracy we achieved (~97%) by training only the linear layer, I think it's fair to say that training the rest of the model probably isn't needed.
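If you did want to fine-tune everything, continuing from the sketch above it is just a matter of re-enabling gradients (the learning rate is illustrative; a small one is a common choice so the pre-trained weights aren't wrecked):

```python
# Re-enable gradients for every parameter, then train the whole network.
for param in model.parameters():
    param.requires_grad = True

optimizer = optim.Adam(model.parameters(), lr=1e-5)  # small lr to protect pre-trained weights
```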
Hi, I'm new to learning PyTorch. In "7 - ResNet - Dogs vs Cats.ipynb" there is the line `for param in model.parameters(): param.requires_grad = False`. Why did you do this, and what does it mean? Also, as far as I know, when `backward()` is called the parameters must have `requires_grad = True`, and I can't find where you restore this setting. Sorry for my English.
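For what it's worth, a small self-contained sketch (the two toy layers here are hypothetical, not from the notebook) shows that `backward()` runs fine when only some parameters require gradients; the frozen ones simply never receive gradients, so there is nothing to restore:

```python
import torch
import torch.nn as nn

frozen = nn.Linear(4, 4)          # stands in for the pre-trained backbone
for p in frozen.parameters():
    p.requires_grad = False
head = nn.Linear(4, 2)            # stands in for the new final layer

x = torch.randn(1, 4)
loss = head(frozen(x)).sum()
loss.backward()                   # no error, despite the frozen parameters

print(frozen.weight.grad)         # None: frozen parameters get no gradients
print(head.weight.grad)           # a tensor: the new layer is still trained
```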