yaox12 opened this issue 6 years ago
I had a quick look and it seems that you might be right. Following @soumith's comment on https://github.com/pytorch/vision/pull/463 , I think we will be living with this mistake, as people have already built models and reported results based on the weights we provide, and changing that now would not be great.
Thanks for catching this issue though!
Adjusting the order of layers won't break the pre-trained models much; we only need to remap some parameter keys, for example, move classifier.1.weights to classifier.0.weights. There is no need to re-train the model: we load the previous checkpoint, adjust the order of the layers, and save it to a new checkpoint file. The forward inference does not change.
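A minimal sketch of that key remap, assuming torchvision's nn.Sequential index-based key names (the exact mapping below is illustrative, derived from moving each Dropout after its Linear+ReLU pair):

```python
# Hedged sketch: rename classifier keys in a checkpoint so pre-trained
# weights line up with a reordered fc-relu-dropout classifier.
# Old Sequential: Dropout(0) Linear(1) ReLU(2) Dropout(3) Linear(4) ReLU(5) Linear(6)
# New Sequential: Linear(0) ReLU(1) Dropout(2) Linear(3) ReLU(4) Dropout(5) Linear(6)
INDEX_MAP = {"1": "0", "4": "3", "6": "6"}  # old Linear index -> new Linear index

def remap_classifier_keys(state_dict):
    remapped = {}
    for key, value in state_dict.items():
        if key.startswith("classifier."):
            prefix, idx, rest = key.split(".", 2)
            key = ".".join((prefix, INDEX_MAP.get(idx, idx), rest))
        remapped[key] = value
    return remapped
```

After saving the remapped state_dict, the reordered model loads it with load_state_dict as usual, and the forward pass is numerically unchanged.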
As PyTorch has not reached version 1.0 yet, I think making corrections is as important as keeping backwards compatibility. Though AlexNet is outdated today, research on Transfer Learning and Domain Adaptation still takes it as a typical example.
I hope we can fix it.
Yeah, this seems worth fixing, especially since the forward inference doesn't change. The other issue was about changing a particular layer from 256 to 384 channels, which is quite a drastic change.
We're open to a PR.
Hello,
I just realized that in the AlexNet paper the number of filters in the first conv layer is 96 (the last paragraph of page 4). But here in this repository, the module has 64 filters. I am wondering if it is implemented like this on purpose?
Thanks
@mbsariyildiz the difference you are mentioning comes from the fact that the original implementation differs depending on the number of GPUs used, see https://github.com/pytorch/vision/pull/463#issuecomment-379214941 for more details
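For a concrete sense of the gap, here is a back-of-the-envelope parameter count for the first conv layer (11x11 kernels over 3 input channels, one bias per filter), comparing the paper's 96 filters to torchvision's 64:

```python
# Parameter count for AlexNet's first conv layer:
# num_filters * in_channels * kernel_h * kernel_w weights, plus biases.
def conv1_params(num_filters, in_channels=3, kernel=11):
    return num_filters * in_channels * kernel * kernel + num_filters

print(conv1_params(96))  # paper: 34944 parameters
print(conv1_params(64))  # torchvision: 23296 parameters
```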
There are two more issues in this AlexNet implementation:
1. The Local Response Normalization (LRN) layers from the original paper are missing.
2. An average pooling layer is used where the paper uses max pooling.
@vbogach Hi,
About point 2, that's not exactly correct. We do use max_pool, see https://github.com/pytorch/vision/blob/5306635396d1ebd3047d56b16745350cf8c1e2c4/torchvision/models/alexnet.py#L31 but we added the avg_pool to make the model support images of different sizes. For images of size 224x224, the avg_pool is a no-op.
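The no-op claim can be checked with the standard conv output-size formula, tracing a 224x224 input through the feature extractor's hyperparameters (as they appear in torchvision's alexnet.py): the feature maps are already 6x6, so AdaptiveAvgPool2d((6, 6)) changes nothing.

```python
# Trace the spatial size through AlexNet's feature extractor.
def out_size(size, kernel, stride, padding=0):
    return (size + 2 * padding - kernel) // stride + 1

def feature_map_size(size):
    size = out_size(size, 11, 4, 2)  # Conv2d(3, 64, 11, stride=4, padding=2)
    size = out_size(size, 3, 2)      # MaxPool2d(3, stride=2)
    size = out_size(size, 5, 1, 2)   # Conv2d(64, 192, 5, padding=2)
    size = out_size(size, 3, 2)      # MaxPool2d(3, stride=2)
    size = out_size(size, 3, 1, 1)   # three Conv2d(..., 3, padding=1), size-preserving
    size = out_size(size, 3, 2)      # MaxPool2d(3, stride=2)
    return size

print(feature_map_size(224))  # 6 -> the 6x6 avg_pool is a no-op
print(feature_map_size(256))  # 7 -> here the avg_pool actually downsamples
```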
About point 1, that's correct, we do not have LRN. IIRC, this actually doesn't make much of a difference.
Given that this implementation has been around for a while, we will probably not be changing it.
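For reference, the missing LRN from the paper (with its published hyperparameters k=2, n=5, alpha=1e-4, beta=0.75) can be sketched channel-wise in plain Python; PyTorch also ships an nn.LocalResponseNorm module if someone wants to experiment with adding it back.

```python
# Sketch of AlexNet's Local Response Normalization at one spatial
# position, following the paper's formula: each channel activation is
# divided by (k + alpha * sum of squared activations over `size`
# neighboring channels) ** beta.
def lrn(channel_acts, c, size=5, alpha=1e-4, beta=0.75, k=2.0):
    lo = max(0, c - size // 2)
    hi = min(len(channel_acts), c + size // 2 + 1)
    denom = (k + alpha * sum(a * a for a in channel_acts[lo:hi])) ** beta
    return channel_acts[c] / denom
```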
In self.classifier, the correct order of layers should be fc6-relu-dropout-fc7-relu-dropout-fc8, which means the dropout layers should come after the fc and relu layers.
This does not matter at inference time. However, when you fine-tune the AlexNet model on other datasets, the incorrect layer order results in an accuracy drop of 2 percent or more.
Can I make a PR to fix it? The pre-trained model URL needs to be updated too.
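A sketch of what the reordered self.classifier could look like (shapes follow torchvision's AlexNet; num_classes=1000 is assumed here as the ImageNet default):

```python
import torch.nn as nn

# Proposed fc-relu-dropout ordering instead of dropout-fc-relu.
classifier = nn.Sequential(
    nn.Linear(256 * 6 * 6, 4096),  # fc6
    nn.ReLU(inplace=True),
    nn.Dropout(),
    nn.Linear(4096, 4096),         # fc7
    nn.ReLU(inplace=True),
    nn.Dropout(),
    nn.Linear(4096, 1000),         # fc8
)
```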