pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Is SENet (and new architectures) welcome to `models`? #260

Open moskomule opened 7 years ago

moskomule commented 7 years ago

Hi, now torchvision.models contains some models such as ResNet and they are very helpful as baselines.

Recently I implemented SENet, which won the ILSVRC 2017 classification task. Can I send a PR adding SENet to `models`?

Plus, I'd like to know which models are welcome in `models`.

Thank you.

alykhantejani commented 7 years ago

@moskomule - Thanks.

I think this is a good question and we should, in general, define some contribution guidelines around models.

For example, I think models that are useful in more than one domain are good candidates. This tends to include most classification models trained on ImageNet, as they are often used for transfer learning or feature extraction.

Secondly, we should also require that the models have been trained in PyTorch (not just converted) and respect the input format of images, which is RGB in the range [0, 1].
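
As a rough illustration of that input convention, the sketch below converts an 8-bit image into a float tensor in RGB order with values in [0, 1]. The helper name `to_rgb_unit_range` and its `channel_order` parameter are hypothetical and not part of torchvision; the `"bgr"` branch assumes a Caffe-style image whose channels are stored as BGR in 0-255.

```python
import torch


def to_rgb_unit_range(img_u8: torch.Tensor, channel_order: str = "rgb") -> torch.Tensor:
    """Convert an (H, W, 3) uint8 image to a (3, H, W) float tensor in [0, 1], RGB order.

    Hypothetical helper for illustration only; not a torchvision API.
    """
    if img_u8.dtype != torch.uint8 or img_u8.dim() != 3 or img_u8.shape[2] != 3:
        raise ValueError("expected a uint8 tensor of shape (H, W, 3)")
    if channel_order == "bgr":
        # Reverse the channel axis so that BGR becomes RGB.
        img_u8 = img_u8.flip(2)
    elif channel_order != "rgb":
        raise ValueError("channel_order must be 'rgb' or 'bgr'")
    # HWC -> CHW, then rescale 0..255 to 0..1.
    return img_u8.permute(2, 0, 1).float().div(255.0)


# A single pixel stored as BGR: blue=255, green=0, red=51.
pixel_bgr = torch.tensor([[[255, 0, 51]]], dtype=torch.uint8)
pixel_rgb = to_rgb_unit_range(pixel_bgr, channel_order="bgr")
```

A model trained on one convention will generally not behave as intended when fed the other, which is one reason converted weights need extra care.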

As this architecture won ILSVRC 2017, I think this could be quite a useful contribution. @fmassa, what are your thoughts?

fmassa commented 7 years ago

I think SENet could be a good addition to torchvision!

But as @alykhantejani mentioned, all models in torchvision should ideally have pre-trained weights trained with PyTorch, following the same training procedure as in examples/imagenet, so that they are easily reproducible. The only exception to that at the moment is inception, but we would like to keep such exceptions to a minimum.

moskomule commented 7 years ago

Thank you for your reply.

@fmassa How do you currently prepare the pre-trained weights? By training with the PyTorch scripts, by converting LuaTorch weights, or in some other way? If there is a method to convert Caffe weights to PyTorch weights, we could use the authors' pre-trained model.

moskomule commented 7 years ago

Sorry, I skipped @alykhantejani's reply. I've never trained on ImageNet, but I think it takes a long time and so costs a lot. Currently, how do you prepare the pre-trained weights? If there is a reasonable way to train and obtain weights, I'll try.

fmassa commented 7 years ago

Yes, it usually takes a long time to train, and might require tuning parameters, but at least we know it's reproducible with our current code-base (which is also important). For example, see https://github.com/pytorch/vision/pull/49

Converting the weights from Caffe might not be best, because they are trained on BGR images ranging from 0-255, which is not what we have for PyTorch.

I don't think there is out-of-the-box code available to compute the statistics for normalization, but it should be fairly easy to do, something like (untested, check the variance accumulation)

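import torch

ds_mean = []
ds_var = []
# `images` is an iterable of float tensors of shape (3, H, W);
# supposes all images are the same size and have 3 channels
for img in images:
    ds_mean.append(img.view(3, -1).mean(1))  # per-channel mean of this image
    ds_var.append(img.view(3, -1).var(1))    # per-channel variance of this image

mean = torch.stack(ds_mean, 0).mean(0)
# averaging per-image variances ignores the spread between image means,
# so this only approximates the variance over the whole dataset
var = torch.stack(ds_var, 0).mean(0)
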
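
One possible way to address the variance accumulation caveat is to keep running sums of the pixel values and of their squares, and to derive the variance as E[x^2] - E[x]^2 at the end. The sketch below does this; the function name `channel_mean_var` is hypothetical, it computes the population (not the unbiased) variance, and it assumes each image is a float tensor of shape (3, H, W), although the sizes may differ between images.

```python
import torch


def channel_mean_var(images):
    """Per-channel mean and population variance over every pixel of every image.

    Hypothetical helper: `images` is any iterable of float tensors shaped (3, H, W).
    """
    total = torch.zeros(3, dtype=torch.float64)     # running sum of pixel values
    total_sq = torch.zeros(3, dtype=torch.float64)  # running sum of squared values
    count = 0                                       # pixels seen per channel
    for img in images:
        pixels = img.reshape(3, -1).to(torch.float64)
        total += pixels.sum(dim=1)
        total_sq += (pixels ** 2).sum(dim=1)
        count += pixels.shape[1]
    if count == 0:
        raise ValueError("no pixels to compute statistics from")
    mean = total / count
    # E[x^2] - E[x]^2, clamped to guard against tiny negative rounding errors.
    var = (total_sq / count - mean ** 2).clamp(min=0)
    return mean, var


# One all-black and one all-white image of the same size.
toy_images = [torch.zeros(3, 2, 2), torch.ones(3, 2, 2)]
toy_mean, toy_var = channel_mean_var(toy_images)
```

In this toy case the pooled mean is 0.5 and the pooled variance is 0.25 per channel, whereas averaging the two per-image variances would give 0, because each image is constant on its own.
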
ozancaglayan commented 7 years ago

Hello,

Do you have the accuracies of these models trained with PyTorch? Are they comparable to what has been reported in their papers?

alykhantejani commented 7 years ago

Hi @ozancaglayan - currently these models have not been trained in pytorch, but if you end up doing it please do send a PR with the pretrained model :)

ozancaglayan commented 7 years ago

Do you mean SENet etc., or the ImageNet-pretrained VGG and ResNet .pth files? I'm confused, since in https://github.com/pytorch/vision/issues/28 the devs were talking as if the models were being retrained in PyTorch before being added to torchvision.

alykhantejani commented 7 years ago

@ozancaglayan I meant SENet.

The pretrained models currently in torchvision have all been trained from scratch using PyTorch + torchvision, except for the inception model, for which the weights were transferred across.

ahkarami commented 7 years ago

Recently the Google Brain team released a fantastic CNN model, NASNet, in TF-slim, which achieved a state-of-the-art top-1 accuracy of 82.7% on ImageNet. Does the PyTorch team have any plans to implement or port this model into the official PyTorch models (i.e., torchvision models)?

alykhantejani commented 7 years ago

Hi @ahkarami,

In torchvision we would like to have models that have been trained using PyTorch + torchvision, so that they are also reproducible by the community. If somebody from the community would like to train these networks, we would be more than willing to accept a PR + the weights.

PkuRainBow commented 6 years ago

@aazzolini @alykhantejani @daavoo I am wondering whether anyone could share a PyTorch version of SENet?