ContinualAI / avalanche

Avalanche: an End-to-End Library for Continual Learning based on PyTorch.
http://avalanche.continualai.org
MIT License

Add Models Available in PyTorchCV #430

Closed vlomonaco closed 3 years ago

vlomonaco commented 3 years ago

We should add direct imports for pytorchcv (pre-trained) models through the models module.

VerwimpEli commented 3 years ago

Hi, I'd like to work on this issue as a first issue.

If I understand correctly, a wrapper around pytorchcv should be implemented, with the necessary options to choose between pretrained and newly initialized networks?

vlomonaco commented 3 years ago

Hi @VerwimpEli! I think we should do something like what we did for the PyTorch Datasets: https://github.com/ContinualAI/avalanche/blob/master/avalanche/benchmarks/datasets/torchvision_wrapper.py

The pretrained flag should already be available for all the pytorchcv pre-trained models.

VerwimpEli commented 3 years ago

Yes, I agree it should be similar to the way the PyTorch Datasets are loaded. But since they already have almost a hundred different classification base models (not counting the numerous variations in ResNet depth, etc.), it might be a bit of a hassle to write separate functions for them all?

We could limit it to some popular models like ResNet, VGG, Inception... or provide a general method load_model(name, pretrained=False, **kwargs) or something similar. I think I'd prefer the former, to keep the code base cleaner. If someone wanted to use a more exotic model, they could load it through pytorchcv themselves anyway.

vlomonaco commented 3 years ago

It makes sense. Maybe you can wrap just the most popular models explicitly, then wrap the get_model method that pytorchcv already offers for max flexibility:

from pytorchcv.model_provider import get_model as ptcv_get_model
import torch

# Load a pretrained ResNet-18 from pytorchcv and run a dummy forward pass.
net = ptcv_get_model("resnet18", pretrained=True)
x = torch.randn(1, 3, 224, 224)
y = net(x)
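
For reference, a rough sketch of what such a wrapper module could look like, mirroring torchvision_wrapper.py (the function names and the generic passthrough are illustrative, not a settled API):

from pytorchcv.model_provider import get_model as ptcv_get_model

def resnet18(pretrained=False, **kwargs):
    # Explicit wrapper for one of the popular models.
    return ptcv_get_model("resnet18", pretrained=pretrained, **kwargs)

def vgg16(pretrained=False, **kwargs):
    # Same pattern, repeated only for a curated set of architectures.
    return ptcv_get_model("vgg16", pretrained=pretrained, **kwargs)

def get_model(name, pretrained=False, **kwargs):
    # Generic passthrough for any model name pytorchcv knows about.
    return ptcv_get_model(name, pretrained=pretrained, **kwargs)
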
VerwimpEli commented 3 years ago

Alright, I'll try to implement it that way!

VerwimpEli commented 3 years ago

Hey!

I've got a basic implementation of this down, but there are a few design decisions I'd like to discuss (quoted in the reply below).

That's all for now :)

vlomonaco commented 3 years ago

> Most models are only available for ImageNet, so they'd require a modification of the input and output layers before they can be applied to different datasets. We could support a dataset parameter for common datasets and then modify those layers in the wrapper, or leave this to the user, but I think it'd be a nice feature to have at least for CIFAR-10/100.

I think both options are good, it's up to you!

> Pretraining: should the option exist to load a model pretrained on a different dataset than the one that's used in training?

Yes, I think so. However, in the docs we can specify the changes we apply to make it work on a different dataset (like a zero-initialized new head, etc.); see the sketch at the end of this comment.

> The commonly used slimmed-down version of ResNet-18 (from GEM, with fewer features per layer) isn't available directly. It would be possible to use lower-level methods of pytorchcv to get this, though, which would again be a nice feature to have.

Agreed! @lrzpellegrini implemented one for our past research; maybe you can share the code to define it here?
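
On the ImageNet-to-CIFAR point above, a minimal sketch of the head replacement (this assumes the pytorchcv ResNets expose their final classifier as a Linear layer named output, as resnet18 does; the input stem would still need adapting for 32x32 inputs, and the function name is just illustrative):

import torch.nn as nn
from pytorchcv.model_provider import get_model as ptcv_get_model

def resnet18_with_new_head(num_classes, pretrained=True):
    # Start from the ImageNet-trained backbone.
    net = ptcv_get_model("resnet18", pretrained=pretrained)
    # Swap the 1000-way ImageNet classifier for a freshly initialized head
    # (it could also be zero-initialized, as mentioned above).
    net.output = nn.Linear(net.output.in_features, num_classes)
    return net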

VerwimpEli commented 3 years ago

Alright! I already have an implementation of the slimmed down net, so no need to share an additional one :)

lrzpellegrini commented 3 years ago

Hi @VerwimpEli, I don't know if it's the same architecture as the one used in GEM, but I implemented a slimmed-down ResNet some time ago and it seems to work for the CIFAR experiments of iCaRL.

https://github.com/lrzpellegrini/icarl-pytorch/blob/master/models/icarl_net.py

mmasana commented 3 years ago

I agree that this is a quite nice addition. We also implemented support in our framework for using all torchvision models, as well as custom models, and it's been very useful, especially for using the standard architectures available directly from PyTorch (and their pretrained versions).

As mentioned above, most torchvision models are built for ImageNet input sizes. In class-IL, for CIFAR-sized datasets we use the ResNet-32 proposed in the original ResNet paper for evaluating multiple class-incremental methods [code here]. Recently, another paper proposed to use a reduced version of WideResNet: WRN-16-2 [code here]. So I would propose to have ResNet-32 available for testing the class-incremental scenarios. It would help with the implementations I am working on, to see if the results are similar to the ones in the papers.

Finally, note that for most of those smaller input-size networks there is usually no official pretrained model. If we want to provide some, or use the ones from available published works, we would probably need to host the models so that they can be downloaded from Avalanche (similar to how torchvision models offer the option to download pretrained versions).
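
If Avalanche does end up hosting such checkpoints, loading them could go through PyTorch's built-in download helper; a small sketch (the URL and function name are placeholders, not a real Avalanche endpoint or API):

from torch.hub import load_state_dict_from_url

# Placeholder: an Avalanche-hosted checkpoint URL would go here.
CHECKPOINT_URL = "https://example.org/avalanche/resnet32_cifar100.pth"

def load_hosted_weights(model, url=CHECKPOINT_URL):
    # Download the checkpoint once, cache it locally, then load it into the model.
    state_dict = load_state_dict_from_url(url, progress=True)
    model.load_state_dict(state_dict)
    return model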

vlomonaco commented 3 years ago

As for hosting data or models, no problem: we will soon have a custom server to support Avalanche development. Also, @VerwimpEli, for the "custom" models try to follow the new #535 so that we can use them with the multi-head plugin; an example of how to do that is in models/simple_cnn.py.
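
For completeness, a rough sketch of a multi-head pytorchcv backbone in the spirit of models/simple_cnn.py, assuming an avalanche.models API with MultiTaskModule and MultiHeadClassifier along the lines of #535 (the exact interface may differ):

from avalanche.models import MultiHeadClassifier, MultiTaskModule
from pytorchcv.model_provider import get_model as ptcv_get_model

class MTResNet18(MultiTaskModule):
    # pytorchcv ResNet-18 trunk with one classification head per task.
    def __init__(self, pretrained=False):
        super().__init__()
        backbone = ptcv_get_model("resnet18", pretrained=pretrained)
        self.features = backbone.features  # conv trunk + global average pool
        self.classifier = MultiHeadClassifier(backbone.output.in_features)

    def forward(self, x, task_labels):
        x = self.features(x).flatten(1)
        return self.classifier(x, task_labels)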