vlomonaco closed this issue 3 years ago
Hi, I'd like to work on this as a first issue. If I understand correctly, a wrapper to pytorchcv should be implemented, with the necessary options to choose between pretrained and newly initialized networks?
Hi @VerwimpEli! I think we should do something like what we did for the PyTorch Datasets: https://github.com/ContinualAI/avalanche/blob/master/avalanche/benchmarks/datasets/torchvision_wrapper.py
The pretrained flag should already be available for all the pytorchcv pre-trained models.
Yes, I agree it should be similar to the way PyTorch Datasets are loaded. But since they already have almost a hundred different classification base models (without counting e.g. the numerous variations on ResNet depths, etc.), it might become a bit of a hassle to write separate functions for them all?
We could limit it to some popular models like ResNet, VGG, Inception... or provide a general method load_model(name, pretrained=False, **kwargs) or something similar. I think I'd prefer the former, to keep the code base cleaner. If someone wanted to use a more exotic model, they could load it through pytorchcv themselves anyway.
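To make the two options concrete, here is a rough sketch of what either wrapper could look like (a sketch only; the function names and final API are open for discussion):

from pytorchcv.model_provider import get_model as ptcv_get_model

# Option 1: explicit per-model wrappers, mirroring torchvision_wrapper.py
def resnet18(pretrained=False, **kwargs):
    return ptcv_get_model("resnet18", pretrained=pretrained, **kwargs)

def vgg16(pretrained=False, **kwargs):
    return ptcv_get_model("vgg16", pretrained=pretrained, **kwargs)

# Option 2: a single generic loader that forwards the model name to pytorchcv
def load_model(name, pretrained=False, **kwargs):
    return ptcv_get_model(name, pretrained=pretrained, **kwargs)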
That makes sense. Maybe you can wrap just the most popular models explicitly, then wrap the get_model method that pytorchcv already offers for maximum flexibility:
from pytorchcv.model_provider import get_model as ptcv_get_model
import torch

# Load a pre-trained ResNet-18 from pytorchcv and run a forward pass
net = ptcv_get_model("resnet18", pretrained=True)
x = torch.randn(1, 3, 224, 224)
y = net(x)
Alright, I'll try to implement it that way!
Hey!
I've got a basic implementation of this down, but there are a few design decisions I'd like to discuss.
1. Most models are only available for ImageNet, so they'd require a modification of the input and output layers before they can be applied to different datasets. We could support a dataset parameter for common datasets and then modify those layers in the wrapper, or leave this to the user, but I think it'd be a nice feature to have for at least CIFAR-10/100.
2. Pretraining: should the option exist to load a model pretrained on a different dataset than the one that's used in training?
3. The commonly used slimmed-down version of ResNet-18 (from GEM, with fewer features per layer) isn't available directly. It would be possible to use lower-level methods of pytorchcv to get it, though, which would again be a nice feature to have.

That's all for now :)
> Most models are only available for ImageNet. So they'd require a modification of the input and output layers before they can be applied to different datasets. We could support a parameter dataset for common datasets and then modify those layers in the wrapper. Or leave this to the user, but I think it'd be a nice feature to have for at least CIFAR-10/100.
I think both options are good, it's up to you!
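For example, here is a minimal sketch of the output-layer adaptation, using a hypothetical adapt_head helper (assuming the common pytorchcv convention where classification models expose their final fully connected layer as net.output; the attribute can differ per architecture, so this is illustrative only):

from torch import nn
from pytorchcv.model_provider import get_model as ptcv_get_model

def adapt_head(net, num_classes):
    # Replace the ImageNet classifier (1000 outputs) with a new head
    # sized for the target dataset, e.g. 10 classes for CIFAR-10.
    in_features = net.output.in_features
    net.output = nn.Linear(in_features, num_classes)
    return net

net = adapt_head(ptcv_get_model("resnet18", pretrained=True), num_classes=10)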
> Pretraining: should the option exist to load a model pretrained on a different dataset than the one that's used in training?
Yes, I think so. However, in the doc we can specify the changes we apply to make a model work on a different dataset (like a zero-initialized new head, etc.).
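As a rough illustration of the zero-initialized head mentioned here (just a sketch; the actual initialization choice is open):

from torch import nn

# Create a new head for the target dataset and start it from zero so the
# pre-trained features are not immediately disrupted by random logits.
new_head = nn.Linear(512, 10)  # 512 = ResNet-18 feature size, 10 = CIFAR-10 classes
nn.init.zeros_(new_head.weight)
nn.init.zeros_(new_head.bias)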
> The commonly used slimmed-down version of ResNet-18 (from GEM, with fewer features per layer) isn't available directly. It would be possible to use lower-level methods of pytorchcv to get it, though, which would again be a nice feature to have.
Agreed! @lrzpellegrini implemented one for our past research; maybe you can share the code to define it here?
Alright! I already have an implementation of the slimmed down net, so no need to share an additional one :)
Hi @VerwimpEli, I don't know if it's the same architecture as the one used in GEM, but I implemented a slimmed-down ResNet some time ago and it seems to work for the CIFAR experiments of iCaRL.
https://github.com/lrzpellegrini/icarl-pytorch/blob/master/models/icarl_net.py
I agree that this is a quite nice addition. We also implemented support for all torchvision models in our framework, as well as custom models, and it's been very useful, especially being able to use the standard architectures available directly from PyTorch (and their pretrained versions).
As mentioned above, most torchvision models expect ImageNet input sizes. In class-IL, for CIFAR-sized datasets we use the ResNet-32 proposed in the original ResNet paper for evaluating multiple class-incremental methods [code here]. Recently, another paper proposed to use a reduced version of WideResNet: WRN-16-2 [code here]. So I would propose to have ResNet-32 available for testing the class-incremental scenarios. It would help with the implementations I am working on, to see if the results are similar to the ones in the papers.
Finally, note that for most of those smaller input-size networks there is usually no official pretrained model. If we want to provide some, or use the ones from available published works, we would probably need to host the models so that they can be downloaded from Avalanche (similarly to how torchvision models have the option to download the pretrained version).
As for hosting data or models, no problem, we will soon have a custom server to support avalanche development.
Also @VerwimpEli, for the "custom" models try to follow the new #535 so that we can use them with the multi-head plugin.
An example of how to do that is in models/simple_cnn.py
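For illustration only, here is a sketch of the kind of layout this implies; the exact interface comes from #535 and models/simple_cnn.py, and the attribute names below (features, classifier, and the pytorchcv output layer) are assumptions:

from torch import nn

class WrappedBackbone(nn.Module):
    # Hypothetical wrapper: keep feature extraction and the classifier as
    # separate named modules so a multi-head plugin can swap the classifier
    # per task, similarly to how SimpleCNN separates features and classifier.
    def __init__(self, backbone, num_classes):
        super().__init__()
        self.features = backbone.features  # assumed pytorchcv attribute
        self.classifier = nn.Linear(backbone.output.in_features, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.flatten(1)
        return self.classifier(x)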
We should add a direct import for pytorchcv (pre-trained) models through the models module.