ContinualAI / avalanche

Avalanche: an End-to-End Library for Continual Learning based on PyTorch.
http://avalanche.continualai.org
MIT License

Cannot create a dataset using the code in notebooks/from-zero-to-hero-tutorial/03_benchmarks.ipynb and the classification dataset cannot switch between train and eval transforms #1648

Open blmussati opened 1 month ago

blmussati commented 1 month ago

These are two issues related to one another.

🐛 Describe the bug

Bug 1: In the notebook, as_classification_dataset is called with two arguments, but according to the function definition it accepts only one; transform_groups is not a valid input.

Bug 2: Even when both train and eval transform_groups are given to an Avalanche dataset, we can't switch between the train and eval transforms in that dataset.
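A quick way to confirm the mismatch described in Bug 1 is to inspect the signature of the installed function. The sketch below uses only inspect from the standard library. Note that accepts_keyword is a helper name made up for this illustration, and one_arg_wrapper and wrapper_with_groups are hypothetical stand-ins, not Avalanche functions.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if param is not None and param.kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for the two signatures being compared.
def one_arg_wrapper(dataset):
    return dataset


def wrapper_with_groups(dataset, transform_groups=None):
    return dataset, transform_groups


print(accepts_keyword(one_arg_wrapper, "transform_groups"))      # False
print(accepts_keyword(wrapper_with_groups, "transform_groups"))  # True
```

With Avalanche installed, the same helper could be pointed at as_classification_dataset to see whether the installed version accepts transform_groups.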

🐜 To Reproduce

Bug 1

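import torchvision

# as_classification_dataset is imported as in the notebook;
# train_MNIST and test_MNIST are the datasets loaded earlier in the notebook.

train_transforms = torchvision.transforms.ToTensor()
eval_transforms = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Resize((32, 32))
])

# Passing transform_groups here is what fails in 0.4.0 (Bug 1).
train_MNIST = as_classification_dataset(
    train_MNIST,
    transform_groups={
        'train': train_transforms,
        'eval': eval_transforms
    }
)
test_MNIST = as_classification_dataset(
    test_MNIST,
    transform_groups={
        'train': train_transforms,
        'eval': eval_transforms
    }
)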

Bug 2

# Assume train_MNIST is a classification AvalancheDataset

train_MNIST.train()
print(train_MNIST[0][0].shape)
# torch.Size([1, 28, 28])

train_MNIST.eval()
print(train_MNIST[0][0].shape)
# torch.Size([1, 28, 28]) <-- should be torch.Size([1, 32, 32])
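
One possibility worth ruling out for Bug 2, stated here as an assumption and not checked against the 0.4.0 source: if .train() and .eval() return a new dataset instead of modifying the existing one in place, the snippet above would discard the switched dataset and keep indexing the original. The toy class below is not Avalanche code; ToyGroupedDataset and with_group are names made up for this illustration, and plain integers stand in for images.

```python
class ToyGroupedDataset:
    """Toy stand-in (not Avalanche code) for a dataset with named transform groups."""

    def __init__(self, items, transform_groups, current_group="train"):
        self._items = list(items)
        self._groups = dict(transform_groups)
        self._current = current_group

    def with_group(self, name):
        # Return a NEW dataset that uses the requested group; self is unchanged.
        if name not in self._groups:
            raise KeyError(name)
        return ToyGroupedDataset(self._items, self._groups, name)

    def train(self):
        return self.with_group("train")

    def eval(self):
        return self.with_group("eval")

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # Apply the transform of the currently selected group.
        return self._groups[self._current](self._items[index])


groups = {"train": lambda x: x, "eval": lambda x: x * 2}
data = ToyGroupedDataset([1, 2, 3], groups)

data.eval()              # return value discarded, so `data` still uses "train"
ignored = data[0]        # 1

data_eval = data.eval()  # keep the returned dataset
switched = data_eval[0]  # 2
```

If Avalanche behaves like this toy class, reassigning the result (train_MNIST = train_MNIST.eval()) would be the thing to try before concluding that the eval transforms were not stored.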

🐝 Expected behavior

For Bug 1, I tried modifying the code in the following ways:

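from avalanche.benchmarks.utils import make_classification_dataset

# Same train/eval transforms as in the Bug 1 snippet above.
transform_groups = {
    'train': train_transforms,
    'eval': eval_transforms
}

train_MNIST = make_classification_dataset(
    train_MNIST,
    transform_groups=transform_groups
)
eval_MNIST = make_classification_dataset(
    eval_MNIST,
    transform_groups=transform_groups
)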

and

train_MNIST = as_classification_dataset(train_MNIST)
train_MNIST.replace_current_transform_group(transform_groups)

eval_MNIST = as_classification_dataset(eval_MNIST)
eval_MNIST.replace_current_transform_group(transform_groups)

and

train_MNIST = AvalancheDataset(datasets=train_MNIST, transform_groups=transform_groups)
eval_MNIST = AvalancheDataset(datasets=eval_MNIST, transform_groups=transform_groups)
train_MNIST = as_classification_dataset(train_MNIST)
eval_MNIST = as_classification_dataset(eval_MNIST)

  1. The approach using make_classification_dataset runs, but switching between transform groups doesn't work (Bug 2).
  2. The approach calling as_classification_dataset followed by .replace_current_transform_group(transform_groups) doesn't properly wrap the dataset: when enumerating train_MNIST, the examples are PIL.Image objects.
  3. The approach calling AvalancheDataset followed by as_classification_dataset gives a DeprecationWarning:

    DeprecationWarning: AvalancheDataset constructor has been changed. Please check the documentation for the correct usage. You can use `avalanche.benchmarks.utils.make_classification_dataset if you need the old behavior.

    As with Approach 1, switching between transform groups doesn't work (Bug 2).

It seems that Approach 1 may be the correct technique to create a classification AvalancheDataset, but the transformations are not stored in the groups.

For Bug 2, if we could switch to the eval transform group, we would expect:

train_MNIST.eval()
print(train_MNIST[0][0].shape)
# torch.Size([1, 32, 32])

🐞 Screenshots

Bug 1: [Screenshot 2024-05-29 at 14 36 31]

🦋 Additional context

I'm using avalanche-lib 0.4.0 and torch 1.13.1.