learnables / learn2learn

A PyTorch Library for Meta-learning Research
http://learn2learn.net
MIT License

custom dataset shuffle #338

Closed ambekarsameer96 closed 2 years ago

ambekarsameer96 commented 2 years ago

Hi, if I use the following for a custom dataset, does it ensure that the training samples across all classes are shuffled at every iteration of the loop shown below? (There is no explicit shuffling transform.)

train_dataset = l2l.data.MetaDataset(trainset)
transforms = [
    l2l.data.transforms.NWays(train_dataset, ways),
    l2l.data.transforms.KShots(train_dataset, 5 * shots),
    l2l.data.transforms.LoadData(train_dataset),
    l2l.data.transforms.RemapLabels(train_dataset),
]
taskset = l2l.data.TaskDataset(train_dataset, transforms, num_tasks=1000)

After this, I directly use a loop like the one below (since task.sampler() is not available for a custom dataset):


for iteration in range(max_iterations):
    for counter, (data, labels) in enumerate(taskset):
        learner = maml.clone()  # clone the meta-learner for this task
        eval_error, eval_accuracy = fast_adapt(data, labels, learner, loss,
                                               adaptation_steps, shots, ways, device)
        if counter == max_tasks:
            break
ambekarsameer96 commented 2 years ago

@seba-1511, can you please take a look?

seba-1511 commented 2 years ago

Yes, that's correct: NWays and KShots sample their data randomly (akin to shuffling, though not quite the same).
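
For example, a minimal sketch (assuming the taskset built above, with ways and shots defined as in your snippet): two consecutive draws will almost always contain different samples, even without an explicit shuffle transform.

    import torch

    # Sketch only: each call to TaskDataset.sample() builds a new task, with
    # NWays picking `ways` classes at random and KShots picking `5*shots`
    # samples per class at random.
    X1, y1 = taskset.sample()
    X2, y2 = taskset.sample()
    print(torch.equal(X1, X2))  # almost always False: a different random draw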

ambekarsameer96 commented 2 years ago

Hello @seba-1511, thanks for the repo and the examples. I have a question about the maml-miniimagenet example:

# Separate data into adaptation/evaluation sets
adaptation_indices = np.zeros(data.size(0), dtype=bool)
adaptation_indices[np.arange(shots * ways) * 2] = True
evaluation_indices = torch.from_numpy(~adaptation_indices)
adaptation_indices = torch.from_numpy(adaptation_indices)
adaptation_data, adaptation_labels = data[adaptation_indices], labels[adaptation_indices]
evaluation_data, evaluation_labels = data[evaluation_indices], labels[evaluation_indices]

From: https://github.com/learnables/learn2learn/blob/f099ddc9ce0c10cff901ecb1acee2838d171272e/examples/vision/maml_miniimagenet.py#L99

Can we use sklearn.model_selection.train_test_split with stratify instead? It splits the samples based on their classes, so it ensures that every class is represented in both the adaptation and evaluation sets.
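
Something like this sketch of what I mean (assuming data and labels are the CPU tensors returned by one task, and shots and ways are defined as in the example):

    from sklearn.model_selection import train_test_split
    import torch

    # Sketch only: in the example each class has 2*shots samples, so with
    # train_size=shots*ways and stratify, every class contributes `shots`
    # samples to the adaptation set and `shots` to the evaluation set.
    # train_test_split also shuffles by default.
    labels_np = labels.numpy()
    (adaptation_data, evaluation_data,
     adaptation_labels, evaluation_labels) = train_test_split(
        data.numpy(), labels_np,
        train_size=shots * ways,
        stratify=labels_np,
    )
    adaptation_data = torch.from_numpy(adaptation_data)
    evaluation_data = torch.from_numpy(evaluation_data)
    adaptation_labels = torch.from_numpy(adaptation_labels)
    evaluation_labels = torch.from_numpy(evaluation_labels)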