orobix / Prototypical-Networks-for-Few-shot-Learning-PyTorch

Implementation of Prototypical Networks for Few-Shot Learning (https://arxiv.org/abs/1703.05175) in PyTorch
MIT License
986 stars 210 forks

Mini Imagenet Results #19

Closed madiltalay closed 5 years ago

madiltalay commented 5 years ago

I have modified the code a bit to run on the mini-ImageNet dataset from this link: https://github.com/renmengye/few-shot-ssl-public. However, it gives a loss of 0 and 100% accuracy, which seems wrong to me. Can you please help?

madiltalay commented 5 years ago

Checking the code line by line, I found that the problem occurs while computing the prototypical loss function. When I compute:

model_output = model(x)
torch.max(model_output)

It gives:

tensor(3.7032, device='cuda:0', grad_fn=<...>)

And: torch.min(model_output) gives:

tensor(0., device='cuda:0', grad_fn=<...>)

But when I compute:

loss, acc = loss_fn(model_output, target=y, n_support=5)

I get loss = 0 and acc = 1.0.

Any help please?
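For context, this symptom is exactly what the prototypical loss produces when an episode contains only one class: with a single prototype, the softmax over classes is trivially 1 for every query point, so the loss is 0 and the accuracy is 1. A minimal sketch of this degenerate case, using hypothetical tensor shapes rather than the repository's loss code:

```python
import torch
import torch.nn.functional as F

# Hypothetical reproduction: if every sample in the episode carries the
# same label, the episode effectively contains a single class, so each
# query point is classified against exactly one prototype.
n_support, n_query = 5, 5
embeddings = torch.randn(n_support + n_query, 64)  # stand-in for model(x)

prototype = embeddings[:n_support].mean(dim=0, keepdim=True)  # (1, 64)
queries = embeddings[n_support:]                              # (5, 64)

# Negative squared Euclidean distance to the single prototype -> (5, 1)
logits = -torch.cdist(queries, prototype) ** 2

# log_softmax over a single class is identically 0 -> loss 0, accuracy 1
loss = -F.log_softmax(logits, dim=1)[:, 0].mean()
acc = (logits.argmax(dim=1) == 0).float().mean()
print(loss.item(), acc.item())  # 0.0 1.0
```

This is why the loss and accuracy look perfect even though the model output itself is an ordinary, non-degenerate tensor.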

madiltalay commented 5 years ago

While debugging further, I realized the real problem lies in the batch_sampler. When I load the dataset with a normal dataloader:

dataloader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)
images, labels = next(iter(dataloader))
labels

It gives me:

tensor([36, 62, 18, 26, 23, 11, 42, 32, 56, 1, 18, 57, 61, 3, 34, 56])

But when I load it with the sampler provided:

dataloader = torch.utils.data.DataLoader(dataset, batch_sampler=sampler)
images, labels = next(iter(dataloader))
labels

It gives me:

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

Hope this would help you guys to help me.
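A correctly built episodic batch should instead contain exactly `classes_per_it` distinct labels with `num_samples` examples each. A hedged, self-contained sketch of that invariant (the dataset, class counts, and episode indices below are made up to stand in for one output of the batch sampler):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical healthy episode: 5 classes x 10 samples, drawn from a toy
# dataset with 20 classes of 50 samples each.
classes_per_it, num_samples = 5, 10
labels = torch.arange(20).repeat_interleave(50)   # 20 classes, 50 samples each
data = torch.randn(len(labels), 8)                # small stand-in features
dataset = TensorDataset(data, labels)

# One episode's indices, as a well-behaved sampler should yield them
episode = [c * 50 + i for c in range(classes_per_it) for i in range(num_samples)]
loader = DataLoader(dataset, batch_sampler=[episode])
_, batch_labels = next(iter(loader))

# The invariant a debugging check can assert on a real batch:
assert batch_labels.unique().numel() == classes_per_it
assert len(batch_labels) == classes_per_it * num_samples
```

Running the same two assertions against a batch from the real sampler would fail immediately on the all-zeros output shown above.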

dnlcrl commented 5 years ago

It looks like the problem you are describing has to do with how you initialize the batch sampler, so you should check that the labels, classes_per_it, num_samples, and iterations passed to the sampler's init are correct.

madiltalay commented 5 years ago

I took a look at the suggested variables:

- labels are taken from the dataset initialized using init_dataset()
- classes_per_it = classes_per_it_tr in mode = train
- num_samples = num_support_tr + num_query_tr, where num_support_tr = 5 and num_query_tr = 5
- iterations = 100

But I am still unable to solve the problem. I also want to ask: the original implementation by Snell has two samplers, SequentialBatchSampler and EpisodicBatchSampler. As I understand it, the sampler under discussion is also an EpisodicBatchSampler. If that's the case, I may give that one a try. Thanks

dnlcrl commented 5 years ago

I'm sorry, but I can't do much without having access to the code. Just understand that PrototypicalBatchSampler takes all the labels and at each iteration yields a set of indices corresponding to num_samples * classes_per_it entries of the original labels list.
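To make that contract concrete, here is a hedged, from-scratch sketch of an episodic batch sampler (the parameter names follow the discussion above, but this is not the repository's exact code). Note that if the labels array passed in is degenerate, e.g. every entry is the same value, then every episode collapses to a single class, which reproduces the all-zeros batches reported above:

```python
import torch

class EpisodicBatchSampler:
    """Sketch: each iteration yields the indices of `num_samples`
    examples for each of `classes_per_it` randomly chosen classes."""

    def __init__(self, labels, classes_per_it, num_samples, iterations):
        self.labels = torch.as_tensor(labels)   # one label per dataset item
        self.classes = self.labels.unique()
        self.classes_per_it = classes_per_it
        self.num_samples = num_samples
        self.iterations = iterations

    def __iter__(self):
        for _ in range(self.iterations):
            batch = []
            # pick classes_per_it distinct classes for this episode
            chosen = self.classes[torch.randperm(len(self.classes))[:self.classes_per_it]]
            for c in chosen:
                # pick num_samples indices belonging to class c
                idxs = torch.nonzero(self.labels == c, as_tuple=True)[0]
                perm = torch.randperm(len(idxs))[:self.num_samples]
                batch.append(idxs[perm])
            yield torch.cat(batch).tolist()

    def __len__(self):
        return self.iterations

# Usage sketch: 20 classes with 30 samples each
labels = torch.arange(20).repeat_interleave(30)
sampler = EpisodicBatchSampler(labels, classes_per_it=5, num_samples=10, iterations=3)
batch = next(iter(sampler))
assert labels[batch].unique().numel() == 5  # not a single class
```

Passing this object as `batch_sampler=` to a DataLoader, with the correct per-sample labels array, should yield batches whose labels span multiple classes rather than a single one.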