Open alirezazareian opened 6 years ago
Hi guys, does anybody know how many training epochs and episodes per epoch were used to reproduce the paper's results?
I have the same problem. My re-implementation results are much lower than the reported results on the miniImageNet dataset.
How do I run the code with the miniImageNet dataset? I just replaced the line with default_dataset = 'miniImagenet', but it doesn't work.
I have implemented one which got 49.1/66.9, still slightly worse than the paper's.
You could check the code if it helps.
A simple modification that reproduces the results is to scale the output of the Euclidean distance. That is,
feature_dims = 1600  # 1600 for miniImageNet, 64 for Omniglot
learnable_scale = nn.Parameter(torch.FloatTensor(1).fill_(1.0), requires_grad=True)
dist = learnable_scale * euclidean_dist(x, y) / feature_dims
With this change, I am able to get 1-shot: 50.87%, 5-shot: 68.21%.
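As a rough illustration of what dividing by the feature dimension does, here is a minimal standard-library sketch. The toy vectors, the helper names `softmax` and `neg_sq_dist_logits`, and the use of negative squared distances as logits are all assumptions made for this example and are not taken from the thread's code. It only shows that the division shrinks the logit magnitudes so the softmax is less saturated; whether that explains the accuracy gain reported above is not confirmed here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def neg_sq_dist_logits(query, prototypes, scale=1.0, feature_dims=1):
    """Hypothetical helper: logits = -scale * squared distance / feature_dims."""
    logits = []
    for proto in prototypes:
        sq_dist = sum((q - p) ** 2 for q, p in zip(query, proto))
        logits.append(-scale * sq_dist / feature_dims)
    return logits


feature_dims = 1600  # flattened embedding size quoted for miniImageNet

# Toy data (assumed): the query is slightly closer to prototype A than to B.
query = [0.0] * feature_dims
proto_a = [0.10] * feature_dims
proto_b = [0.12] * feature_dims

# Without the division, the small per-dimension gap is summed over 1600
# dimensions, so the softmax is almost one-hot.
raw_probs = softmax(neg_sq_dist_logits(query, [proto_a, proto_b]))

# With the division, the logits stay small and the softmax is much softer.
scaled_probs = softmax(
    neg_sq_dist_logits(query, [proto_a, proto_b], feature_dims=feature_dims)
)

print("unscaled:", raw_probs)
print("scaled:  ", scaled_probs)
```

In the snippet above, the learnable scale starts at 1.0, so training begins from the softer distribution and can sharpen it again if that helps.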
@bilylee Hi, I tried this, but I still got only a small improvement in the 5-shot scenario (specifically 67.1%). Here is the code snippet:
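import torch
import torch.nn as nn
import torch.nn.functional as F

# conv_block, euclidean_metric and count_acc are defined elsewhere in the poster's code.
class Convnet(nn.Module):

    def __init__(self, x_dim=3, hid_dim=64, z_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(x_dim, hid_dim),
            conv_block(hid_dim, hid_dim),
            conv_block(hid_dim, hid_dim),
            conv_block(hid_dim, z_dim),
        )
        self.out_channels = 1600  # flattened embedding size for miniImageNet
        self.scale = nn.Parameter(torch.FloatTensor(1).fill_(1.0), requires_grad=True)

    def forward(self, x):
        x = self.encoder(x)
        return x.view(x.size(0), -1)  # flatten to (batch, features)

    def loss(self, data, num_way, num_support, num_query):
        # The first num_support * num_way samples are the support set; the rest are queries.
        p = num_support * num_way
        data_shot, data_query = data[:p], data[p:]

        # Average the support embeddings per class to get the prototypes.
        proto = self.forward(data_shot)
        proto = proto.reshape(num_support, num_way, -1).mean(dim=0)

        label = torch.arange(num_way).repeat(num_query)
        label = label.type(torch.cuda.LongTensor)  # requires a CUDA device

        # Scaled logits, normalized by the feature dimension.
        logits = self.scale * euclidean_metric(self.forward(data_query), proto) / self.out_channels
        loss = F.cross_entropy(logits, label)
        acc = count_acc(logits, label)
        return loss, acc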
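One detail in the snippet above that is easy to get wrong in a re-implementation: `proto.reshape(num_support, num_way, -1).mean(dim=0)` and `torch.arange(num_way).repeat(num_query)` only line up with the true classes if the sampler emits each episode with the class index cycling fastest (sample `i` belongs to class `i % num_way`). The following standard-library sketch checks that layout on plain lists. The helper names `episode_layout` and `columns_after_reshape` are hypothetical, and the thread does not say how any particular data loader orders its episodes.

```python
def episode_layout(num_way, num_support, num_query):
    """Class index of each sample when the class index cycles fastest."""
    support = [i % num_way for i in range(num_support * num_way)]
    query = [j % num_way for j in range(num_query * num_way)]
    return support, query


def columns_after_reshape(flat, num_rows, num_cols):
    """Mimic a row-major reshape to (num_rows, num_cols) and return its columns.

    Averaging over dim 0 of the reshaped tensor averages each column, so every
    column must contain samples of a single class.
    """
    rows = [flat[r * num_cols:(r + 1) * num_cols] for r in range(num_rows)]
    return [[row[c] for row in rows] for c in range(num_cols)]


num_way, num_support, num_query = 5, 2, 3
support_classes, query_classes = episode_layout(num_way, num_support, num_query)

# With the interleaved layout, column c holds only class c.
good_columns = columns_after_reshape(support_classes, num_support, num_way)

# With a class-major layout (all shots of class 0, then class 1, ...),
# the same reshape mixes classes inside each prototype.
class_major = [c for c in range(num_way) for _ in range(num_support)]
bad_columns = columns_after_reshape(class_major, num_support, num_way)

print("interleaved:", good_columns)
print("class-major:", bad_columns)
```

If a data loader yields class-major episodes instead, reshaping to `(num_way, num_support, -1)` and averaging over dimension 1 would be the matching choice, and the query labels would need the same ordering.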
How do I run the code with the miniImageNet dataset? I just replaced the line with default_dataset = 'miniImagenet', but it doesn't work.
So do you know how to run it with miniImageNet now? I think there is no code related to the miniImageNet dataset.
Dear Jake,
I have been trying to reproduce your results for miniImageNet, but there is a large gap between what I can get and what is reported in the paper. I get 47.21% for 5-way 1-shot and 62.63% for 5-way 5-shot, while they should be 49.42% and 68.2%. I have used both your code and my own implementation based on TensorFlow. The code here (https://github.com/abdulfatir/prototypical-networks-tensorflow/blob/master/ProtoNet-MiniImageNet-v2.ipynb) also gets similar results.
Is there any trick that I am missing? Can you point to something in the above link that should be changed to improve the results? I also tried learning rate decay, which helped slightly, but there is still a large gap.
Thanks in advance for your help; Ali