What does get_accuracy() method mean?

shwangtangjun commented 4 years ago

In train_lshot.py, you evaluate the accuracy in an "unnatural" way. However, I don't understand what you mean by the note that "the labels L1 and L2 may be different".

L1 is the test label, which is just a repeated sequence of several classes, like [4,4,4,4,12,12,12,12,3,3,3,3]. L2 is the output, which is the argmax mapped through the support labels, where the support labels are a sequence of unique classes, [4,12,3]. So the output may look like [4,4,3,4,12,12,4,12,4,3,3,3].

So the accuracy should simply be 9/12 = 75%, right? What are you doing with the Hungarian method? You never mention it in the paper, and it does not appear in the SimpleShot GitHub repository.
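For reference, the plain accuracy on this small example can be checked with NumPy. This is a minimal sketch; the variable names are illustrative and are not taken from train_lshot.py.

```python
import numpy as np

# Arrays copied from the example above; the names are illustrative.
test_label = np.array([4, 4, 4, 4, 12, 12, 12, 12, 3, 3, 3, 3])
out = np.array([4, 4, 3, 4, 12, 12, 4, 12, 4, 3, 3, 3])

# The element-wise comparison gives a boolean array; its mean is the accuracy.
acc = (out == test_label).mean()
print(acc)  # 0.75
```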

imtiazziko commented 4 years ago

Hi @shwangtangjun. Thanks for the comment.

I added it only to make sure the output labels are matched to the ground-truth labels, as is normally done when evaluating clustering accuracy. In fact, I did not use it in the iNat experiment.

Thanks.

shwangtangjun commented 4 years ago

Thanks for the reply.

However, when I change the get_accuracy() method to the ordinary way of calculating accuracy, acc=(out==test_label).mean(), there is a performance drop.

I experimented on the mini-ImageNet dataset with your pretrained ResNet18 and Wide ResNet models. The following are 1-shot accuracy results. The former is the accuracy from the get_accuracy() method, and the latter is from the ordinary one.

ResNet18: 72.11 --> 70.37
Wide ResNet: 74.86 --> 73.36

I think the latter results should be the real accuracy. Could you help explain the difference?

imtiazziko commented 4 years ago

Let me give you an example of why I match the cluster labels with the ground-truth labels:

LaplacianShot can be seen as a constrained clustering method. So when we run LaplacianShot in an unsupervised manner, there is no unique connection between a cluster (a group of elements) and a specific label (e.g. “1”).

So let's say that after running LaplacianShot with features coming from 5 classes, we get the following output (l):

l = [0 4 4 4 0 4 4 4 4 4 4 4 4 4 4 4 1 4 4 1 1 4 2 2 4 2 4 2 4 4 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 4 3 3 3 3 4 3 3 3 2 3 3 3 3 3 2 2 4 4 4 4 4 4 4 4 4 2 2 2 4]
Now the support class labels are: support_label = [7, 0, 12, 4, 1]
With line 588, we get, according to the support labels:

out =[ 7, 1, 1, 1, 7, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 12, 12, 1, 12, 1, 12, 1, 1, 12, 12, 12, 12,12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 1, 4, 4, 4, 4, 1, 4, 4, 4, 12, 4, 4, 4, 4, 4, 12, 12, 1, 1, 1, 1, 1, 1, 1, 1, 1, 12, 12, 12, 1]
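The step from l to out can be reproduced with NumPy integer indexing. This is a sketch that assumes line 588 amounts to indexing support_label with l; only the first five entries of l are used, and the names l_head and out_head are illustrative.

```python
import numpy as np

support_label = np.array([7, 0, 12, 4, 1])
l_head = np.array([0, 4, 4, 4, 0])  # first five entries of l

# Each cluster index selects the support label stored at that position.
out_head = support_label[l_head]
print(out_head.tolist())  # [7, 1, 1, 1, 7]
```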

But the ground-truth test labels we have are:

test_label =[ 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Now if we compare the output labels (out) with the ground truth, you can check that there is a label mismatch between the cluster output (out) and the ground-truth labels (test_label). For example, the 1's in the output (out) should be changed to 7's and vice versa. This is needed before computing the accuracy with respect to the ground-truth labels. This is a linear assignment problem, so I use the Hungarian method to get the matched labelling before computing the accuracy with respect to the ground truth, which gives us the following new output labels (newL2):

newL2 = [ 1., 7., 7., 7., 1., 7., 7., 7., 7., 7., 7., 7., 7., 7., 7., 7., 0., 7., 7., 0., 0., 7., 12., 12., 7., 12., 7., 12., 7., 7., 12., 12., 12., 12., 12., 12., 12., 12., 12., 12., 12., 12., 12., 12., 12., 7., 4., 4., 4., 4., 7., 4., 4., 4., 12., 4., 4., 4., 4., 4., 12., 12., 7., 7., 7., 7., 7., 7., 7., 7., 7., 12., 12., 12., 7.]

So, even though the two solutions (newL2 and out) describe exactly the same grouping, the assigned labels are different. That is why the accuracy differs from what you get with your calculation. LaplacianShot makes a group-wise transductive prediction, and that is why we evaluate it like this. I hope it is clear now why I use it.
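The effect of the matching on this example can be checked numerically. The sketch below is not the repository's get_accuracy(): instead of a Hungarian solver (scipy.optimize.linear_sum_assignment would be the usual choice) it searches all 120 one-to-one relabellings by brute force, which finds the same optimum for 5 classes. It also assumes that out is obtained by indexing support_label with l.

```python
import itertools
import numpy as np

support_label = np.array([7, 0, 12, 4, 1])
# Cluster indices l from the example, 15 query samples per class.
l = np.array([
    0, 4, 4, 4, 0, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    4, 1, 4, 4, 1, 1, 4, 2, 2, 4, 2, 4, 2, 4, 4,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    4, 3, 3, 3, 3, 4, 3, 3, 3, 2, 3, 3, 3, 3, 3,
    2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 2, 2, 2, 4,
])
out = support_label[l]
test_label = np.repeat(support_label, 15)

# Plain accuracy: compare the prediction with the ground truth directly.
plain_correct = int((out == test_label).sum())

# Matched accuracy: try every one-to-one relabelling of the clusters
# and keep the one that agrees most with the ground truth.
best_correct, best_labels = -1, None
for perm in itertools.permutations(support_label.tolist()):
    relabelled = np.array(perm)[l]
    correct = int((relabelled == test_label).sum())
    if correct > best_correct:
        best_correct, best_labels = correct, perm

print(plain_correct, "/", len(l))  # 42 / 75
print(best_correct, "/", len(l))   # 43 / 75
print(best_labels)                 # (1, 0, 12, 4, 7)
```

On these arrays the plain comparison counts 42 of 75 correct (56.0%), while the best relabelling, which exchanges 1 and 7, counts 43 of 75 (about 57.3%).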

This is a technical detail, and the Hungarian method is commonly used in computing clustering accuracy, so I omitted it from the paper.

Thanks.

shwangtangjun commented 4 years ago

Thanks, I understand now. Nice work!

shwangtangjun commented 3 years ago

After reading several other few-shot learning papers that are also based on clustering, I doubt whether it is legitimate to adjust the output according to the ground-truth labels, which from my perspective leaks label information.

In your example, you know that there is a label mismatch between "out" and "test_label" only because you have access to the ground-truth labels before testing. But in that case, why can't we just set out=test_label and achieve 100% accuracy?

In a typical clustering problem where the Hungarian method applies, we use two unsupervised methods and get two similar clusterings, but the clusters may be assigned different ids (an id only distinguishes one cluster from another; it is different from a label). For example: a=[1,1,2,2,3,3,4,4] and b=[3,3,4,4,1,1,2,2]. Note that NO labels are used here. It is a natural observation that clustering a and clustering b are identical.
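The observation that a and b are the same clustering can be checked without any labels. A minimal sketch follows; same_partition is a hypothetical helper written for this illustration.

```python
def same_partition(a, b):
    """Return True if a and b group the samples identically, ignoring ids."""
    if len(a) != len(b):
        return False
    a_to_b, b_to_a = {}, {}
    for x, y in zip(a, b):
        # setdefault stores the first pairing and returns it afterwards,
        # so any later conflicting pairing is detected.
        if a_to_b.setdefault(x, y) != y or b_to_a.setdefault(y, x) != x:
            return False
    return True

a = [1, 1, 2, 2, 3, 3, 4, 4]
b = [3, 3, 4, 4, 1, 1, 2, 2]
print(same_partition(a, b))                         # True
print(same_partition(a, [3, 3, 4, 4, 1, 1, 2, 1]))  # False
```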

However, in your setting, 5-way K-shot few-shot learning, we have labels! In the support set, we are already given 5 class labels, and we assign labels to the query samples with the Laplacian-regularized algorithm in this paper. In your example, if you swap 1 and 7 in out, then you should also swap 1 and 7 in support_label, which in turn swaps 1 and 7 in test_label. As a result, the accuracy will not change at all.
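The swap argument can be illustrated on a toy episode. The arrays below are hypothetical (they are not taken from the example earlier in the thread), and swap_1_and_7 is an illustrative helper.

```python
import numpy as np

def swap_1_and_7(labels):
    """Exchange the class ids 1 and 7, leaving every other id unchanged."""
    labels = np.asarray(labels)
    return np.where(labels == 1, 7, np.where(labels == 7, 1, labels))

# Hypothetical toy episode with two classes, 7 and 1.
test_label = np.array([7, 7, 7, 1, 1, 1])
out = np.array([1, 1, 7, 7, 7, 1])

# Plain accuracy: 2 of 6 correct.
acc_plain = (out == test_label).mean()
# Swapping only the output changes the result: 4 of 6 correct.
acc_out_only = (swap_1_and_7(out) == test_label).mean()
# Swapping the output and the ground truth together: 2 of 6 again.
acc_both = (swap_1_and_7(out) == swap_1_and_7(test_label)).mean()
```

Applying the swap to both sides is just a renaming of classes, so the accuracy is unchanged; only the one-sided swap moves the number.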

Could you explain the legitimacy of your get_accuracy() method? I strongly suspect that it leaks label information.

shwangtangjun commented 3 years ago

A simple example. Two classes, circle and triangle; a one-shot problem with three test samples in each class. The red ones are labeled samples (the shots), and the blue ones are test samples. Here is the ground truth.

Suppose that after some (bad) clustering method, we get the following output.

What should the accuracy be? For me it is 0%. If you disagree and claim that the accuracy should be 100%, then you can stop reading here.

The fundamental difference is that we have labeled examples in few-shot learning, unlike traditional clustering. If you just impose a mapping on the blue test samples and set aside the red labeled samples, then you discard crucial information: the nearby samples No.1 and No.2 belong to different classes in the original output, but belong to the same class in the transformed output. This is a serious mistake.

If you agree that the accuracy is 0%, then let's see what we get if we use a mapping that exchanges circles and triangles. In your view, you may compare the lower two pictures and conclude that the accuracy becomes 100%. However, the mapping should be imposed not only on the output but also on the ground truth; otherwise the same red labeled sample has different labels in the two pictures. We should therefore compare the following two pictures, and the accuracy is still 0%. It does not change after the mapping.
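The circle/triangle argument can be written out as a small check. The pictures are not reproduced here, so the sketch assumes the bad output assigns every test sample to the wrong class; all names are illustrative.

```python
def accuracy(pred, truth):
    """Fraction of positions where the prediction equals the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

truth = ["circle"] * 3 + ["triangle"] * 3
pred = ["triangle"] * 3 + ["circle"] * 3   # assumed output: every sample wrong
support = ["circle", "triangle"]           # known labels of the two shots
swap = {"circle": "triangle", "triangle": "circle"}

plain = accuracy(pred, truth)                    # 0.0
remapped = [swap[p] for p in pred]
inflated = accuracy(remapped, truth)             # 1.0, mapping applied to the output only

# The same mapping would change the labels of the given shots,
# so it is not consistent with the support set.
support_unchanged = [swap[s] for s in support] == support   # False

# Applying the mapping to the ground truth as well restores the original value.
consistent = accuracy(remapped, [swap[t] for t in truth])   # 0.0
```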

"Hungarian method is commonly used in computing clustering accuracy". That's true, but I haven't seen any other paper or code in few-shot learning that uses the Hungarian method, so I doubt its correctness here.

On miniImageNet, the 1-shot and 5-shot accuracies should change as follows, with the corrected values on the right:

ResNet18: 72.11, 82.31 -> 70.37, 82.28
WideResNet: 74.86, 84.13 -> 73.36, 84.11

This is still quite a good result.

imtiazziko commented 3 years ago

OK. Thanks a lot for the explanation. Could you tell me how you calculate the accuracy without the need for a mapping?

shwangtangjun commented 3 years ago

As I said before, acc=(out==test_label).mean(). Here out is the original output.

imtiazziko / LaplacianShot
