guyera / Generalized-ODIN-Implementation


Cosine similarity definition #8

Open joacojiro opened 3 years ago

joacojiro commented 3 years ago

Hi, first of all thanks so much for this code!

I'm looking at your cosine similarity code:

class CosineDeconf(nn.Module):

    def __init__(self, in_features, num_classes):
        super(CosineDeconf, self).__init__()
        self.h = nn.Linear(in_features, num_classes, bias=False)
        self.init_weights()

    def init_weights(self):
        nn.init.kaiming_normal_(self.h.weight.data, nonlinearity="relu")

    def forward(self, x):
        x = norm(x)
        w = norm(self.h.weight)
        return torch.matmul(x, w.T)

In the paper, h(x) is defined as:

h_i(x) = (w_i^T f(x)) / (‖w_i‖ ‖f(x)‖)

I believe ret calculates only the product of the norms (the divisor in the cosine similarity formula). Could it be that the dividend (the inner product) is missing? Thanks again! Joaquin, from Argentina

guyera commented 3 years ago

norm(x) is defined here. It returns the normalized input. So norm(x) is not the norm of x; it is the unit vector pointing in the same direction as x.
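For reference, here is a minimal NumPy sketch of what such a row-wise L2 normalization does (the repo's `norm` is the PyTorch equivalent of this; the `eps` guard against zero rows is an assumption, not necessarily part of the original):

```python
import numpy as np

def normalize(x, eps=1e-12):
    """Divide each row by its L2 norm, yielding unit vectors."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

x = np.array([[3.0, 4.0]])
u = normalize(x)
# u is the unit vector in the direction of x: [[0.6, 0.8]]
```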

Cosine similarity is the inner product of two vectors divided by the product of their norms. Because norms are scalars and inner products are bilinear, this is the same as the inner product of the two unit-normalized vectors, which is exactly what ret computes.
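The equivalence can be checked numerically in a few lines of NumPy (illustrative only, with arbitrary random vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5,))
w = rng.normal(size=(5,))

# Cosine similarity: inner product divided by the product of the norms
cos_direct = x @ w / (np.linalg.norm(x) * np.linalg.norm(w))

# Equivalent: inner product of the two unit-normalized vectors,
# which is what forward() computes via norm(x) and norm(w)
cos_normed = (x / np.linalg.norm(x)) @ (w / np.linalg.norm(w))

assert np.isclose(cos_direct, cos_normed)
```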