Closed hanoonaR closed 2 years ago
Hello @hanoonaR, and thanks for your interest in our work!
In CompCos we perform the dot product between two matrices with normalized rows, i.e. `img_feats_normed` (normalized here) and `pair_embed` (normalized here). Since the rows/vectors are unit-norm, their dot product is equivalent to their cosine similarity.
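To make the equivalence concrete, here is a minimal, self-contained check (the shapes are toy values and not taken from the repository; only the tensor names follow the thread): the matmul of L2-normalized features matches an explicit cosine-similarity computation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-ins for the projected image features and composition embeddings.
img_feats = torch.randn(4, 300)    # batch of 4 image features
pair_embed = torch.randn(300, 10)  # 10 composition embeddings as columns

# L2-normalize the image features along their rows and the
# composition embeddings along their columns.
img_feats_normed = F.normalize(img_feats, dim=1)
pair_embed_normed = F.normalize(pair_embed, dim=0)

# Dot products of unit-norm vectors...
pair_pred = torch.matmul(img_feats_normed, pair_embed_normed)

# ...match an explicit cosine-similarity computation (broadcast over pairs).
cos = F.cosine_similarity(
    img_feats.unsqueeze(2),   # (4, 300, 1)
    pair_embed.unsqueeze(0),  # (1, 300, 10)
    dim=1,
)
print(torch.allclose(pair_pred, cos, atol=1e-6))  # True
```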
Thank you for the quick response and explanation. Loved your work.
Hi, thank you for sharing your great work. I have a question regarding the paper "Open World Compositional Zero-Shot Learning", which introduces Compositional Cosine Logits (CompCos).
The paper explains that the image features and the compositions are projected into a shared semantic space. The objective function is a cross-entropy loss in which the logits of the classification layer are replaced by the cosine similarities between the image features and the compositional embeddings in that shared space. However, in the code, the projected image features are simply multiplied with the projected compositional embeddings, and cross entropy is applied directly:
```python
pair_pred = torch.matmul(img_feats_normed, pair_embed)
loss_cos = F.cross_entropy(self.scale * pair_pred, pairs)
```
https://github.com/ExplainableML/czsl/blob/6582c0f5de82671a8750da95f4dc280c9bef5213/models/compcos.py#L253

Is there something that I have misunderstood? It would be great if you could point out where the cosine similarity is applied in the loss function. Thank you.
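For reference, a self-contained sketch of the loss with the normalization step written out explicitly; the names follow the snippet above, but the shapes, the scale value, and the layout of `pair_embed` are illustrative assumptions, not the repository's exact code. Once the vectors are unit-norm, the matmul logits are cosine similarities, so the cross entropy is applied to scaled cosine logits exactly as the paper describes.

```python
import torch
import torch.nn.functional as F

scale = 20.0  # stand-in for self.scale in the repo; the actual value is a hyperparameter

img_feats = torch.randn(4, 300)     # projected image features (toy shapes)
pair_embeds = torch.randn(10, 300)  # projected composition embeddings
pairs = torch.randint(0, 10, (4,))  # ground-truth composition indices

# The normalization happens before the matmul, so the resulting
# logits are cosine similarities rather than raw dot products.
img_feats_normed = F.normalize(img_feats, dim=1)
pair_embed = F.normalize(pair_embeds, dim=1).t()  # (300, 10)

pair_pred = torch.matmul(img_feats_normed, pair_embed)  # cosine logits
loss_cos = F.cross_entropy(scale * pair_pred, pairs)
```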