facebookresearch / DPR

Dense Passage Retriever (DPR) is a set of tools and models for the open-domain Q&A task.

`cosine_scores` defined in biencoder.py does not work #232

Open xhluca opened 2 years ago

xhluca commented 2 years ago

Reference: https://github.com/facebookresearch/DPR/blob/d9f3e41bb0087687fa182a4d580711188fd82df9/dpr/models/biencoder.py#L57

F.cosine_similarity compares its two inputs position by position along the given dimension, so the remaining dimensions must match or be broadcastable. It does not compute all-pairs similarities. For example, if x is a 10x64 tensor and y is a 20x64 tensor, one would expect cosine_scores to return a 10x20 matrix. Instead, the underlying call fails:

>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(10, 64)
>>> y = torch.randn(20, 64)
>>> F.cosine_similarity(x, y, dim=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: The size of tensor a (10) must match the size of tensor b (20) at non-singleton dimension 0

Since cosine_scores is not used anywhere else in the repo or in the paper, maybe it would be a good idea to remove it?
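If the function were kept rather than removed, one possible fix is to L2-normalize each row and take a matrix product, which yields the expected 10x20 matrix. The following is only a minimal sketch under that assumption: the name pairwise_cosine_scores is hypothetical and is not part of DPR.

```python
import torch
import torch.nn.functional as F


def pairwise_cosine_scores(q_vectors: torch.Tensor, ctx_vectors: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper (not in DPR): all-pairs cosine similarity.

    q_vectors:   (n_q, d) tensor
    ctx_vectors: (n_ctx, d) tensor
    returns:     (n_q, n_ctx) tensor of cosine similarities
    """
    # Scale every row to unit L2 norm so that a dot product equals the cosine.
    q_norm = F.normalize(q_vectors, p=2, dim=1)
    ctx_norm = F.normalize(ctx_vectors, p=2, dim=1)
    # (n_q, d) @ (d, n_ctx) -> (n_q, n_ctx)
    return q_norm @ ctx_norm.T


x = torch.randn(10, 64)
y = torch.randn(20, 64)
scores = pairwise_cosine_scores(x, y)
print(scores.shape)
```

Each entry scores[i, j] should agree, up to floating-point error, with F.cosine_similarity applied to row i of x and row j of y.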