Open khuangaf opened 2 years ago
Hi @khuangaf, this codebase is no longer supported, sorry.
In https://github.com/facebookresearch/DPR/blob/main/dpr/models/biencoder.py#L194-L196, `hard_negatives_start_idx` is set to 1, which seems to assume that `len(neg_ctxs) == 0`? In order for `hard_neg_ctx_indices` to index the hard negative passages in `all_ctxs` (https://github.com/facebookresearch/DPR/blob/main/dpr/models/biencoder.py#L207-L215), shouldn't we do:

    hard_negatives_start_idx = len(neg_ctxs) + 1
    hard_negatives_end_idx = len(neg_ctxs) + len(hard_neg_ctxs) + 1

Or perhaps I misunderstood the code here?
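To make the question concrete, here is a minimal toy sketch of the indexing. It is not the DPR code itself: it assumes the contexts are laid out as `[positive_ctx] + neg_ctxs + hard_neg_ctxs`, it assumes the current end index is `1 + len(hard_neg_ctxs)`, and the helper `hard_negative_indices` and the string passages are hypothetical names used only for illustration.

```python
# Toy illustration only; names and data are hypothetical, not taken from DPR.

def hard_negative_indices(num_neg, num_hard_neg, offset=0):
    """Return indices of the hard negatives inside all_ctxs.

    Assumes the layout all_ctxs = [positive] + negatives + hard_negatives,
    with `offset` contexts from earlier samples placed before it.
    """
    start = offset + 1 + num_neg   # skip the positive and the regular negatives
    end = start + num_hard_neg
    return list(range(start, end))


positive_ctx = "pos"
neg_ctxs = ["neg0", "neg1"]
hard_neg_ctxs = ["hard0", "hard1", "hard2"]
all_ctxs = [positive_ctx] + neg_ctxs + hard_neg_ctxs

# Start index fixed at 1, as described above (end index is an assumption).
as_reported = list(range(1, 1 + len(hard_neg_ctxs)))

# Start index shifted past the regular negatives, as proposed above.
proposed = hard_negative_indices(len(neg_ctxs), len(hard_neg_ctxs))

print([all_ctxs[i] for i in as_reported])  # ['neg0', 'neg1', 'hard0']
print([all_ctxs[i] for i in proposed])     # ['hard0', 'hard1', 'hard2']
```

With a start index of 1 the range only lands on the hard negatives when `neg_ctxs` is empty; otherwise it picks up regular negatives first.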
Hi, I'm just puzzled by the same problem and I think you are right.
See #181, #176 and #92 for reference.
The logic for the hard-negative indices is indeed incorrect in the mentioned lines. BUT they are not used in any downstream code. They were added for an experiment that was later removed from the final code; the hard-negative indices are still there but play no role.