Open kinglai opened 3 years ago
It seems you're doing brute-force search.
```python
doc_cls_reps = torch.load(os.path.join(args.doc_shard, 'cls_reps.pt')).float()
......
for batch_start in trange(0, len(all_query_offsets), batch_size, desc=shard_name):
    batch_q_reps = query_cls_reps[batch_start: batch_start + batch_size]
    match_scores = torch.matmul(batch_q_reps, doc_cls_reps.transpose(0, 1))  # D * b
    batched_qtok_offsets = defaultdict(list)
```
Awesome idea and exciting experimental results. Still, I am confused about the implementation of COIL-full: for the dense retrieval part, can ANN search with FAISS be used to speed things up, or is brute-force search really what is intended? Which implementation was used in the paper's experiments?
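To make the question concrete, here is a minimal sketch of the two options I have in mind, covering the CLS inner-product part only. It is not the repository's code: the function names `brute_force_topk` and `faiss_topk`, and the `nlist` and `nprobe` values, are my own assumptions for illustration. The FAISS variant would also only return approximate CLS candidates, so I am not sure how it would combine with the token-level scores.

```python
import numpy as np


def brute_force_topk(q_reps, doc_reps, k):
    """Exact search: score every query against every document, then keep the top k."""
    scores = q_reps @ doc_reps.T  # shape: (num_queries, num_docs)
    top_ids = np.argsort(-scores, axis=1)[:, :k]
    top_scores = np.take_along_axis(scores, top_ids, axis=1)
    return top_scores, top_ids


def faiss_topk(q_reps, doc_reps, k, nlist=4, nprobe=2):
    """Approximate search with a FAISS IVF index (needs the faiss package).

    nlist and nprobe are hypothetical values; inputs must be float32 arrays.
    """
    import faiss  # imported lazily so the exact path works without FAISS

    dim = doc_reps.shape[1]
    quantizer = faiss.IndexFlatIP(dim)
    index = faiss.IndexIVFFlat(quantizer, dim, nlist, faiss.METRIC_INNER_PRODUCT)
    index.train(doc_reps)   # learn the coarse clusters
    index.add(doc_reps)     # assign documents to clusters
    index.nprobe = nprobe   # number of clusters visited per query
    return index.search(q_reps, k)  # (scores, ids), same layout as above
```

Both functions return `(scores, ids)` with one row per query, so in principle the ANN call could be swapped in for the `torch.matmul` step if approximate CLS scores are acceptable.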