luyug / COIL

NAACL2021 - COIL Contextualized Lexical Retriever
Apache License 2.0

Question about COIL-full #12

Open kinglai opened 2 years ago

kinglai commented 2 years ago

Awesome idea and exciting experimental results. Still, I am confused about the implementation of COIL-full: for the dense retrieval part, can we use ANN search with FAISS to speed it up, or is it really brute-force search? Which implementation was used in the paper's experiments?

kinglai commented 2 years ago

It seems you're doing brute-force search.

doc_cls_reps = torch.load(os.path.join(args.doc_shard, 'cls_reps.pt')).float()

......

for batch_start in trange(0, len(all_query_offsets), batch_size, desc=shard_name):
    batch_q_reps = query_cls_reps[batch_start: batch_start + batch_size]
    match_scores = torch.matmul(batch_q_reps, doc_cls_reps.transpose(0, 1))  # D * b

    batched_qtok_offsets = defaultdict(list)
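
To make the point concrete, here is a minimal NumPy sketch of what exhaustive (brute-force) inner-product search amounts to: every query is scored against every document with one matrix product, and the top k are then selected per query. This is only an illustration, not the repository's code. The function name `brute_force_topk`, the toy random vectors, and the choice of `k` are all assumptions made for the example.

```python
import numpy as np


def brute_force_topk(query_reps, doc_reps, k):
    """Exact inner-product search (hypothetical helper, not from COIL).

    query_reps: (b, d) array of query vectors
    doc_reps:   (D, d) array of document vectors
    Returns (ids, scores), each of shape (b, k), sorted by descending score.
    """
    # (b, d) @ (d, D) -> (b, D): one score for every query/document pair.
    scores = query_reps @ doc_reps.T
    k = min(k, scores.shape[1])
    # argpartition selects the k largest per row without sorting all D scores.
    top = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    top_scores = np.take_along_axis(scores, top, axis=1)
    # Sort only the k selected candidates by descending score.
    order = np.argsort(-top_scores, axis=1)
    ids = np.take_along_axis(top, order, axis=1)
    return ids, np.take_along_axis(top_scores, order, axis=1)


# Toy data (assumed for illustration): 1000 documents, 2 queries, dimension 8.
rng = np.random.default_rng(0)
docs = rng.standard_normal((1000, 8)).astype(np.float32)
queries = rng.standard_normal((2, 8)).astype(np.float32)
ids, scores = brute_force_topk(queries, docs, k=5)
```

The cost grows linearly with the number of documents, which is why an ANN index is the usual alternative when the exact top k is not required.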