beir-cellar / beir

A heterogeneous benchmark for information retrieval. Easy to use: evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0
1.57k stars 191 forks

Evaluation code for ColBERT #67

Open rainatam opened 2 years ago

rainatam commented 2 years ago

Hi,

I want to evaluate ColBERT on BEIR but I couldn't find any example related to it.

You mentioned in the paper that it involves dense retrieval and re-ranking with ColBERT. Could you please share the code about that?

Thank you very much!

Xiao9905 commented 2 years ago

@NThakur20 Hi,

I am also looking for this evaluation code. It seems there is some complicated logic involved. Could you please give some hints on when you will release it?

Many thanks!

thakur-nandan commented 2 years ago

Hi @rainatam and @Xiao9905, I worked on the evaluation scripts and released the ColBERT evaluation code here today: https://github.com/NThakur20/beir-ColBERT.

Please read the README in the repository for evaluation details; it contains a single script, evaluate_beir.sh, which handles all preprocessing, encoding, indexing, retrieval, and evaluation.

Hope it helps! Let me know in case something is broken or not working.

Kind Regards, Nandan Thakur
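For reference, the NDCG@k reported at the end of such a pipeline can be computed from a results dict of the shape BEIR uses throughout ({qid: {doc_id: score}}). The sketch below is a minimal stdlib illustration of the metric, not the repository's actual implementation (BEIR itself delegates evaluation to pytrec_eval):

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: rel / log2(rank + 1), ranks starting at 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg_at_k(results, qrels, k=10):
    # results: {qid: {doc_id: score}}, qrels: {qid: {doc_id: relevance}}.
    scores = []
    for qid, docs in results.items():
        # Rank documents by descending score and keep the top k.
        ranked = sorted(docs, key=docs.get, reverse=True)[:k]
        gains = [qrels.get(qid, {}).get(d, 0) for d in ranked]
        # Ideal ranking: the query's relevance grades sorted descending.
        ideal = sorted(qrels.get(qid, {}).values(), reverse=True)[:k]
        idcg = dcg(ideal)
        scores.append(dcg(gains) / idcg if idcg > 0 else 0.0)
    return sum(scores) / len(scores)
```

For example, a query whose single relevant document is ranked first scores NDCG@10 = 1.0; pushed to rank 2, it scores 1/log2(3).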

rainatam commented 2 years ago

Hi @NThakur20 ,

Thanks for sharing!

The README and script are very clear and I can run the code successfully. Now I can reproduce your ColBERT results.

Btw, I noticed that the number of partitions for the IVFPQ index varies across datasets. Could you please share those parameters as well (if you kept them)?

Again, thank you for your work.
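The exact partition counts were not shared in this thread, so they stay unknown here. As a hedged illustration only, a common FAISS rule of thumb sizes the number of IVF partitions (nlist) on the order of the square root of the number of embeddings, rounded to a power of two; the helper name below is hypothetical and is not from beir-ColBERT:

```python
import math

def suggest_nlist(num_embeddings, min_nlist=256):
    # Common FAISS heuristic (illustrative, not beir-ColBERT's actual values):
    # nlist ~ sqrt(N), rounded to the nearest power of two so inverted lists
    # stay reasonably balanced; floor it at min_nlist for small collections.
    nlist = 2 ** round(math.log2(math.sqrt(num_embeddings)))
    return max(min_nlist, nlist)
```

Under this heuristic, a corpus of one million token embeddings would get 1024 partitions, while a corpus of ten thousand would fall back to the 256 floor.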

cadurosar commented 2 years ago

Hi @NThakur20, thanks for sharing the code; it is really easy to test.

However, there seems to be a problem when evaluating Arguana with ColBERT (something I had been having trouble with as well, because I could never reproduce the results). The provided code does not remove the query itself from the retrieved document list, which forces ColBERT's NDCG@1 to 0 (in Arguana every query is also a document in the corpus, so the top hit is the query itself). Fixing it is easy; just update the reading part to:

    #### Results ####
    for _, row in tsv_reader(rankings):
        qid, doc_id, rank = row[0], row[1], int(row[2])
        # Skip entries where the retrieved document is the query itself
        if qid != inv_map[str(doc_id)]:
            if qid not in results:
                results[qid] = {inv_map[str(doc_id)]: 1 / (rank + 1)}
            else:
                results[qid][inv_map[str(doc_id)]] = 1 / (rank + 1)

This improves Arguana NDCG@10 from 0.2985 to 0.4042.

I don't remember if there are other datasets where this could be a problem.
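The fix above can be isolated into a small, self-contained helper. The results shape, inv_map, and 1/(rank + 1) scoring are taken from the snippet; the function name read_rankings and the in-memory rows argument (standing in for tsv_reader over the rankings file) are hypothetical:

```python
def read_rankings(rows, inv_map):
    # rows: (qid, doc_id, rank) triples as in ColBERT's rankings TSV.
    # inv_map maps ColBERT's internal integer doc ids (as strings)
    # back to BEIR doc ids.
    results = {}
    for qid, doc_id, rank in rows:
        beir_id = inv_map[str(doc_id)]
        if qid == beir_id:
            continue  # the query retrieved itself: drop it (Arguana/Quora fix)
        results.setdefault(qid, {})[beir_id] = 1 / (rank + 1)
    return results
```

With a toy inv_map of {"0": "q1", "1": "d5"}, the rows [("q1", 0, 0), ("q1", 1, 1)] yield {"q1": {"d5": 0.5}}: the self-match at rank 0 is discarded.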

thakur-nandan commented 2 years ago

Thanks, @cadurosar for this! Yes, this problem would be seen for Arguana and Quora. Will update the necessary code!

yakkanti commented 1 year ago

@thakur-nandan, do you have a CPU version of the YAML environment file by any chance? Or a Google Colab notebook?

zt991211 commented 1 year ago

@thakur-nandan I cannot load the model weights; do you have any solution?

    RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1595629403081/work/torch/lib/c10d/ProcessGroupNCCL.cpp:32, unhandled cuda error, NCCL version 2.4.8

zt991211 commented 1 year ago

@rainatam @Xiao9905 @cadurosar Have you encountered problems like this?

cadurosar commented 1 year ago

Not at the time, but it has been a while... Do you have more than one GPU on the machine? Maybe try single-GPU in that case? (CUDA_VISIBLE_DEVICES=0 ...)
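That suggestion can also be applied from inside the launching script; the helper name below is illustrative, not part of beir-ColBERT. The key point is that CUDA_VISIBLE_DEVICES must be set before the process makes its first CUDA call, or it has no effect:

```python
import os

def pin_single_gpu(gpu_index=0):
    # Restrict this process (and anything torch.distributed spawns from it)
    # to a single GPU. Must run before CUDA is initialized to take effect.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return os.environ["CUDA_VISIBLE_DEVICES"]
```

Equivalently, prefix the command on the shell line, as suggested above: CUDA_VISIBLE_DEVICES=0 python -m ...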

zt991211 commented 1 year ago

> Not at the time, but it has been a while... Do you have more than one GPU on the machine? Maybe try single-GPU in that case? (CUDA_VISIBLE_DEVICES=0 ...)

Thank you! The script uses torch.distributed.launch. If I use a single GPU, does it still work? Should I drop the distributed setup?

zt991211 commented 1 year ago

    ubuntu:107847:107847 [0] NCCL INFO Bootstrap : Using [0]ens31f0:192.168.2.107<0>
    ubuntu:107847:107847 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
    ubuntu:107847:107847 [0] NCCL INFO NET/IB : No device found.
    ubuntu:107847:107847 [0] NCCL INFO NET/Socket : Using [0]ens31f0:192.168.2.107<0>
    NCCL version 2.4.8+cuda10.1
    ubuntu:107848:107848 [0] NCCL INFO Bootstrap : Using [0]ens31f0:192.168.2.107<0>
    ubuntu:107848:107848 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so).
    ubuntu:107848:107848 [0] NCCL INFO NET/IB : No device found.
    ubuntu:107848:107848 [0] NCCL INFO NET/Socket : Using [0]ens31f0:192.168.2.107<0>
    ubuntu:107847:107878 [0] NCCL INFO Setting affinity for GPU 0 to 0fffff,ff000000,0fffffff
    ubuntu:107848:107879 [0] NCCL INFO Setting affinity for GPU 0 to 0fffff,ff000000,0fffffff
    ubuntu:107847:107878 [0] NCCL INFO Channel 00 : 0 1
    ubuntu:107847:107878 [0] NCCL INFO Ring 00 : 0[0] -> 1[0] via P2P/IPC
    ubuntu:107848:107879 [0] NCCL INFO Ring 00 : 1[0] -> 0[0] via P2P/IPC
    ubuntu:107847:107878 [0] NCCL INFO Using 256 threads, Min Comp Cap 8, Trees disabled
    ubuntu:107848:107879 [0] NCCL INFO comm 0x7fab54002730 rank 1 nranks 2 cudaDev 0 nvmlDev 0 - Init COMPLETE

    ubuntu:107848:107848 [0] enqueue.cc:197 NCCL WARN Cuda failure 'invalid device function'
    ubuntu:107848:107848 [0] NCCL INFO misc/group.cc:148 -> 1
    ubuntu:107847:107878 [0] NCCL INFO comm 0x7f2b18002730 rank 0 nranks 2 cudaDev 0 nvmlDev 0 - Init COMPLETE
    ubuntu:107847:107847 [0] NCCL INFO Launch mode Parallel

    ubuntu:107847:107847 [0] enqueue.cc:197 NCCL WARN Cuda failure 'invalid device function'
    ubuntu:107847:107847 [0] NCCL INFO misc/group.cc:148 -> 1
    Traceback (most recent call last):
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    Traceback (most recent call last):
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "main", mod_spec)
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/huangchen/beir-ColBERT/colbert/index.py", line 58, in <module>
        main()
      File "/home/huangchen/beir-ColBERT/colbert/index.py", line 25, in main
        args = parser.parse()
      File "/home/huangchen/beir-ColBERT/colbert/utils/parser.py", line 110, in parse
        Run.init(args.rank, args.root, args.experiment, args.run)
      File "/home/huangchen/beir-ColBERT/colbert/utils/runs.py", line 51, in init
        distributed.barrier(rank)
      File "/home/huangchen/beir-ColBERT/colbert/utils/distributed.py", line 25, in barrier
        torch.distributed.barrier()
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 1710, in barrier
        "main", mod_spec)
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/huangchen/beir-ColBERT/colbert/index.py", line 58, in <module>
        main()
      File "/home/huangchen/beir-ColBERT/colbert/index.py", line 25, in main
        args = parser.parse()
        work = _default_pg.barrier()
      File "/home/huangchen/beir-ColBERT/colbert/utils/parser.py", line 110, in parse
    RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1595629403081/work/torch/lib/c10d/ProcessGroupNCCL.cpp:32, unhandled cuda error, NCCL version 2.4.8
        Run.init(args.rank, args.root, args.experiment, args.run)
      File "/home/huangchen/beir-ColBERT/colbert/utils/runs.py", line 51, in init
        distributed.barrier(rank)
      File "/home/huangchen/beir-ColBERT/colbert/utils/distributed.py", line 25, in barrier
        torch.distributed.barrier()
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 1710, in barrier
        work = _default_pg.barrier()
    RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1595629403081/work/torch/lib/c10d/ProcessGroupNCCL.cpp:32, unhandled cuda error, NCCL version 2.4.8
    Traceback (most recent call last):
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "main", mod_spec)
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
        main()
      File "/data/huangchen/anaconda3/envs/colbert-v0.2/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
        cmd=cmd)