beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0
1.49k stars 177 forks

example file, train_msmarco_v2.py is not working #163

Open bakingeol opened 5 months ago

bakingeol commented 5 months ago

Hi, thank you for the great repo.

I'm trying to train a dense retrieval model using the example script (train_msmarco_v2.py), and I've run into a strange situation.

The retriever.fit method makes no progress: the progress bar and GPU utilization are both stuck. Only the model parameters are loaded onto the GPU; the forward and backward passes never seem to run.
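One way to see where a stalled run is actually waiting is Python's standard `faulthandler` module, which can dump the stack of every thread after a timeout. The following is only a minimal sketch, not part of the example script: the helper names `watch_for_hang` and `stop_watching` are hypothetical, and the 60-second interval is an arbitrary assumption.

```python
import faulthandler
import sys


def watch_for_hang(timeout_s: float = 60.0, out=sys.stderr) -> None:
    """Dump the stack of every thread to `out` each `timeout_s` seconds.

    `out` must be a real file object (it needs a file descriptor) and must
    stay open until stop_watching() is called.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=out)


def stop_watching() -> None:
    """Cancel the periodic stack dumps started by watch_for_hang()."""
    faulthandler.cancel_dump_traceback_later()


# Hypothetical usage inside the training script:
#   watch_for_hang(60.0)
#   retriever.fit(...)   # the call that appears to hang
#   stop_watching()
```

If the dumps keep showing the same frame (for example inside data loading or batch preparation), that frame is a reasonable place to start looking.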

I created a new conda environment and installed beir; the Python version is 3.7.16. The only thing I changed is that the triplets file could not be downloaded by the script, so I went to the link, downloaded it manually, and added the jsonl file.
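Since the triplets file was added by hand, it may be worth confirming that it parses as JSON Lines before training starts. This is a minimal sketch under assumptions: `peek_jsonl` is a hypothetical helper, the path in the usage comment is a placeholder, and nothing here checks the specific fields the training script expects.

```python
import gzip
import json
from itertools import islice


def peek_jsonl(path: str, n: int = 3) -> list:
    """Parse and return the first `n` non-blank lines of a .jsonl or .jsonl.gz file."""
    # Choose the opener from the file extension (assumption: .gz means gzip).
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf8") as f:
        non_blank = (line for line in f if line.strip())
        # json.loads raises json.JSONDecodeError on a malformed line.
        return [json.loads(line) for line in islice(non_blank, n)]


# Hypothetical usage (placeholder path):
#   for record in peek_jsonl("path/to/triplets.jsonl.gz"):
#       print(type(record), record)
```

A decode error, or records whose shape differs from what the script reads, would point at the manual download rather than at the training loop.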

For training, I set use_amp to True when running on an A6000 and to False when running on an RTX 4090.

Can you suggest why this example file is not working?

Thank you.

(dense_retr) baekig@rtx02:~/practice/beir_practice/beir/examples/retrieval/training$ python --version
Python 3.7.16
(dense_retr) baekig@rtx02:~/practice/beir_practice/beir/examples/retrieval/training$ python train_msmarco_v2.py
2024-02-13 22:05:39 - Loading Corpus...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8841823/8841823 [00:32<00:00, 274786.45it/s]
2024-02-13 22:06:12 - Loaded 8841823 DEV Documents.
2024-02-13 22:06:13 - Doc Example: {'text': 'The presence of communication amid scientific minds was equally important to the success of the Manhattan Project as scientific intellect was. The only cloud hanging over the impressive achievement of the atomic researchers and engineers is what their success truly meant; hundreds of thousands of innocent lives obliterated.', 'title': ''}
2024-02-13 22:06:13 - Loading Queries...
2024-02-13 22:06:14 - Loaded 6980 DEV Queries.
2024-02-13 22:06:14 - Query Example: how many years did william bradford serve as governor of plymouth colony?
2024-02-13 22:06:14 - loading triplets dataset
9144553it [00:57, 160027.21it/s]
2024-02-13 22:07:11 - model load
Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertModel: ['vocab_transform.weight', 'vocab_transform.bias', 'vocab_projector.bias', 'vocab_layer_norm.weight', 'vocab_layer_norm.bias']
