McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
https://mcgill-nlp.github.io/llm2vec/
MIT License

BUG when testing results of MTEB retrieval #132

Open tianyumyum opened 1 month ago

tianyumyum commented 1 month ago

I was running:

    mteb run -m McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-supervised -t DBPedia ArguAna NFCorpus FiQA2018 --output_folder results_0 --device 0

But I got:

    File "/mnt/ceph_home/zhongtianyun2023/miniconda3/envs/mteb_new/lib/python3.9/multiprocessing/pool.py", line 771, in get
        raise self._value
    AttributeError: 'LlamaBiModel' object has no attribute 'rotary_emb'
    Batches:   0%|          | 0/1 [00:07<?, ?it/s]
    [W729 12:01:59.244093981 CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

vaibhavad commented 1 month ago

Hi @tianyumyum,

What version of transformers are you using? llm2vec supports transformers>=4.39.1,<=4.40.2.
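As a quick sanity check before running evaluation, you can verify that the installed transformers version falls inside the supported window. The `in_supported_range` helper below is a sketch of mine, not part of llm2vec, and it assumes plain `X.Y.Z` version strings:

```python
from importlib.metadata import PackageNotFoundError, version

def in_supported_range(ver, low="4.39.1", high="4.40.2"):
    # Assumes plain "X.Y.Z" version strings (no .dev/.rc suffixes).
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(low) <= parse(ver) <= parse(high)

try:
    installed = version("transformers")
    status = "supported" if in_supported_range(installed) else "OUTSIDE the supported range"
    print(f"transformers {installed} is {status}")
except (PackageNotFoundError, ValueError):
    print("transformers is not installed, or has a non-numeric version string")
```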

lst627 commented 1 month ago

I encountered the same issue yesterday while using this model. It started working after I downgraded transformers to version 4.40.2.

tianyumyum commented 1 month ago

Thanks! I can run it now. @vaibhavad @lst627

tianyumyum commented 1 month ago

@vaibhavad Hello! I have created a new conda environment with all the packages as in this issue. I don't get an error when using the command line, but I get a Bus error (core dumped) when running from a Python file. I ran the same code as listed above on this site:

import mteb

if __name__ == "__main__":
    model_name = "McGill-NLP/LLM2Vec-Sheared-LLaMA-mntp-supervised"
    model = mteb.get_model(model_name)
    tasks = mteb.get_tasks(tasks=["AmazonCounterfactualClassification"])
    evaluation = mteb.MTEB(tasks=tasks)
    results = evaluation.run(model, output_folder="results")

The same bug also occurs when running experiments/mteb_eval.py.
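A `Bus error (core dumped)` that appears in a Python script but not on the command line is often a symptom of exhausted shared memory (`/dev/shm`) in PyTorch DataLoader worker processes; that is an assumption on my part, not something confirmed in this thread. A minimal diagnostic sketch for Linux:

```python
import os
import shutil

def shm_free_gib(path="/dev/shm"):
    # Report free space at `path` in GiB. PyTorch DataLoader workers pass
    # tensors through /dev/shm, and exhausting it can surface as a bus
    # error rather than a Python exception.
    return shutil.disk_usage(path).free / 1024**3

if __name__ == "__main__":
    # Fall back to "/" on systems without /dev/shm.
    target = "/dev/shm" if os.path.isdir("/dev/shm") else "/"
    print(f"{target} free: {shm_free_gib(target):.2f} GiB")
```

If the reported free space is small, increasing the container's shared-memory size (e.g. `--shm-size` for Docker) or reducing the encode batch size are common mitigations.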