Georgepitt commented 6 months ago

Hello, the computing cluster provided by the lab needs to run offline. But the code in usage needs to be networked. I have changed the code to the offline version, but it still gives errors. Can you give me some help, please?

usage code:

Loading base Mistral model, along with custom code that enables bidirectional connections in decoder-only LLMs.

tokenizer = AutoTokenizer.from_pretrained( "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp" ) config = AutoConfig.from_pretrained( "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp", trust_remote_code=True ) model = AutoModel.from_pretrained( "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp", trust_remote_code=True, config=config, torch_dtype=torch.bfloat16, device_map="cuda" if torch.cuda.is_available() else "cpu", )

Loading MNTP (Masked Next Token Prediction) model.

model = PeftModel.from_pretrained( model, "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp", )

Modified code： local_base_model_path = "/home/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp" tokenizer = AutoTokenizer.from_pretrained(local_base_model_path) config = AutoConfig.from_pretrained(local_base_model_path) model = AutoModel.from_pretrained(local_base_model_path, config=config,torch_dtype=torch.bfloat16, local_files_only=True) print(4)

errors: Traceback (most recent call last): File "/share/home/chenyuxuan/Llama3_8b_s.py", line 59, in model = AutoModel.from_pretrained(local_model_path, config=config,torch_dtype=torch.bfloat16, local_files_only=True) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained return model_class.from_pretrained( File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3385, in from_pretrained if has_file(pretrained_model_name_or_path, TF2_WEIGHTS_NAME, has_file_kwargs): File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/transformers/utils/hub.py", line 627, in has_file r = requests.head(url, headers=headers, allow_redirects=False, proxies=proxies, timeout=10) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/api.py", line 100, in head return request("head", url, kwargs) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/api.py", line 59, in request return session.request(method=method, url=url, kwargs) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/sessions.py", line 589, in request resp = self.send(prep, send_kwargs) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/sessions.py", line 703, in send r = adapter.send(request, **kwargs) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/adapters.py", line 532, in send raise ReadTimeout(e, request=request) requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)

vaibhavad commented 6 months ago

Hi @Georgepitt, thanks for your interest in our work.

We also faced this issue during our model development, PEFT library pushed a fix 2 months ago so the latest version should support offline loading.

Here is what I have tried on my end

First, download the models with internet connection


from huggingface_hub import snapshot_download
snapshot_download(repo_id="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp")
# this command will return local address, lets call it <MNTP_LOCAL_PATH>

snapshot_download(repo_id="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse")

this command will return local address, lets call it


2. In a separate session without the internet, start python interactive/script with `HF_HUB_OFFLINE=1` option
```bash
HF_HUB_OFFLINE=1 python

In this offline python session, this script should work
```
import torch
from llm2vec import LLM2Vec
```

l2v = LLM2Vec.from_pretrained( "", peft_model_name_or_path="", device_map="cuda" if torch.cuda.is_available() else "cpu", torch_dtype=torch.bfloat16, )


Here are details of relevant library versions in my environment
```bash
huggingface-hub          0.22.2
peft                     0.10.0
transformers             4.40.1

Let me know if you have any further questions.

SouLeo commented 6 months ago

Hello! I'm in a similar boat. I tried running your script, but for the llama-3-8b model and am having issues as well.

I run the following (without internet connection):

import torch
from llm2vec import LLM2Vec

# https://github.com/McGill-NLP/llm2vec/issues/52
l2v = LLM2Vec.from_pretrained(
    "<LOCAL PATH to McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp>",
    peft_model_name_or_path="<LOCAL PATH to https://huggingface.co/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-unsup-simcse>",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)

However, I still get the following:

huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.

I understand that I still need the underlying llama-3 model from Meta (I do have access to it), but I don't know how to link to where I have that model stored locally. Is there a simple fix?

Thank you!!

Georgepitt commented 6 months ago

Thank you for your advice！@vaibhavad. I've followed the advice, but it still doesn't work.I don't know what went wrong. Could you give me some advice? Thank you！

download the models: python model_download.py Fetching 11 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 153791.15it/s] MNTP model path: /home/.cache/huggingface/hub/models--McGill-NLP--LLM2Vec-Mistral-7B-Instruct-v2-mntp/snapshots/5ec8e6444af63627e7609f38641de612c6de0105 Fetching 4 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 53430.62it/s] SimCSE model path: /home/.cache/huggingface/hub/models--McGill-NLP--LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse/snapshots/2c055a5d77126c0d3dc6cd8ffa30e2908f4f45f8

running script: HF_HUB_OFFLINE=1 ~/.conda/envs/LLM2Vec/bin/python /share/home/Mistral.py

and than it return this errors： OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like mistralai/Mistral-7B-Instruct-v0.2 is not the path to a directory containing a file named config.json. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

vaibhavad commented 6 months ago

I believe you also need to download mistralai/Mistral-7B-Instruct-v0.2. You can run this command with internet connection

HF_HOME=<CACHE_DIR> python -c "import transformers; transformers.AutoModel.from_pretrained('mistralai/Mistral-7B-Instruct-v0.2')"

Make sure to specify CACHE_DIR as a directory which is accessible to you in the offline mode

After this if you launch python with

HF_HOME=<CACHE_DIR> HF_HUB_OFFLINE=1 python

then all model loading should work as expected.

Let me know iff these steps fix your issue

Georgepitt commented 6 months ago

Thank you very much for your help@vaibhavad! I successfully run the project locally. In fact, the really important Settings here are in the adapter_config.json file, which contains the two fields "base_model_name_or_path" and "parent_library", if you map them locally, then the model will run.

This is my setup： import os os.environ["HF_HUB_OFFLINE"] = "1" os.environ["HF_HOME"] = "" os.environ["TOKENIZERS_PARALLELISM"] = "false"

One unusual thing is that I can only specify one gpu to run, otherwise it will load the model repeatedly in the l2v.encode section.

vaibhavad commented 6 months ago

@Georgepitt, glad to know the issue is resolved.

One unusual thing is that I can only specify one gpu to run, otherwise it will load the model repeatedly in the l2v.encode section.

I did not fully understand this. Can you provide more details? By default, encode tries to use all the GPUs available.

SouLeo commented 6 months ago

It's possible that I have the same issue. In order to run your code on my system, I have to comment out lines 341-361 in llm2vec.py. If I don't comment out lines 341-361, the following output occurs.

Also, I know this output is information overload. If you could direct me to some specific outputs / logs you need to better understand the issue, I will follow up with those details.

Nvidia-SMI for CUDA:0 Device (all other devices on my machine are still unused):

Mon May 13 09:35:33 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   28C    P8              17W / 230W |      4MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:35:36 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   30C    P2              59W / 230W |   2043MiB / 24564MiB |     31%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:35:37 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   30C    P2              59W / 230W |   4675MiB / 24564MiB |     20%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:35:39 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   30C    P2              58W / 230W |   6131MiB / 24564MiB |     17%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:35:41 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   30C    P2              59W / 230W |   9767MiB / 24564MiB |     20%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:35:43 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   31C    P2              59W / 230W |  11591MiB / 24564MiB |     21%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:35:45 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   31C    P2              59W / 230W |  13915MiB / 24564MiB |     21%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:35:46 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   31C    P2              59W / 230W |  15147MiB / 24564MiB |     21%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:35:51 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   33C    P2              78W / 230W |  16137MiB / 24564MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Mon May 13 09:36:06 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A5000               Off | 00000000:1B:00.0 Off |                  Off |
| 30%   35C    P2              81W / 230W |  21667MiB / 24564MiB |     18%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Terminal outputs:

(test_mteb) slwanna@lepp:~/mteb$ python other_test.py 
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.96it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:13<00:00,  3.26s/it]
Some weights of the model checkpoint at meta-llama/Meta-Llama-3-8B-Instruct were not used when initializing LlamaEncoderModel: ['lm_head.weight']
- This IS expected if you are initializing LlamaEncoderModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaEncoderModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 32513.98it/s]
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.92it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.55it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.25it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00,  9.58it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.83it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.80it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.81it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.89it/s]
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:08<?, ?it/s]
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.68 GiB total capacity; 4.56 GiB already allocated; 128.00 KiB free; 4.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main

    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    return _run_module_code(code, init_globals, run_name,
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code

    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
Traceback (most recent call last):
    main_content = runpy.run_path(main_path,
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    new_value = value.to(device)
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.77it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.19it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.69it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.50it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.21it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.26it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.13it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.78it/s]
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:07<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.68 GiB total capacity; 4.19 GiB already allocated; 24.12 MiB free; 4.23 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]    exec(code, run_globals)

  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    ) = cls._load_pretrained_model(
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    new_value = value.to(device)
    return _run_module_code(code, init_globals, run_name,
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code

    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:02<?, ?it/s]
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.64it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.35it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.05it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.78it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.89it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.42it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.47it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.78it/s]
^CTraceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in __init__
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in <listcomp>
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 47, in __init__
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 17, in __init__
    super().__init__(*args, **kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 291, in __init__
    self._init_rope()
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 295, in _init_rope
    self.rotary_emb = LlamaRotaryEmbedding(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 112, in __init__
    self.register_buffer("_cos_cached", emb.cos().to(torch.get_default_dtype()), persistent=False)
KeyboardInterrupt
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in __init__
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in <listcomp>
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 47, in __init__
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 17, in __init__
    super().__init__(*args, **kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 291, in __init__
    self._init_rope()
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 295, in _init_rope
    self.rotary_emb = LlamaRotaryEmbedding(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 113, in __init__
    self.register_buffer("_sin_cached", emb.sin().to(torch.get_default_dtype()), persistent=False)
KeyboardInterrupt
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
Traceback (most recent call last):
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in __init__
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in <listcomp>
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 47, in __init__
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 17, in __init__
    super().__init__(*args, **kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 291, in __init__
    self._init_rope()
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 295, in _init_rope
    self.rotary_emb = LlamaRotaryEmbedding(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 112, in __init__
    self.register_buffer("_cos_cached", emb.cos().to(torch.get_default_dtype()), persistent=False)
KeyboardInterrupt
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in __init__
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in <listcomp>
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 47, in __init__
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 17, in __init__
    super().__init__(*args, **kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 291, in __init__
    self._init_rope()
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 295, in _init_rope
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    self.rotary_emb = LlamaRotaryEmbedding(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 113, in __init__
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    self.register_buffer("_sin_cached", emb.sin().to(torch.get_default_dtype()), persistent=False)
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
KeyboardInterrupt
    _fixup_main_from_path(data['init_main_from_path'])
Traceback (most recent call last):
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
Traceback (most recent call last):
    exitcode = _main(fd, parent_sentinel)
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in __init__
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in <listcomp>
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 47, in __init__
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 17, in __init__
    super().__init__(*args, **kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 291, in __init__
    self._init_rope()
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 295, in _init_rope
    model = cls(config, *model_args, **model_kwargs)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in __init__
    self.rotary_emb = LlamaRotaryEmbedding(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 112, in __init__
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in <listcomp>
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 47, in __init__
    self.register_buffer("_cos_cached", emb.cos().to(torch.get_default_dtype()), persistent=False)
KeyboardInterrupt
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 17, in __init__
    super().__init__(*args, **kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 291, in __init__
    self._init_rope()
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 295, in _init_rope
    self.rotary_emb = LlamaRotaryEmbedding(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 111, in __init__
    emb = torch.cat((freqs, freqs), dim=-1)
KeyboardInterrupt
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in __init__
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in <listcomp>
    [ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 47, in __init__
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
  File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 17, in __init__
    super().__init__(*args, **kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 291, in __init__
    self._init_rope()
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 295, in _init_rope
    self.rotary_emb = LlamaRotaryEmbedding(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 113, in __init__
    self.register_buffer("_sin_cached", emb.sin().to(torch.get_default_dtype()), persistent=False)
KeyboardInterrupt
Segmentation fault (core dumped)
Loading checkpoint shards:   0%|                                                                                                      | 0/4 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/slwanna/mteb/other_test.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
KeyboardInterrupt
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 6 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
(test_mteb) slwanna@lepp:~/mteb$

vaibhavad commented 6 months ago

Hi @SouLeo, can you share your other_test.py file? I will try to run it on my end. Also, how many GPUs are on your node when you run it and what is the CPU RAM?

As a reference, our multi-GPU encoding implementation is similar to sentence-transformers library implementation

SouLeo commented 6 months ago

Sure thing!

Here is other_test.py

from llm2vec import LLM2Vec

import torch
from transformers import AutoTokenizer, AutoModel, AutoConfig
from peft import PeftModel
from mteb import MTEB

MODEL_NAME = "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp"

# Loading base Mistral model, along with custom code that enables bidirectional connections in decoder-only LLMs. MNTP LoRA weights are merged into the base model.
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_NAME
)
config = AutoConfig.from_pretrained(
    MODEL_NAME, trust_remote_code=True
)
model = AutoModel.from_pretrained(
    MODEL_NAME,
    trust_remote_code=True,
    config=config,
    torch_dtype=torch.float16,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
)
model = PeftModel.from_pretrained(
    model,
    MODEL_NAME,
)
model = model.merge_and_unload()  # This can take several minutes on cpu

# Loading unsupervised SimCSE model. This loads the trained LoRA weights on top of MNTP model. Hence the final weights are -- Base model + MNTP (LoRA) + SimCSE (LoRA).
model = PeftModel.from_pretrained(
    model, "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised"
)

# Wrapper for encoding and pooling operations
l2v = LLM2Vec(model, tokenizer, pooling_mode="mean", max_length=512)

# model_name = "llama3"

# evaluation = MTEB(tasks=["Banking77Classification"])
# results = evaluation.run(l2v, output_folder=f"results/{model_name}")

# Encoding queries using instructions
instruction = (
    "Given a web search query, retrieve relevant passages that answer the query:"
)
queries = [
    [instruction, "how much protein should a female eat"],
    [instruction, "summit define"],
]
q_reps = l2v.encode(queries)

# Encoding documents. Instruction are not required for documents
documents = [
    "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
    "Definition of summit for English Language Learners. : 1  the highest point of a mountain : the top of a mountain. : 2  the highest level. : 3  a meeting or series of meetings between the leaders of two or more governments.",
]
d_reps = l2v.encode(documents)

# Compute cosine similarity
q_reps_norm = torch.nn.functional.normalize(q_reps, p=2, dim=1)
d_reps_norm = torch.nn.functional.normalize(d_reps, p=2, dim=1)
cos_sim = torch.mm(q_reps_norm, d_reps_norm.transpose(0, 1))

print(cos_sim)
"""
tensor([[0.6470, 0.1619],
        [0.0786, 0.5844]])
"""

CPU RAM:

(base) slwanna@lepp:~$ free -g
               total        used        free      shared  buff/cache   available
Mem:            1510           9        1338           0         163        1493
Swap:              1           1           0

I have 8 NVIDIA RTX A5000 on my node.

I will also look into the implementation you linked.

SouLeo commented 6 months ago

Hi all, I have moved my code to an internet-connected, server with 8xH100's. I'm having similar issues with your multigpu .encode() function. See below.

I'm still investigating this and don't want to see this issue get stale. But, I just wanted to double check that you had tested your encode function on multigpu systems.

I have run the following sentence transformers model as a test and had no issues: $ python test_sentence_transformers.py

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("hkunlp/instructor-xl")

# Our sentences we like to encode
sentences = [
    "This framework generates embeddings for each input sentence",
    "Sentences are passed as a list of string.",
    "The quick brown fox jumps over the lazy dog.",
]

# Sentences are encoded by calling model.encode()
sentence_embeddings = model.encode(sentences)

# Print the embeddings
for sentence, embedding in zip(sentences, sentence_embeddings):
    print("Sentence:", sentence)
    print("Embedding:", embedding)
    print("")

However, when I run $ python retrieval_scidocs.py

import datasets
import torch
from llm2vec import LLM2Vec
# from beir import util
# from beir.datasets.data_loader import GenericDataLoader as BeirDataLoader
import os 
from typing import Dict, List

# from beir.retrieval.evaluation import EvaluateRetrieval

dataset_name = "mteb/scidocs"
instruction = "Given a scientific paper title, retrieve paper abstracts that are cited by the given paper: "

print("Loading dataset...")
queries = datasets.load_dataset(dataset_name, "queries")
corpus = datasets.load_dataset(dataset_name, "corpus")

batch_size = 2

print("Loading model...")
model = LLM2Vec.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
    peft_model_name_or_path="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)

def append_instruction(instruction, sentences):
    new_sentences = []
    for s in sentences:
        new_sentences.append([instruction, s, 0])
    return new_sentences

def cos_sim(a: torch.Tensor, b: torch.Tensor):
    if not isinstance(a, torch.Tensor):
        a = torch.tensor(a)

    if not isinstance(b, torch.Tensor):
        b = torch.tensor(b)

    if len(a.shape) == 1:
        a = a.unsqueeze(0)

    if len(b.shape) == 1:
        b = b.unsqueeze(0)

    a_norm = torch.nn.functional.normalize(a, p=2, dim=1)
    b_norm = torch.nn.functional.normalize(b, p=2, dim=1)
    return torch.mm(a_norm, b_norm.transpose(0, 1))

def encode_queries(queries: List[str], batch_size: int, **kwargs):
    new_sentences = append_instruction(instruction, queries)

    kwargs['show_progress_bar'] = False
    return model.encode(new_sentences, batch_size=batch_size, **kwargs)

def encode_corpus(corpus: List[Dict[str, str]], batch_size: int, **kwargs):
    if type(corpus) is dict:
        sentences = [
            (corpus["title"][i] + ' ' + corpus["text"][i]).strip()
            if "title" in corpus
            else corpus["text"][i].strip()
            for i in range(len(corpus["text"]))
        ]
    else:
        sentences = [
            (doc["title"] + ' ' + doc["text"]).strip() if "title" in doc else doc["text"].strip()
            for doc in corpus
        ]
    new_sentences = append_instruction("", sentences)
    return model.encode(new_sentences, batch_size=batch_size, **kwargs)

print("Encoding Queries...")
query_ids = list(queries.keys())
results = {qid: {} for qid in query_ids}
queries = [queries[qid] for qid in queries]
query_embeddings = encode_queries(queries[0]['text'][:2], batch_size=batch_size, show_progress_bar=True, convert_to_tensor=True)

I again get errors:

Loading dataset...
Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 12.28it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00,  1.05it/s]
Encoding Queries...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 53092.46it/s]
Loading dataset...
Loading dataset...
Loading dataset...
Loading dataset...
Loading dataset...
Loading dataset...
Loading dataset...
Loading dataset...
Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Downloading shards:   0%|                                                                                                         | 0/3 [00:00<?, ?it/s]Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 12.55it/s]
Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Downloading shards:  67%|████████████████████████████████████████████████████████████████▋                                | 2/3 [00:00<00:00, 11.49it/s]Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 12.05it/s]
Downloading shards:   0%|                                                                                                         | 0/3 [00:00<?, ?it/s]Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 11.85it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 11.72it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 12.25it/s]
Downloading shards:   0%|                                                                                                         | 0/3 [00:00<?, ?it/s]Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 11.83it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 12.95it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00,  6.08it/s]
Loading checkpoint shards:  33%|██████████████████████████████                                                            | 1/3 [00:02<00:05,  2.66s/it]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
    model = LLM2Vec.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
    model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 
Loading checkpoint shards:  33%|██████████████████████████████                                                            | 1/3 [00:02<00:05,  2.79s/it]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
    model = LLM2Vec.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
    model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 
Loading checkpoint shards:  67%|████████████████████████████████████████████████████████████                              | 2/3 [00:03<00:01,  1.64s/it]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
    model = LLM2Vec.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
    model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
Loading checkpoint shards:  33%|██████████████████████████████                                                            | 1/3 [00:02<00:05,  2.96s/it]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
    model = LLM2Vec.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
    model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 
Loading checkpoint shards:  33%|██████████████████████████████                                                            | 1/3 [00:02<00:04,  2.50s/it]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
    model = LLM2Vec.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
    model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 
Loading checkpoint shards:  33%|██████████████████████████████                                                            | 1/3 [00:02<00:04,  2.26s/it]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
    model = LLM2Vec.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
    model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 
Loading checkpoint shards:  33%|██████████████████████████████                                                            | 1/3 [00:01<00:03,  1.78s/it]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:03<00:00,  1.21s/it]    model = LLM2Vec.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
    model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:03<00:00,  1.30s/it]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
    model = LLM2Vec.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
    model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3735, in from_pretrained
    dispatch_model(model, **device_map_kwargs)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/big_modeling.py", line 488, in dispatch_model
    model.to(device)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2692, in to
    return super().to(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1173, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 853, in _apply
    self._buffers[key] = fn(buf)
                         ^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1159, in convert
    return t.to(
           ^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 
^CTraceback (most recent call last):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
Traceback (most recent call last):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    exitcode = _main(fd, parent_sentinel)
    exitcode = _main(fd, parent_sentinel)
    prepare(preparation_data)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
Traceback (most recent call last):
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "<string>", line 1, in <module>
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    prepare(preparation_data)
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
    _fixup_main_from_path(data['init_main_from_path'])
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
    prepare(preparation_data)
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 2, in <module>
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
    from llm2vec import LLM2Vec
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
  File "<frozen runpy>", line 88, in _run_code
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
    _fixup_main_from_path(data['init_main_from_path'])
  File "<frozen runpy>", line 291, in run_path
    import torch
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/__init__.py", line 1919, in <module>
    from llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
    from .llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 10, in <module>
    from llm2vec import LLM2Vec
    main_content = runpy.run_path(main_path,
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
    from peft import PeftModel
    from .llm2vec import LLM2Vec
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 10, in <module>
    from .llm2vec import LLM2Vec
    from llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 10, in <module>
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
    from .auto import (
    from peft import PeftModel
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/auto.py", line 21, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
    from peft import PeftModel
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
    from .auto import (
    from transformers import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/auto.py", line 31, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/__init__.py", line 26, in <module>
    from .llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 10, in <module>
    from . import _meta_registrations
    from .auto import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/auto.py", line 31, in <module>
    from .config import PeftConfig
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_meta_registrations.py", line 9, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/config.py", line 23, in <module>
    from peft import PeftModel
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
    from .config import PeftConfig
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/config.py", line 23, in <module>
    from torch._decomp import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_decomp/__init__.py", line 244, in <module>
    from .utils import CONFIG_NAME, PeftType, TaskType
    from .auto import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/auto.py", line 31, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/__init__.py", line 23, in <module>
    from .utils import CONFIG_NAME, PeftType, TaskType
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/__init__.py", line 23, in <module>
    from .config import PeftConfig
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/config.py", line 23, in <module>
    from .other import (
    from .other import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/other.py", line 21, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/other.py", line 21, in <module>
    import torch._decomp.decompositions
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_decomp/decompositions.py", line 11, in <module>
    from .utils import CONFIG_NAME, PeftType, TaskType
    from . import dependency_versions_check
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/__init__.py", line 23, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/dependency_versions_check.py", line 16, in <module>
    from .other import (
    from .utils.versions import require_version, require_version_core
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/other.py", line 21, in <module>
    import accelerate
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/__init__.py", line 33, in <module>
    import accelerate
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/__init__.py", line 16, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/__init__.py", line 16, in <module>
    import accelerate
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/__init__.py", line 16, in <module>
    from .accelerator import Accelerator
    from .accelerator import Accelerator
    from .generic import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/accelerator.py", line 35, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/accelerator.py", line 35, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/generic.py", line 31, in <module>
    from .accelerator import Accelerator
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/accelerator.py", line 35, in <module>
    from .import_utils import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 95, in <module>
    import torch._prims as prims
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_prims/__init__.py", line 3031, in <module>
    _accelerate_available, _accelerate_version = _is_package_available("accelerate", return_version=True)
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 48, in _is_package_available
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/checkpointing.py", line 24, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/checkpointing.py", line 24, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/checkpointing.py", line 24, in <module>
    package_version = importlib.metadata.version(pkg_name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/importlib/metadata/__init__.py", line 1009, in version
    from .utils import (
    from .utils import (
    from .utils import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/__init__.py", line 192, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/__init__.py", line 182, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/__init__.py", line 182, in <module>
    from .megatron_lm import (
    register_debug_prims()
    from .fsdp_utils import load_fsdp_model, load_fsdp_optimizer, save_fsdp_model, save_fsdp_optimizer
    from .fsdp_utils import load_fsdp_model, load_fsdp_optimizer, save_fsdp_model, save_fsdp_optimizer
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/fsdp_utils.py", line 26, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/fsdp_utils.py", line 26, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_prims/debug_prims.py", line 40, in register_debug_prims
    return distribution(distribution_name).version
    @load_tensor.impl_factory()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    import torch.distributed.checkpoint as dist_cp
     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/importlib/metadata/__init__.py", line 632, in version
    import torch.distributed.checkpoint as dist_cp
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_custom_op/impl.py", line 333, in inner
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/__init__.py", line 11, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/__init__.py", line 2, in <module>
    from .optimizer import load_sharded_optimizer_state_dict
    from .default_planner import DefaultLoadPlanner, DefaultSavePlanner
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/optimizer.py", line 40, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/default_planner.py", line 13, in <module>
    self._register_impl("factory", f)
    from transformers.modeling_outputs import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_outputs.py", line 195, in <module>
    return self.metadata['Version']
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_custom_op/impl.py", line 223, in _register_impl
           ^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/importlib/metadata/__init__.py", line 617, in metadata
    frame = inspect.getframeinfo(sys._getframe(stacklevel))
    from torch.distributed._tensor import DTensor
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/__init__.py", line 6, in <module>
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/inspect.py", line 1688, in getframeinfo
    return _adapters.Message(email.message_from_string(text))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/email/__init__.py", line 37, in message_from_string
    @dataclass
     ^^^^^^^^^
    import torch.distributed._tensor.ops
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/dataclasses.py", line 1232, in dataclass
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/__init__.py", line 2, in <module>
    from .embedding_ops import *  # noqa: F403
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/embedding_ops.py", line 8, in <module>
    import torch.distributed._functional_collectives as funcol
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives.py", line 12, in <module>
    from . import _functional_collectives_impl as fun_col_impl
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives_impl.py", line 36, in <module>
    from torch._dynamo import assume_constant_result
    return wrap(cls)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/__init__.py", line 2, in <module>
           ^^^^^^^^^
    from torch.distributed.fsdp._shard_utils import _create_chunk_sharded_tensor
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/dataclasses.py", line 1222, in wrap
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/fsdp/__init__.py", line 2, in <module>
    from . import convert_frame, eval_frame, resume_execution
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 31, in <module>
    lines, lnum = findsource(frame)
                  ^^^^^^^^^^^^^^^^^
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/inspect.py", line 1071, in findsource
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    from torch.fx.experimental.symbolic_shapes import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/fx/experimental/symbolic_shapes.py", line 63, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/dataclasses.py", line 1056, in _process_class
    module = getmodule(object, file)
    return Parser(*args, **kws).parsestr(s)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/inspect.py", line 988, in getmodule
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/email/parser.py", line 67, in parsestr
    if ismodule(module) and hasattr(module, '__file__'):
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
    _cmp_fn('__eq__', '==',
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/dataclasses.py", line 630, in _cmp_fn
    return _create_fn(name,
           ^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/dataclasses.py", line 424, in _create_fn
    return self.parse(StringIO(text), headersonly=headersonly)
    from .fully_sharded_data_parallel import (
    args = ','.join(args)
           ^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
KeyboardInterrupt
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    from torch.utils._sympy.functions import FloorDiv, Mod, IsNonOverlappingAndDenseIndicator
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 936, in exec_module
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/email/parser.py", line 56, in parse
  File "<frozen importlib._bootstrap_external>", line 1069, in get_code
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/utils/_sympy/functions.py", line 269, in <module>
  File "<frozen importlib._bootstrap_external>", line 729, in _compile_bytecode
KeyboardInterrupt
    class RShift(sympy.Function):
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/basic.py", line 121, in __init_subclass__
    feedparser.feed(data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/email/feedparser.py", line 174, in feed
    self._call_parse()
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/email/feedparser.py", line 178, in _call_parse
    self._parse()
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/email/feedparser.py", line 238, in _parsegen
    self._parse_headers(headers)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/email/feedparser.py", line 486, in _parse_headers
    self._cur.set_raw(*self.policy.header_source_parse(lastvalue))
KeyboardInterrupt
    _prepare_class_assumptions(cls)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/assumptions.py", line 642, in _prepare_class_assumptions
    eval_is_meth = getattr(cls, '_eval_is_%s' % k, None)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
    from llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
    from .llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 10, in <module>
    from peft import PeftModel
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
    from .auto import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/auto.py", line 31, in <module>
    from .config import PeftConfig
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/config.py", line 23, in <module>
    from .utils import CONFIG_NAME, PeftType, TaskType
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/__init__.py", line 23, in <module>
    from .other import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/other.py", line 21, in <module>
    import accelerate
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/__init__.py", line 16, in <module>
    from .accelerator import Accelerator
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/accelerator.py", line 35, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/__init__.py", line 182, in <module>
    from .fsdp_utils import load_fsdp_model, load_fsdp_optimizer, save_fsdp_model, save_fsdp_optimizer
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/fsdp_utils.py", line 26, in <module>
    import torch.distributed.checkpoint as dist_cp
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/__init__.py", line 2, in <module>
    from .default_planner import DefaultLoadPlanner, DefaultSavePlanner
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/default_planner.py", line 13, in <module>
    from torch.distributed._tensor import DTensor
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/__init__.py", line 6, in <module>
    import torch.distributed._tensor.ops
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/__init__.py", line 2, in <module>
    from .embedding_ops import *  # noqa: F403
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/embedding_ops.py", line 8, in <module>
    import torch.distributed._functional_collectives as funcol
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives.py", line 12, in <module>
    from . import _functional_collectives_impl as fun_col_impl
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives_impl.py", line 36, in <module>
    from torch._dynamo import assume_constant_result
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/__init__.py", line 2, in <module>
    from . import convert_frame, eval_frame, resume_execution
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 31, in <module>
    from torch.fx.experimental.symbolic_shapes import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/fx/experimental/symbolic_shapes.py", line 63, in <module>
    from torch.utils._sympy.functions import FloorDiv, Mod, IsNonOverlappingAndDenseIndicator
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/utils/_sympy/functions.py", line 1, in <module>
    import sympy
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/__init__.py", line 30, in <module>
    from sympy.core.cache import lazy_function
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/__init__.py", line 9, in <module>
    from .expr import Expr, AtomicExpr, UnevaluatedExpr
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/expr.py", line 4159, in <module>
    from .mul import Mul
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/mul.py", line 2193, in <module>
    from .numbers import Rational
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/numbers.py", line 5, in <module>
    import fractions
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/fractions.py", line 23, in <module>
    _RATIONAL_FORMAT = re.compile(r"""
                       ^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/__init__.py", line 227, in compile
    return _compile(pattern, flags)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/__init__.py", line 294, in _compile
    p = _compiler.compile(pattern, flags)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_compiler.py", line 745, in compile
    p = _parser.parse(p, flags)
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 989, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 872, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 872, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 872, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 872, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 687, in _parse
    item = subpattern[-1:]
           ~~~~~~~~~~^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 170, in __getitem__
    return SubPattern(self.state, self.data[index])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 113, in __init__
    def __init__(self, state, data=None):

KeyboardInterrupt
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
    from llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
    from .llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 10, in <module>
    from peft import PeftModel
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
    from .auto import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/auto.py", line 31, in <module>
    from .config import PeftConfig
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/config.py", line 23, in <module>
    from .utils import CONFIG_NAME, PeftType, TaskType
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/__init__.py", line 23, in <module>
    from .other import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/other.py", line 21, in <module>
    import accelerate
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/__init__.py", line 16, in <module>
    from .accelerator import Accelerator
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/accelerator.py", line 35, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/__init__.py", line 182, in <module>
    from .fsdp_utils import load_fsdp_model, load_fsdp_optimizer, save_fsdp_model, save_fsdp_optimizer
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/fsdp_utils.py", line 26, in <module>
    import torch.distributed.checkpoint as dist_cp
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/__init__.py", line 2, in <module>
    from .default_planner import DefaultLoadPlanner, DefaultSavePlanner
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/default_planner.py", line 13, in <module>
    from torch.distributed._tensor import DTensor
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/__init__.py", line 6, in <module>
    import torch.distributed._tensor.ops
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/__init__.py", line 2, in <module>
    from .embedding_ops import *  # noqa: F403
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/embedding_ops.py", line 8, in <module>
    import torch.distributed._functional_collectives as funcol
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives.py", line 12, in <module>
    from . import _functional_collectives_impl as fun_col_impl
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives_impl.py", line 36, in <module>
    from torch._dynamo import assume_constant_result
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/__init__.py", line 2, in <module>
    from . import convert_frame, eval_frame, resume_execution
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 40, in <module>
    from . import config, exc, trace_rules
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/trace_rules.py", line 50, in <module>
    from .variables import (
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/variables/__init__.py", line 4, in <module>
    from .builtin import BuiltinVariable
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/variables/builtin.py", line 42, in <module>
    from .ctx_manager import EventVariable, StreamVariable
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/variables/ctx_manager.py", line 12, in <module>
    from ..device_interface import get_interface_for_device
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/device_interface.py", line 198, in <module>
    for i in range(torch.cuda.device_count()):
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/cuda/__init__.py", line 748, in device_count
    nvml_count = -1 if torch.version.hip else _device_count_nvml()
                                              ^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/cuda/__init__.py", line 709, in _device_count_nvml
    raw_cnt = _raw_device_count_nvml()
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/cuda/__init__.py", line 617, in _raw_device_count_nvml
    rc = nvml_h.nvmlInit()
         ^^^^^^^^^^^^^^^^^
KeyboardInterrupt
^CSegmentation fault (core dumped)
(test_mteb) slwanna@dracarys:~/code_projects/llm2vec_mteb/llm2vec/examples$ Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
    from llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
    from .llm2vec import LLM2Vec
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 21, in <module>
    from .models import (
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/models/__init__.py", line 1, in <module>
    from .bidirectional_mistral import MistralBiModel, MistralBiForMNTP
  File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/models/bidirectional_mistral.py", line 4, in <module>
    from transformers import (
  File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1501, in __getattr__
    value = getattr(module, name)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1500, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1510, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/models/mistral/modeling_mistral.py", line 48, in <module>
    if is_flash_attn_2_available():
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 759, in is_flash_attn_2_available
    if not torch.cuda.is_available():
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/cuda/__init__.py", line 118, in is_available
    return torch._C._cuda_getDeviceCount() > 0
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt

vaibhavad commented 6 months ago

Hi @SouLeo,

Apologies for delay in the response.

When you run $ python test_sentence_transformers.py I believe only one GPU is being used, as sentence transformers have a different command for multi-GPU.

Regarding running multi-GPU with LLM2Vec, the code need to be shielded with if __name__. Otherwise, CUDA runs into issues when spawning new processes. This is a requirement in sentence transformers multi-GPU support as well.

I have modified your script and verified that it runs on 8xH100 server.

import datasets
import torch
from llm2vec import LLM2Vec
# from beir import util
# from beir.datasets.data_loader import GenericDataLoader as BeirDataLoader
import os 
from typing import Dict, List

# from beir.retrieval.evaluation import EvaluateRetrieval

def append_instruction(instruction, sentences):
    new_sentences = []
    for s in sentences:
        new_sentences.append([instruction, s, 0])
    return new_sentences

def cos_sim(a: torch.Tensor, b: torch.Tensor):
    if not isinstance(a, torch.Tensor):
        a = torch.tensor(a)

    if not isinstance(b, torch.Tensor):
        b = torch.tensor(b)

    if len(a.shape) == 1:
        a = a.unsqueeze(0)

    if len(b.shape) == 1:
        b = b.unsqueeze(0)

    a_norm = torch.nn.functional.normalize(a, p=2, dim=1)
    b_norm = torch.nn.functional.normalize(b, p=2, dim=1)
    return torch.mm(a_norm, b_norm.transpose(0, 1))

def encode_queries(queries: List[str], batch_size: int, **kwargs):
    new_sentences = append_instruction(instruction, queries)

    kwargs['show_progress_bar'] = False
    return model.encode(new_sentences, batch_size=batch_size, **kwargs)

def encode_corpus(corpus: List[Dict[str, str]], batch_size: int, **kwargs):
    if type(corpus) is dict:
        sentences = [
            (corpus["title"][i] + ' ' + corpus["text"][i]).strip()
            if "title" in corpus
            else corpus["text"][i].strip()
            for i in range(len(corpus["text"]))
        ]
    else:
        sentences = [
            (doc["title"] + ' ' + doc["text"]).strip() if "title" in doc else doc["text"].strip()
            for doc in corpus
        ]
    new_sentences = append_instruction("", sentences)
    return model.encode(new_sentences, batch_size=batch_size, **kwargs)

if __name__ == "__main__":
    dataset_name = "mteb/scidocs"
    instruction = "Given a scientific paper title, retrieve paper abstracts that are cited by the given paper: "

    print("Loading dataset...")
    queries = datasets.load_dataset(dataset_name, "queries")
    corpus = datasets.load_dataset(dataset_name, "corpus")

    batch_size = 2

    print("Loading model...")
    model = LLM2Vec.from_pretrained(
        "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
        peft_model_name_or_path="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised",
        device_map="cuda" if torch.cuda.is_available() else "cpu",
        attn_implementation="flash_attention_2",
        torch_dtype=torch.bfloat16,
    )
    print("Encoding Queries...")
    query_ids = list(queries.keys())
    results = {qid: {} for qid in query_ids}
    queries = [queries[qid] for qid in queries]
    query_embeddings = encode_queries(queries[0]['text'][:2], batch_size=batch_size, show_progress_bar=True, convert_to_tensor=True)

Please check if this script is working on your end, and feel free to ask any other question.

Georgepitt commented 5 months ago

@vaibhavad I took a little snippet about how when I specify two Gpus to run, the model loads repeatedly.

logs: Traceback (most recent call last): File "", line 1, in File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main exitcode = _main(fd, parent_sentinel) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/spawn.py", line 125, in _main prepare(preparation_data) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare _fixup_main_from_path(data['init_main_from_path']) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path main_content = runpy.run_path(main_path, File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/runpy.py", line 265, in run_path return _run_module_code(code, init_globals, run_name, File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/runpy.py", line 97, in _run_module_code _run_code(code, mod_globals, init_globals, File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/runpy.py", line 87, in _run_code exec(code, run_globals) File "/share/home/chenyuxuan/Research_CodeSearch/llm2v/llm2vec_test/test_example.py", line 32, in q_reps = l2v.encode(queries) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/llm2vec/llm2vec.py", line 325, in encode with cuda_compatible_multiprocess.Pool(num_proc) as p: File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/context.py", line 119, in Pool return Pool(processes, initializer, initargs, maxtasksperchild, File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/pool.py", line 212, in init self._repopulate_pool() File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/pool.py", line 303, in _repopulate_pool return self._repopulate_pool_static(self._ctx, self.Process, File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static w.start() File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/process.py", line 121, in start self._popen = self._Popen(self) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/context.py", line 284, in _Popen return Popen(process_obj) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in init super().init(process_obj) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/popen_fork.py", line 19, in init self._launch(process_obj) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch prep_data = spawn.get_preparation_data(process_obj._name) File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data _check_not_importing_main() File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main raise RuntimeError(''' RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s] Loading checkpoint shards: 100%|██████████| 3/3 [00:05<00:00, 1.68s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:05<00:00, 1.72s/it]1

Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s] Model and configuration loaded successfully!

0%| | 0/1 [00:00<?, ?it/s] 100%|██████████| 1/1 [00:00<00:00, 24966.10it/s] Loading checkpoint shards: 33%|███▎ | 1/3 [00:03<00:07, 3.87s/it] Loading checkpoint shards: 33%|███▎ | 1/3 [00:03<00:07, 3.74s/it] Loading checkpoint shards: 67%|██████▋ | 2/3 [00:07<00:03, 3.64s/it] Loading checkpoint shards: 67%|██████▋ | 2/3 [00:07<00:03, 3.56s/it]1

Start loading model Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s] Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00, 3.40s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00, 3.48s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00, 3.35s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00, 3.42s/it] Loading checkpoint shards: 33%|███▎ | 1/3 [00:03<00:07, 3.76s/it]

vaibhavad commented 5 months ago

Hello @Georgepitt,

Can you share the test_example.py file that you are using?

Georgepitt commented 5 months ago

Of course! Here is test_example.py and my task submission script. I run test_example.py in the form of a task submission such as sbatch emxaple.sh. The above error occurs when the number of Gpus(#SBATCH -G) specified by 'sbatch emxaple.sh' is greater than 1.

emxample.sh

#!/bin/bash
#SBATCH -p gpu_se
#SBATCH -n 1
#SBATCH -G 2
#SBATCH -o /share/home/chenyuxuan/Research_CodeSearch/llm2v/llm2vec_test/run_out/job_exmaple.out

~/.conda/envs/LLM2Vec/bin/python /share/home/llm2v/llm2vec_test/test_example.py

test_example.py

import json
import torch
from llm2vec import LLM2Vec
import os
import numpy as np

os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["HF_HOME"] = "/share/home/chenyuxuan/.cache/huggingface/hub"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

path ='/share/home/.cache/huggingface/hub/models--McGill-NLP--LLM2Vec-Mistral-7B-Instruct-v2-mntp/snapshots/5ec8e6444af63627e7609f38641de612c6de0105'
path2 = "/share/home/.cache/huggingface/hub/models--McGill-NLP--LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse/snapshots/2c055a5d77126c0d3dc6cd8ffa30e2908f4f45f8"

print("Start loading model")
l2v = LLM2Vec.from_pretrained(
    path,
    peft_model_name_or_path=path2,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)
print("Model and configuration loaded successfully!")
# Encoding queries using instructions
instruction = (
    "Given a web search query, retrieve relevant passages that answer the query:"
)
queries = [
    [instruction, "how much protein should a female eat"],
    [instruction, "summit define"],
]
q_reps = l2v.encode(queries)

# Encoding documents. Instruction are not required for documents
documents = [
    "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
    "Definition of summit for English Language Learners. : 1  the highest point of a mountain : the top of a mountain. : 2  the highest level. : 3  a meeting or series of meetings between the leaders of two or more governments.",
]
d_reps = l2v.encode(documents)

# Compute cosine similarity
q_reps_norm = torch.nn.functional.normalize(q_reps, p=2, dim=1)
d_reps_norm = torch.nn.functional.normalize(d_reps, p=2, dim=1)
cos_sim = torch.mm(q_reps_norm, d_reps_norm.transpose(0, 1))

print(cos_sim)
"""
tensor([[0.6470, 0.1619],
        [0.0786, 0.5844]])
"""

vaibhavad commented 5 months ago

Hi @Georgepitt, please refer to my response above

Regarding running multi-GPU with LLM2Vec, the code need to be shielded with if __name__. Otherwise, CUDA runs into issues when spawning new processes. This is a requirement in sentence transformers multi-GPU support as well.

You'll need to modify test_example.py accordingly. Let me know if you have any more questions

GeraldWu23 commented 5 months ago

Thank you very much for your help@vaibhavad! I successfully run the project locally. In fact, the really important Settings here are in the adapter_config.json file, which contains the two fields "base_model_name_or_path" and "parent_library", if you map them locally, then the model will run.

This is my setup： import os os.environ["HF_HUB_OFFLINE"] = "1" os.environ["HF_HOME"] = "" os.environ["TOKENIZERS_PARALLELISM"] = "false"

One unusual thing is that I can only specify one gpu to run, otherwise it will load the model repeatedly in the l2v.encode section.

I am working on the inference demo locally, and I run into the error huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name'. Did you see the same error and resolve it? What I have done after downloading the model is modifying the base_model_name_or_path to my local dir. (I didnt modify the parent_library since I am not sure how), and mv the original xxxx.safetensor to model.safetensor to avoid another issue.

hch1231 commented 5 months ago

In fact, the really important Settings here are in the adapter_config.json file, which contains the two fields "base_model_name_or_path" and "parent_library", if you map them locally, then the model will run.

I'm a bit confused about how to "map locally" as you mentioned. Could you give an example of how to modify the "base_model_name_or_path" and "parent_library" fields in the adapter_config.json file? Thanks a lot!

saikot-paul commented 5 months ago

Suppose I had Llama3 weights downloaded previously is there any way I can use that for this library?

vaibhavad commented 4 months ago

@saikot-paul - Yes, llm2vec adapters will automatically apply on top of downloaded Llama 3 if you follow the model loading instructions described here.

vaibhavad commented 4 months ago

Closing as it is stale. Feel free to re-open if you have any more questions.

McGill-NLP / llm2vec

Requires locally run scripts #52

Loading base Mistral model, along with custom code that enables bidirectional connections in decoder-only LLMs.

Loading MNTP (Masked Next Token Prediction) model.

this command will return local address, lets call it