Closed: Georgepitt closed this issue 4 months ago.
Hi @Georgepitt, thanks for your interest in our work.
We also faced this issue during our model development. The PEFT library pushed a fix 2 months ago, so the latest version should support offline loading.
Here is what I have tried on my end:

1. In a session with internet access, download the models:

```python
from huggingface_hub import snapshot_download

snapshot_download(repo_id="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp")
# this call returns a local path, let's call it <MNTP_LOCAL_PATH>
snapshot_download(repo_id="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse")
# similarly, let's call the returned path <SIMCSE_LOCAL_PATH>
```
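If it helps, the two returned paths can also be captured and printed from a small helper script (just a sketch; the variable names are arbitrary):

```python
from huggingface_hub import snapshot_download

# snapshot_download returns the local snapshot directory of each repo;
# print both so they can be copy-pasted into the offline session.
mntp_path = snapshot_download(repo_id="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp")
simcse_path = snapshot_download(repo_id="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse")

print("MNTP model path:", mntp_path)
print("SimCSE model path:", simcse_path)
```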
2. In a separate session, without internet access, start a Python interactive session or script with the `HF_HUB_OFFLINE=1` option:
```bash
HF_HUB_OFFLINE=1 python
```

```python
import torch
from llm2vec import LLM2Vec

l2v = LLM2Vec.from_pretrained(
    "<MNTP_LOCAL_PATH>",
    peft_model_name_or_path="<SIMCSE_LOCAL_PATH>",  # local path of the -unsup-simcse snapshot
    device_map="cuda" if torch.cuda.is_available() else "cpu",
)
```
Here are details of the relevant library versions in my environment:

```
huggingface-hub 0.22.2
peft 0.10.0
transformers 4.40.1
```
Let me know if you have any further questions.
Hello! I'm in a similar boat. I tried running your script, but with the llama-3-8b model, and am having issues as well.
I run the following (without an internet connection):

```python
import torch
from llm2vec import LLM2Vec

# https://github.com/McGill-NLP/llm2vec/issues/52
l2v = LLM2Vec.from_pretrained(
    "<LOCAL PATH to McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp>",
    peft_model_name_or_path="<LOCAL PATH to https://huggingface.co/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-unsup-simcse>",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)
```
However, I still get the following:

```
huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.
```
I understand that I still need the underlying llama-3 model from Meta (I do have access to it), but I don't know how to link to where I have that model stored locally. Is there a simple fix?
Thank you!!
Thank you for your advice, @vaibhavad! I've followed it, but it still doesn't work, and I don't know what went wrong. Could you give me some advice? Thank you!
Download the models:

```
python model_download.py
Fetching 11 files: 100% 11/11 [00:00<00:00, 153791.15it/s]
MNTP model path: /home/.cache/huggingface/hub/models--McGill-NLP--LLM2Vec-Mistral-7B-Instruct-v2-mntp/snapshots/5ec8e6444af63627e7609f38641de612c6de0105
Fetching 4 files: 100% 4/4 [00:00<00:00, 53430.62it/s]
SimCSE model path: /home/.cache/huggingface/hub/models--McGill-NLP--LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse/snapshots/2c055a5d77126c0d3dc6cd8ffa30e2908f4f45f8
```

Run the script:

```
HF_HUB_OFFLINE=1 ~/.conda/envs/LLM2Vec/bin/python /share/home/Mistral.py
```

and then it returns this error:

```
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like mistralai/Mistral-7B-Instruct-v0.2 is not the path to a directory containing a file named config.json. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
```
I believe you also need to download mistralai/Mistral-7B-Instruct-v0.2. You can run this command with an internet connection:

```
HF_HOME=<CACHE_DIR> python -c "import transformers; transformers.AutoModel.from_pretrained('mistralai/Mistral-7B-Instruct-v0.2')"
```

Make sure to specify CACHE_DIR as a directory that is accessible to you in offline mode.
After this, if you launch Python with

```
HF_HOME=<CACHE_DIR> HF_HUB_OFFLINE=1 python
```

then all model loading should work as expected.
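Putting the two steps together, an offline-session script could look roughly like this (a sketch only; `<CACHE_DIR>` is the directory from the download step, and `<MNTP_LOCAL_PATH>` / `<SIMCSE_LOCAL_PATH>` are the local snapshot paths of the two LLM2Vec repos):

```python
import os

# Point the HF cache at the directory populated online and force offline mode
# before transformers / llm2vec are imported.
os.environ["HF_HOME"] = "<CACHE_DIR>"
os.environ["HF_HUB_OFFLINE"] = "1"

import torch
from llm2vec import LLM2Vec

l2v = LLM2Vec.from_pretrained(
    "<MNTP_LOCAL_PATH>",
    peft_model_name_or_path="<SIMCSE_LOCAL_PATH>",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)
```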
Let me know if these steps fix your issue.
Thank you very much for your help, @vaibhavad! I successfully ran the project locally. In fact, the really important settings here are in the adapter_config.json file, which contains the two fields "base_model_name_or_path" and "parent_library"; if you map them to local paths, the model will run.
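For illustration, remapping `base_model_name_or_path` inside the downloaded adapter snapshot could be done with a few lines of Python like this (a sketch only; the paths are placeholders, and `parent_library` is left untouched here):

```python
import json

# adapter_config.json inside the local snapshot of the -unsup-simcse (PEFT) repo
adapter_config_path = "<SIMCSE_LOCAL_PATH>/adapter_config.json"

with open(adapter_config_path) as f:
    config = json.load(f)

# Point the PEFT adapter at a local copy of the base model instead of the Hub repo id.
config["base_model_name_or_path"] = "<LOCAL PATH to mistralai/Mistral-7B-Instruct-v0.2>"

with open(adapter_config_path, "w") as f:
    json.dump(config, f, indent=2)
```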
This is my setup:

```python
import os

os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["HF_HOME"] = "<CACHE_DIR>"  # the local cache directory described above
```
One unusual thing is that I can only specify one GPU to run; otherwise it will load the model repeatedly in the l2v.encode step.
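(For context, one common way to restrict a run to a single GPU, which may or may not be what is going on here, is to make only one device visible before torch initializes CUDA; a minimal sketch:)

```python
import os

# Expose only GPU 0 to this process; must be set before torch initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
print(torch.cuda.device_count())  # should now report 1
```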
@Georgepitt, glad to know the issue is resolved.
> One unusual thing is that I can only specify one GPU to run; otherwise it will load the model repeatedly in the l2v.encode step.
I did not fully understand this. Can you provide more details? By default, encode tries to use all the GPUs available.
It's possible that I have the same issue. In order to run your code on my system, I have to comment out lines 341-361 in llm2vec.py. If I don't comment out lines 341-361, the following output occurs.
Also, I know this output is information overload. If you could direct me to some specific outputs / logs you need to better understand the issue, I will follow up with those details.
Nvidia-SMI for CUDA:0 Device (all other devices on my machine are still unused):
```
NVIDIA-SMI 535.161.07    Driver Version: 535.161.07    CUDA Version: 12.2
GPU 0: NVIDIA RTX A5000 (Bus-Id 00000000:1B:00.0), 24564MiB total

Mon May 13 09:35:33 2024      4MiB used,  0% util
Mon May 13 09:35:36 2024   2043MiB used, 31% util
Mon May 13 09:35:37 2024   4675MiB used, 20% util
Mon May 13 09:35:39 2024   6131MiB used, 17% util
Mon May 13 09:35:41 2024   9767MiB used, 20% util
Mon May 13 09:35:43 2024  11591MiB used, 21% util
Mon May 13 09:35:45 2024  13915MiB used, 21% util
Mon May 13 09:35:46 2024  15147MiB used, 21% util
Mon May 13 09:35:51 2024  16137MiB used,  1% util
Mon May 13 09:36:06 2024  21667MiB used, 18% util
```
Terminal outputs:
(test_mteb) slwanna@lepp:~/mteb$ python other_test.py
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.96it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:13<00:00, 3.26s/it]
Some weights of the model checkpoint at meta-llama/Meta-Llama-3-8B-Instruct were not used when initializing LlamaEncoderModel: ['lm_head.weight']
- This IS expected if you are initializing LlamaEncoderModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaEncoderModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 32513.98it/s]
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.92it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.55it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.25it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 9.58it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.83it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.80it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.81it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.89it/s]
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
Loading checkpoint shards: 0%| | 0/4 [00:08<?, ?it/s]
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
Traceback (most recent call last):
File "<string>", line 1, in <module>
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.68 GiB total capacity; 4.56 GiB already allocated; 128.00 KiB free; 4.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 1, in <module>
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s] File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
Traceback (most recent call last):
File "<string>", line 1, in <module>
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
return _run_module_code(code, init_globals, run_name,
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s] File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
Traceback (most recent call last):
main_content = runpy.run_path(main_path,
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
Traceback (most recent call last):
File "<string>", line 1, in <module>
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
new_value = value.to(device)
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.77it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.19it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.69it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.50it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 12.21it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.26it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 11.13it/s]
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.78it/s]
Loading checkpoint shards: 0%| | 0/4 [00:07<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.68 GiB total capacity; 4.19 GiB already allocated; 24.12 MiB free; 4.23 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s] exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
) = cls._load_pretrained_model(
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Loading checkpoint shards: 0%| | 0/4 [00:01<?, ?it/s]
[the same "RuntimeError: CUDA error: out of memory" traceback is printed, interleaved, by each of the remaining spawned worker processes]
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Downloading shards: 100%|██████████| 4/4 [00:00<00:00, 11.64it/s]
[each of the three messages above is printed once per spawned worker process, eight times in total]
^CTraceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in __init__
[ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 62, in <listcomp>
[ModifiedLlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 47, in __init__
self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
File "/home/slwanna/.cache/huggingface/modules/transformers_modules/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp/875d1f2e85efff4875b2ab8bcdad3e3269a8d2b3/modeling_llama_encoder.py", line 17, in __init__
super().__init__(*args, **kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 291, in __init__
self._init_rope()
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 295, in _init_rope
self.rotary_emb = LlamaRotaryEmbedding(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 112, in __init__
self.register_buffer("_cos_cached", emb.cos().to(torch.get_default_dtype()), persistent=False)
KeyboardInterrupt
[the same KeyboardInterrupt traceback is printed, interleaved, by the remaining worker processes, each interrupted at a slightly different point inside LlamaRotaryEmbedding.__init__]
Segmentation fault (core dumped)
Loading checkpoint shards: 0%| | 0/4 [00:02<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/slwanna/mteb/other_test.py", line 18, in <module>
model = AutoModel.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
KeyboardInterrupt
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 6 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
(test_mteb) slwanna@lepp:~/mteb$
Hi @SouLeo, can you share your other_test.py file? I will try to run it on my end. Also, how many GPUs are on your node when you run it, and what is the CPU RAM?
As a reference, our multi-GPU encoding implementation is similar to the sentence-transformers library's implementation.
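For readers unfamiliar with that pattern, here is a rough sketch of how sentence-transformers drives multi-GPU encoding. The pool/worker API names come from sentence-transformers itself; the main-guard requirement comes from Python's "spawn" start method rather than from anything llm2vec-specific, so treat this as an illustration of the general pattern, not a description of llm2vec's internals.

```python
# Sketch of sentence-transformers-style multi-process encoding (illustrative only).
# Each worker process loads its own copy of the model onto one GPU, so the
# launching script must be import-safe (guarded by __main__) when the
# multiprocessing "spawn" start method is used.
from sentence_transformers import SentenceTransformer

def main():
    model = SentenceTransformer("hkunlp/instructor-xl")
    sentences = ["first example sentence", "second example sentence"] * 100

    # Starts one worker process per visible CUDA device.
    pool = model.start_multi_process_pool()
    embeddings = model.encode_multi_process(sentences, pool, batch_size=32)
    model.stop_multi_process_pool(pool)
    print(embeddings.shape)

if __name__ == "__main__":
    main()
```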
Sure thing!
Here is other_test.py:
```python
from llm2vec import LLM2Vec
import torch
from transformers import AutoTokenizer, AutoModel, AutoConfig
from peft import PeftModel
from mteb import MTEB
MODEL_NAME = "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp"
# Loading base Llama-3 model, along with custom code that enables bidirectional connections in decoder-only LLMs. MNTP LoRA weights are merged into the base model.
tokenizer = AutoTokenizer.from_pretrained(
MODEL_NAME
)
config = AutoConfig.from_pretrained(
MODEL_NAME, trust_remote_code=True
)
model = AutoModel.from_pretrained(
MODEL_NAME,
trust_remote_code=True,
config=config,
torch_dtype=torch.float16,
device_map="cuda" if torch.cuda.is_available() else "cpu",
)
model = PeftModel.from_pretrained(
model,
MODEL_NAME,
)
model = model.merge_and_unload() # This can take several minutes on cpu
# Loading unsupervised SimCSE model. This loads the trained LoRA weights on top of MNTP model. Hence the final weights are -- Base model + MNTP (LoRA) + SimCSE (LoRA).
model = PeftModel.from_pretrained(
model, "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised"
)
# Wrapper for encoding and pooling operations
l2v = LLM2Vec(model, tokenizer, pooling_mode="mean", max_length=512)
# model_name = "llama3"
# evaluation = MTEB(tasks=["Banking77Classification"])
# results = evaluation.run(l2v, output_folder=f"results/{model_name}")
# Encoding queries using instructions
instruction = (
"Given a web search query, retrieve relevant passages that answer the query:"
)
queries = [
[instruction, "how much protein should a female eat"],
[instruction, "summit define"],
]
q_reps = l2v.encode(queries)
# Encoding documents. Instruction are not required for documents
documents = [
"As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
"Definition of summit for English Language Learners. : 1 the highest point of a mountain : the top of a mountain. : 2 the highest level. : 3 a meeting or series of meetings between the leaders of two or more governments.",
]
d_reps = l2v.encode(documents)
# Compute cosine similarity
q_reps_norm = torch.nn.functional.normalize(q_reps, p=2, dim=1)
d_reps_norm = torch.nn.functional.normalize(d_reps, p=2, dim=1)
cos_sim = torch.mm(q_reps_norm, d_reps_norm.transpose(0, 1))
print(cos_sim)
"""
tensor([[0.6470, 0.1619],
[0.0786, 0.5844]])
"""
CPU RAM:
```bash
(base) slwanna@lepp:~$ free -g
               total        used        free      shared  buff/cache   available
Mem:            1510           9        1338           0         163        1493
Swap:              1           1           0
```
I have 8 NVIDIA RTX A5000 GPUs on my node.
I will also look into the implementation you linked.
Hi all, I have moved my code to an internet-connected server with 8x H100s. I'm having similar issues with your multi-GPU .encode() function. See below.
I'm still investigating this and don't want to see this issue go stale, but I just wanted to double-check that you have tested your encode function on multi-GPU systems.
I have run the following sentence-transformers model as a test and had no issues:
$ python test_sentence_transformers.py
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("hkunlp/instructor-xl")
# Our sentences we like to encode
sentences = [
"This framework generates embeddings for each input sentence",
"Sentences are passed as a list of string.",
"The quick brown fox jumps over the lazy dog.",
]
# Sentences are encoded by calling model.encode()
sentence_embeddings = model.encode(sentences)
# Print the embeddings
for sentence, embedding in zip(sentences, sentence_embeddings):
print("Sentence:", sentence)
print("Embedding:", embedding)
print("")
However, when I run
$ python retrieval_scidocs.py
```python
import datasets
import torch
from llm2vec import LLM2Vec
# from beir import util
# from beir.datasets.data_loader import GenericDataLoader as BeirDataLoader
import os
from typing import Dict, List
# from beir.retrieval.evaluation import EvaluateRetrieval
dataset_name = "mteb/scidocs"
instruction = "Given a scientific paper title, retrieve paper abstracts that are cited by the given paper: "
print("Loading dataset...")
queries = datasets.load_dataset(dataset_name, "queries")
corpus = datasets.load_dataset(dataset_name, "corpus")
batch_size = 2
print("Loading model...")
model = LLM2Vec.from_pretrained(
"McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
peft_model_name_or_path="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised",
device_map="cuda" if torch.cuda.is_available() else "cpu",
attn_implementation="flash_attention_2",
torch_dtype=torch.bfloat16,
)
def append_instruction(instruction, sentences):
new_sentences = []
for s in sentences:
new_sentences.append([instruction, s, 0])
return new_sentences
def cos_sim(a: torch.Tensor, b: torch.Tensor):
if not isinstance(a, torch.Tensor):
a = torch.tensor(a)
if not isinstance(b, torch.Tensor):
b = torch.tensor(b)
if len(a.shape) == 1:
a = a.unsqueeze(0)
if len(b.shape) == 1:
b = b.unsqueeze(0)
a_norm = torch.nn.functional.normalize(a, p=2, dim=1)
b_norm = torch.nn.functional.normalize(b, p=2, dim=1)
return torch.mm(a_norm, b_norm.transpose(0, 1))
def encode_queries(queries: List[str], batch_size: int, **kwargs):
new_sentences = append_instruction(instruction, queries)
kwargs['show_progress_bar'] = False
return model.encode(new_sentences, batch_size=batch_size, **kwargs)
def encode_corpus(corpus: List[Dict[str, str]], batch_size: int, **kwargs):
if type(corpus) is dict:
sentences = [
(corpus["title"][i] + ' ' + corpus["text"][i]).strip()
if "title" in corpus
else corpus["text"][i].strip()
for i in range(len(corpus["text"]))
]
else:
sentences = [
(doc["title"] + ' ' + doc["text"]).strip() if "title" in doc else doc["text"].strip()
for doc in corpus
]
new_sentences = append_instruction("", sentences)
return model.encode(new_sentences, batch_size=batch_size, **kwargs)
print("Encoding Queries...")
query_ids = list(queries.keys())
results = {qid: {} for qid in query_ids}
queries = [queries[qid] for qid in queries]
query_embeddings = encode_queries(queries[0]['text'][:2], batch_size=batch_size, show_progress_bar=True, convert_to_tensor=True)
```
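As a rough sanity check on the out-of-memory errors below (back-of-the-envelope numbers, not measurements from this thread): a 7B-parameter model in bfloat16 needs on the order of 14 GB just for its weights, so if several spawned workers all try to materialize their own copy on the same default CUDA device, even an 80 GB H100 runs out of room.

```python
# Back-of-the-envelope estimate (assumed numbers, not measured):
params = 7.2e9          # ~7B parameters (Mistral-7B-class model)
bytes_per_param = 2     # bfloat16 / float16
workers = 8             # one spawned process per GPU on the node

weights_gb = params * bytes_per_param / 1e9
print(f"one copy of the weights  : ~{weights_gb:.0f} GB")
print(f"{workers} copies on one device : ~{workers * weights_gb:.0f} GB")
# ~14 GB per copy, ~115 GB for 8 copies -- more than a single 80 GB H100,
# which would match the repeated "CUDA out of memory" tracebacks below.
```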
I again get errors:
Loading dataset...
Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 12.28it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00, 1.05it/s]
Encoding Queries...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 53092.46it/s]
Loading dataset...
Loading model...
/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Downloading shards: 100%|██████████| 3/3 [00:00<00:00, 12.55it/s]
[the "Loading dataset...", "Loading model...", FutureWarning, and shard-download lines above are each repeated once per spawned worker process, eight times in total]
Loading checkpoint shards: 33%|██████████████████████████████ | 1/3 [00:02<00:05, 2.66s/it]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
model = LLM2Vec.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU
[the same "torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB." traceback is printed, interleaved, by the remaining worker processes while they load checkpoint shards]
Loading checkpoint shards: 33%|██████████████████████████████ | 1/3 [00:02<00:04, 2.26s/it]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
model = LLM2Vec.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU
Loading checkpoint shards: 33%|██████████████████████████████ | 1/3 [00:01<00:03, 1.78s/it]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:03<00:00, 1.21s/it] model = LLM2Vec.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:03<00:00, 1.30s/it]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4104, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 886, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 400, in set_module_tensor_to_device
new_value = value.to(device)
^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 21, in <module>
model = LLM2Vec.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 96, in from_pretrained
model = model_class.from_pretrained(base_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3735, in from_pretrained
dispatch_model(model, **device_map_kwargs)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/big_modeling.py", line 488, in dispatch_model
model.to(device)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2692, in to
return super().to(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1173, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
module._apply(fn)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
module._apply(fn)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
module._apply(fn)
[Previous line repeated 1 more time]
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 853, in _apply
self._buffers[key] = fn(buf)
^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1159, in convert
return t.to(
^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU
^C
(After the interrupt, each spawned worker process raises KeyboardInterrupt with its own traceback; the workers were still importing torch, llm2vec, peft, accelerate, and transformers when they were killed, so their tracebacks print interleaved.)
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
from llm2vec import LLM2Vec
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
from .llm2vec import LLM2Vec
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 10, in <module>
from peft import PeftModel
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
from .auto import (
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/auto.py", line 31, in <module>
from .config import PeftConfig
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/config.py", line 23, in <module>
from .utils import CONFIG_NAME, PeftType, TaskType
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/__init__.py", line 23, in <module>
from .other import (
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/other.py", line 21, in <module>
import accelerate
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/__init__.py", line 16, in <module>
from .accelerator import Accelerator
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/accelerator.py", line 35, in <module>
from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/checkpointing.py", line 24, in <module>
from .utils import (
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/__init__.py", line 182, in <module>
from .fsdp_utils import load_fsdp_model, load_fsdp_optimizer, save_fsdp_model, save_fsdp_optimizer
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/fsdp_utils.py", line 26, in <module>
import torch.distributed.checkpoint as dist_cp
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/__init__.py", line 2, in <module>
from .default_planner import DefaultLoadPlanner, DefaultSavePlanner
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/default_planner.py", line 13, in <module>
from torch.distributed._tensor import DTensor
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/__init__.py", line 6, in <module>
import torch.distributed._tensor.ops
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/__init__.py", line 2, in <module>
from .embedding_ops import * # noqa: F403
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/embedding_ops.py", line 8, in <module>
import torch.distributed._functional_collectives as funcol
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives.py", line 12, in <module>
from . import _functional_collectives_impl as fun_col_impl
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives_impl.py", line 36, in <module>
from torch._dynamo import assume_constant_result
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/__init__.py", line 2, in <module>
from . import convert_frame, eval_frame, resume_execution
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 31, in <module>
from torch.fx.experimental.symbolic_shapes import (
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/fx/experimental/symbolic_shapes.py", line 63, in <module>
from torch.utils._sympy.functions import FloorDiv, Mod, IsNonOverlappingAndDenseIndicator
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/utils/_sympy/functions.py", line 1, in <module>
import sympy
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/__init__.py", line 30, in <module>
from sympy.core.cache import lazy_function
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/__init__.py", line 9, in <module>
from .expr import Expr, AtomicExpr, UnevaluatedExpr
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/expr.py", line 4159, in <module>
from .mul import Mul
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/mul.py", line 2193, in <module>
from .numbers import Rational
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/sympy/core/numbers.py", line 5, in <module>
import fractions
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/fractions.py", line 23, in <module>
_RATIONAL_FORMAT = re.compile(r"""
^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/__init__.py", line 227, in compile
return _compile(pattern, flags)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/__init__.py", line 294, in _compile
p = _compiler.compile(pattern, flags)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_compiler.py", line 745, in compile
p = _parser.parse(p, flags)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 989, in parse
p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
itemsappend(_parse(source, state, verbose, nested + 1,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 872, in _parse
p = _parse_sub(source, state, sub_verbose, nested + 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
itemsappend(_parse(source, state, verbose, nested + 1,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 872, in _parse
p = _parse_sub(source, state, sub_verbose, nested + 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
itemsappend(_parse(source, state, verbose, nested + 1,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 872, in _parse
p = _parse_sub(source, state, sub_verbose, nested + 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
itemsappend(_parse(source, state, verbose, nested + 1,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 872, in _parse
p = _parse_sub(source, state, sub_verbose, nested + 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 464, in _parse_sub
itemsappend(_parse(source, state, verbose, nested + 1,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 687, in _parse
item = subpattern[-1:]
~~~~~~~~~~^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 170, in __getitem__
return SubPattern(self.state, self.data[index])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/re/_parser.py", line 113, in __init__
def __init__(self, state, data=None):
KeyboardInterrupt
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
from llm2vec import LLM2Vec
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
from .llm2vec import LLM2Vec
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 10, in <module>
from peft import PeftModel
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/__init__.py", line 22, in <module>
from .auto import (
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/auto.py", line 31, in <module>
from .config import PeftConfig
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/config.py", line 23, in <module>
from .utils import CONFIG_NAME, PeftType, TaskType
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/__init__.py", line 23, in <module>
from .other import (
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/peft/utils/other.py", line 21, in <module>
import accelerate
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/__init__.py", line 16, in <module>
from .accelerator import Accelerator
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/accelerator.py", line 35, in <module>
from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/checkpointing.py", line 24, in <module>
from .utils import (
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/__init__.py", line 182, in <module>
from .fsdp_utils import load_fsdp_model, load_fsdp_optimizer, save_fsdp_model, save_fsdp_optimizer
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/accelerate/utils/fsdp_utils.py", line 26, in <module>
import torch.distributed.checkpoint as dist_cp
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/__init__.py", line 2, in <module>
from .default_planner import DefaultLoadPlanner, DefaultSavePlanner
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/checkpoint/default_planner.py", line 13, in <module>
from torch.distributed._tensor import DTensor
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/__init__.py", line 6, in <module>
import torch.distributed._tensor.ops
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/__init__.py", line 2, in <module>
from .embedding_ops import * # noqa: F403
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_tensor/ops/embedding_ops.py", line 8, in <module>
import torch.distributed._functional_collectives as funcol
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives.py", line 12, in <module>
from . import _functional_collectives_impl as fun_col_impl
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/distributed/_functional_collectives_impl.py", line 36, in <module>
from torch._dynamo import assume_constant_result
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/__init__.py", line 2, in <module>
from . import convert_frame, eval_frame, resume_execution
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 40, in <module>
from . import config, exc, trace_rules
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/trace_rules.py", line 50, in <module>
from .variables import (
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/variables/__init__.py", line 4, in <module>
from .builtin import BuiltinVariable
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/variables/builtin.py", line 42, in <module>
from .ctx_manager import EventVariable, StreamVariable
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/variables/ctx_manager.py", line 12, in <module>
from ..device_interface import get_interface_for_device
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/_dynamo/device_interface.py", line 198, in <module>
for i in range(torch.cuda.device_count()):
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/cuda/__init__.py", line 748, in device_count
nvml_count = -1 if torch.version.hip else _device_count_nvml()
^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/cuda/__init__.py", line 709, in _device_count_nvml
raw_cnt = _raw_device_count_nvml()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/cuda/__init__.py", line 617, in _raw_device_count_nvml
rc = nvml_h.nvmlInit()
^^^^^^^^^^^^^^^^^
KeyboardInterrupt
^CSegmentation fault (core dumped)
(test_mteb) slwanna@dracarys:~/code_projects/llm2vec_mteb/llm2vec/examples$ Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
prepare(preparation_data)
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/examples/retrieval_scidocs.py", line 3, in <module>
from llm2vec import LLM2Vec
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/__init__.py", line 1, in <module>
from .llm2vec import LLM2Vec
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/llm2vec.py", line 21, in <module>
from .models import (
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/models/__init__.py", line 1, in <module>
from .bidirectional_mistral import MistralBiModel, MistralBiForMNTP
File "/home/slwanna/code_projects/llm2vec_mteb/llm2vec/llm2vec/models/bidirectional_mistral.py", line 4, in <module>
from transformers import (
File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1501, in __getattr__
value = getattr(module, name)
^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1500, in __getattr__
module = self._get_module(self._class_to_module[name])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1510, in _get_module
return importlib.import_module("." + module_name, self.__name__)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/models/mistral/modeling_mistral.py", line 48, in <module>
if is_flash_attn_2_available():
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 759, in is_flash_attn_2_available
if not torch.cuda.is_available():
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/slwanna/miniconda3/envs/test_mteb/lib/python3.11/site-packages/torch/cuda/__init__.py", line 118, in is_available
return torch._C._cuda_getDeviceCount() > 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
Hi @SouLeo,
Apologies for the delay in responding.
When you run
$ python test_sentence_transformers.py
I believe only one GPU is being used, as Sentence Transformers uses a different method for multi-GPU encoding.
Regarding running multi-GPU with LLM2Vec, the code needs to be shielded with an `if __name__ == "__main__":` guard. Otherwise, CUDA runs into issues when spawning new processes. This is a requirement for Sentence Transformers multi-GPU support as well.
I have modified your script and verified that it runs on an 8xH100 server.
import datasets
import torch
from llm2vec import LLM2Vec

# from beir import util
# from beir.datasets.data_loader import GenericDataLoader as BeirDataLoader
import os
from typing import Dict, List

# from beir.retrieval.evaluation import EvaluateRetrieval


def append_instruction(instruction, sentences):
    new_sentences = []
    for s in sentences:
        new_sentences.append([instruction, s, 0])
    return new_sentences


def cos_sim(a: torch.Tensor, b: torch.Tensor):
    if not isinstance(a, torch.Tensor):
        a = torch.tensor(a)
    if not isinstance(b, torch.Tensor):
        b = torch.tensor(b)
    if len(a.shape) == 1:
        a = a.unsqueeze(0)
    if len(b.shape) == 1:
        b = b.unsqueeze(0)
    a_norm = torch.nn.functional.normalize(a, p=2, dim=1)
    b_norm = torch.nn.functional.normalize(b, p=2, dim=1)
    return torch.mm(a_norm, b_norm.transpose(0, 1))


def encode_queries(queries: List[str], batch_size: int, **kwargs):
    new_sentences = append_instruction(instruction, queries)
    kwargs['show_progress_bar'] = False
    return model.encode(new_sentences, batch_size=batch_size, **kwargs)


def encode_corpus(corpus: List[Dict[str, str]], batch_size: int, **kwargs):
    if type(corpus) is dict:
        sentences = [
            (corpus["title"][i] + ' ' + corpus["text"][i]).strip()
            if "title" in corpus
            else corpus["text"][i].strip()
            for i in range(len(corpus["text"]))
        ]
    else:
        sentences = [
            (doc["title"] + ' ' + doc["text"]).strip() if "title" in doc else doc["text"].strip()
            for doc in corpus
        ]
    new_sentences = append_instruction("", sentences)
    return model.encode(new_sentences, batch_size=batch_size, **kwargs)


if __name__ == "__main__":
    dataset_name = "mteb/scidocs"
    instruction = "Given a scientific paper title, retrieve paper abstracts that are cited by the given paper: "

    print("Loading dataset...")
    queries = datasets.load_dataset(dataset_name, "queries")
    corpus = datasets.load_dataset(dataset_name, "corpus")
    batch_size = 2

    print("Loading model...")
    model = LLM2Vec.from_pretrained(
        "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
        peft_model_name_or_path="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised",
        device_map="cuda" if torch.cuda.is_available() else "cpu",
        attn_implementation="flash_attention_2",
        torch_dtype=torch.bfloat16,
    )

    print("Encoding Queries...")
    query_ids = list(queries.keys())
    results = {qid: {} for qid in query_ids}
    queries = [queries[qid] for qid in queries]
    query_embeddings = encode_queries(queries[0]['text'][:2], batch_size=batch_size, show_progress_bar=True, convert_to_tensor=True)
Please check if this script is working on your end, and feel free to ask any other question.
@vaibhavad Here is a short snippet of the logs showing that when I specify two GPUs, the model is loaded repeatedly.
logs:
Traceback (most recent call last):
File "
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s] Loading checkpoint shards: 33%|███▎ | 1/3 [00:01<00:03, 1.90s/it] Loading checkpoint shards: 67%|██████▋ | 2/3 [00:03<00:01, 1.80s/it]1 Start loading model
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s] Loading checkpoint shards: 100%|██████████| 3/3 [00:05<00:00, 1.68s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:05<00:00, 1.72s/it]1
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s] Model and configuration loaded successfully!
0%| | 0/1 [00:00<?, ?it/s] 100%|██████████| 1/1 [00:00<00:00, 24966.10it/s] Loading checkpoint shards: 33%|███▎ | 1/3 [00:03<00:07, 3.87s/it] Loading checkpoint shards: 33%|███▎ | 1/3 [00:03<00:07, 3.74s/it] Loading checkpoint shards: 67%|██████▋ | 2/3 [00:07<00:03, 3.64s/it] Loading checkpoint shards: 67%|██████▋ | 2/3 [00:07<00:03, 3.56s/it]1
Start loading model Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s] Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00, 3.40s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00, 3.48s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00, 3.35s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00, 3.42s/it] Loading checkpoint shards: 33%|███▎ | 1/3 [00:03<00:07, 3.76s/it]
Hello @Georgepitt,
Can you share the test_example.py file that you are using?
Of course! Here are test_example.py and my job submission script. I run test_example.py by submitting it as a job, e.g. `sbatch emxample.sh`. The above error occurs when the number of GPUs (#SBATCH -G) specified in emxample.sh is greater than 1.
emxample.sh
#!/bin/bash
#SBATCH -p gpu_se
#SBATCH -n 1
#SBATCH -G 2
#SBATCH -o /share/home/chenyuxuan/Research_CodeSearch/llm2v/llm2vec_test/run_out/job_exmaple.out
~/.conda/envs/LLM2Vec/bin/python /share/home/llm2v/llm2vec_test/test_example.py
test_example.py
```python
import json
import os

import numpy as np
import torch
from llm2vec import LLM2Vec

os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["HF_HOME"] = "/share/home/chenyuxuan/.cache/huggingface/hub"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Local snapshot directories for the MNTP and unsupervised SimCSE checkpoints.
path = '/share/home/.cache/huggingface/hub/models--McGill-NLP--LLM2Vec-Mistral-7B-Instruct-v2-mntp/snapshots/5ec8e6444af63627e7609f38641de612c6de0105'
path2 = "/share/home/.cache/huggingface/hub/models--McGill-NLP--LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse/snapshots/2c055a5d77126c0d3dc6cd8ffa30e2908f4f45f8"

print("Start loading model")
l2v = LLM2Vec.from_pretrained(
    path,
    peft_model_name_or_path=path2,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)
print("Model and configuration loaded successfully!")

# Encoding queries using instructions
instruction = (
    "Given a web search query, retrieve relevant passages that answer the query:"
)
queries = [
    [instruction, "how much protein should a female eat"],
    [instruction, "summit define"],
]
q_reps = l2v.encode(queries)

# Encoding documents. Instructions are not required for documents.
documents = [
    "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
    "Definition of summit for English Language Learners. : 1 the highest point of a mountain : the top of a mountain. : 2 the highest level. : 3 a meeting or series of meetings between the leaders of two or more governments.",
]
d_reps = l2v.encode(documents)

# Compute cosine similarity
q_reps_norm = torch.nn.functional.normalize(q_reps, p=2, dim=1)
d_reps_norm = torch.nn.functional.normalize(d_reps, p=2, dim=1)
cos_sim = torch.mm(q_reps_norm, d_reps_norm.transpose(0, 1))
print(cos_sim)
"""
tensor([[0.6470, 0.1619],
        [0.0786, 0.5844]])
"""
```
Hi @Georgepitt, please refer to my response above
Regarding running multi-GPU with LLM2Vec, the code needs to be guarded with if __name__ == "__main__":. Otherwise, CUDA runs into issues when spawning new processes; this is a requirement for multi-GPU support in sentence-transformers as well.
You'll need to modify test_example.py accordingly (see the sketch below). Let me know if you have any more questions.
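For illustration, here is a minimal sketch of test_example.py with the entry-point guard added; the local paths are placeholders, and everything else follows the script above:

```python
import torch
from llm2vec import LLM2Vec


def main():
    # Placeholders for the local snapshot directories used in the script above.
    mntp_path = "<LOCAL PATH to the mntp snapshot>"
    simcse_path = "<LOCAL PATH to the unsup-simcse snapshot>"

    l2v = LLM2Vec.from_pretrained(
        mntp_path,
        peft_model_name_or_path=simcse_path,
        device_map="cuda" if torch.cuda.is_available() else "cpu",
        torch_dtype=torch.bfloat16,
    )

    # Any call that may spawn worker processes (e.g. multi-GPU encoding)
    # has to happen inside the guarded entry point.
    reps = l2v.encode(["how much protein should a female eat"])
    print(reps.shape)


if __name__ == "__main__":
    main()
```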
Thank you very much for your help, @vaibhavad! I successfully ran the project locally. In fact, the really important settings are in the adapter_config.json file, which contains the two fields "base_model_name_or_path" and "parent_library"; if you map them to local paths, the model will run.
This is my setup:
```python
import os
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["HF_HOME"] = ""
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```
One unusual thing is that I can only run on a single GPU; otherwise the model is loaded repeatedly in the l2v.encode step.
I am working on the inference demo locally, and I run into the error huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name'. Did you see the same error and resolve it? What I did after downloading the model was modify base_model_name_or_path to my local dir (I didn't modify parent_library since I am not sure how) and rename the original xxxx.safetensor to model.safetensor to avoid another issue.
In fact, the really important settings are in the adapter_config.json file, which contains the two fields "base_model_name_or_path" and "parent_library"; if you map them to local paths, the model will run.
I'm a bit confused about how to "map locally" as you mentioned. Could you give an example of how to modify the "base_model_name_or_path" and "parent_library" fields in the adapter_config.json file? Thanks a lot!
Suppose I already have the Llama 3 weights downloaded; is there any way I can use them with this library?
@saikot-paul - Yes, the llm2vec adapters are automatically applied on top of your downloaded Llama 3 weights if you follow the model loading instructions described here.
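For concreteness, a minimal sketch, assuming the Llama 3 weights were downloaded through the Hugging Face hub and already sit in the local cache (the model ids are the ones used earlier in this thread, and access to the gated base repository is still required):

```python
import torch
from llm2vec import LLM2Vec

# The base meta-llama/Meta-Llama-3-8B-Instruct weights are resolved from the
# local Hugging Face cache, so they are not downloaded again.
l2v = LLM2Vec.from_pretrained(
    "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp",
    peft_model_name_or_path="McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-unsup-simcse",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)
```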
Closing as it is stale. Feel free to re-open if you have any more questions.
Hello, the computing cluster provided by the lab needs to run offline, but the code in the usage instructions needs network access. I have changed the code to an offline version, but it still gives errors. Can you give me some help, please?
usage code:
```python
import torch
from peft import PeftModel
from transformers import AutoTokenizer, AutoConfig, AutoModel

# Loading base Mistral model, along with custom code that enables bidirectional
# connections in decoder-only LLMs.
tokenizer = AutoTokenizer.from_pretrained(
    "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp"
)
config = AutoConfig.from_pretrained(
    "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp", trust_remote_code=True
)
model = AutoModel.from_pretrained(
    "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp",
    trust_remote_code=True,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
)

# Loading MNTP (Masked Next Token Prediction) model.
model = PeftModel.from_pretrained(
    model,
    "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp",
)
```
Modified code:
```python
local_base_model_path = "/home/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp"
tokenizer = AutoTokenizer.from_pretrained(local_base_model_path)
config = AutoConfig.from_pretrained(local_base_model_path)
model = AutoModel.from_pretrained(
    local_base_model_path,
    config=config,
    torch_dtype=torch.bfloat16,
    local_files_only=True,
)
print(4)
```
errors:
```
Traceback (most recent call last):
  File "/share/home/chenyuxuan/Llama3_8b_s.py", line 59, in <module>
    model = AutoModel.from_pretrained(local_model_path, config=config, torch_dtype=torch.bfloat16, local_files_only=True)
  File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3385, in from_pretrained
    if has_file(pretrained_model_name_or_path, TF2_WEIGHTS_NAME, **has_file_kwargs):
  File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/transformers/utils/hub.py", line 627, in has_file
    r = requests.head(url, headers=headers, allow_redirects=False, proxies=proxies, timeout=10)
  File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/api.py", line 100, in head
    return request("head", url, **kwargs)
  File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/share/home/chenyuxuan/.conda/envs/LLM2Vec/lib/python3.8/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)
```