Same error for me as well.
I got the same error when running in WSL Ubuntu:
`$ uname -a`
`Linux DESKTOP-40049K6 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux`

I got the same error when running in WSL Ubuntu:
`$ uname -a`
`Linux D2 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux`
Same error here
Same error here on Colab
I have solved it with a CPU installation by installing this: https://github.com/krychu/llama instead of https://github.com/facebookresearch/llama

Complete process to install:

1. Download the original version of Llama from https://github.com/facebookresearch/llama and extract it to a `llama-main` folder.
2. Download the CPU version from https://github.com/krychu/llama, extract it, and replace the files in the `llama-main` folder.
3. Run the `download.sh` script in a terminal and, when prompted, paste the URL you were provided to start the download.
4. Go to the `llama-main` folder.
5. Create a Python 3 env, `python3 -m venv env`, and activate it: `source env/bin/activate`
6. Install the CPU version of PyTorch: `python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu` (a quick way to verify this step is sketched below, after the `torchrun` command).
7. Install the llama dependencies: `python3 -m pip install -e .`
8. If you have downloaded llama-2-7b, run:

torchrun --nproc_per_node 1 example_text_completion.py \
    --ckpt_dir llama-2-7b/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 128 --max_batch_size 1  # (instead of 4)
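After step 6, here is a quick way to confirm that the CPU-only build of PyTorch is what actually got installed in the venv (just a sanity-check sketch, not part of the official instructions):

```python
# sanity check for the CPU-only PyTorch install
import torch

print(torch.__version__)          # wheels from the /whl/cpu index should end in "+cpu"
print(torch.cuda.is_available())  # expected to be False with the CPU-only build
```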
Nice!!! But is there no way to use it on GPU? My best guess is there might be a problem with the latest version of torchvision.
I'm not a PyTorch expert, so I don't know what the problem was; we'll have to wait and see how Facebook reacts. I used PyTorch about 10 years ago, when it was a small library. Today I don't understand anything of what I do with it, lol.
I'm getting the same issue on Apple M1 Max
pzim-devdata Thanks a lot, it's working. I have a few questions: 1) It takes too much time to generate a response; how can I reduce the time? My PC configuration is 16 GB RAM and a Core i5 12th-gen processor. 2) What is the difference between llama-2-7b, llama-2-7b-chat, llama-2-13b and llama-2-13b-chat? 3) What is max_batch_size? What is temperature? What is a token?
Yes, it's very slow. This solution is just for trying Llama; you will need to run it on your GPU once the bug is fixed. CUDA only works with Nvidia video cards; if you have an AMD video card you have to install PyTorch with ROCm, but I don't know if Llama works with ROCm. The difference between llama-2-7b and llama-2-7b-chat is that llama-2-7b just completes the sentence in the prompt, while the chat version is a question/answer model that you can keep prompting. 7B works with 1 GPU, 13B needs a minimum of 2 GPUs, and 70B a minimum of 8. With your configuration, the best solution is to play with Llama on a website made for it: https://chat.lmsys.org/
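For reference, those GPU counts correspond to the model-parallel values documented in the official repo's README (7B → 1, 13B → 2, 70B → 8), which is also the number passed to `--nproc_per_node`. A tiny illustrative helper (not from the repo, just a sketch):

```python
# Model-parallel sizes from the llama README: 7B -> 1 GPU, 13B -> 2, 70B -> 8.
# This is also the value passed to torchrun's --nproc_per_node.
MP_SIZE = {"7b": 1, "13b": 2, "70b": 8}

def nproc_per_node(model_name: str) -> int:
    """Return how many processes/GPUs torchrun should launch for a model name."""
    for size, nproc in MP_SIZE.items():
        if size in model_name.lower():  # e.g. "llama-2-13b-chat" matches "13b"
            return nproc
    raise ValueError(f"unknown model size in {model_name!r}")

print(nproc_per_node("llama-2-7b"))        # 1
print(nproc_per_node("llama-2-13b-chat"))  # 2
```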
Thanks
Still getting this error as well. WSL2 with a 3090 (not interested in running CPU only, interested in it running on the 3090)
Same error with Red Hat, a single V100 GPU, and >300 GB RAM. Any solution?
If anyone is still facing an issue, do one of the following.

Add `@record` over the main function and it will give you a proper traceback (a fuller sketch follows below):

from torch.distributed.elastic.multiprocessing.errors import record

@record
def main(...)

Or go to `/var/log/kern.log` and check the message on the last line; it will show you whether it's because of insufficient VRAM.
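In the Llama example scripts the entry point is wired through `fire.Fire(main)` (as the tracebacks in this thread show), so the decorator goes on that `main`. A minimal sketch of what the top of an example script could look like with `@record` applied (the argument list here is abbreviated, and the body is just a placeholder):

```python
# sketch: wrap the entry point so torchrun reports the real traceback
# instead of only ChildFailedError
import fire
from torch.distributed.elastic.multiprocessing.errors import record

@record
def main(ckpt_dir: str, tokenizer_path: str,
         max_seq_len: int = 128, max_batch_size: int = 4):
    # ... the existing body of the example script stays unchanged ...
    pass

if __name__ == "__main__":
    fire.Fire(main)
```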
TLDR: try changing batch size from 4 to any number greater than 4. Changing 4 to 6 worked for me.
I was having the same error message, `torch.distributed.elastic.multiprocessing.errors.ChildFailedError`. When I tried Rahul's method of adding `@record`, it turned out I was having an assertion error due to batch size:
File "/home/soma1/docs/mine/llama/llama/generation.py", line 117, in generate
assert bsz <= params.max_batch_size, (bsz, params.max_batch_size)
AssertionError: (6, 4)
The tuple in the AssertionError is (bsz, params.max_batch_size), i.e. 6 prompts against a limit of 4. So I tried the following, and it worked without any problem!
torchrun --nproc_per_node 1 example_text_completion.py \
--ckpt_dir llama-2-7b/ \
--tokenizer_path tokenizer.model \
--max_seq_len 128 --max_batch_size 6 #(instead of 4)
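In other words, the assertion fires when the number of prompts sent in one batch exceeds `max_batch_size`. A quick, illustrative way to sanity-check your own prompt list against the value you pass on the command line (the prompt strings here are placeholders):

```python
# Illustrative check mirroring the assert in llama/generation.py:
# bsz (number of prompts in the batch) must not exceed max_batch_size.
prompts = [
    "first prompt ...",
    "second prompt ...",
    # ... however many prompts your example script actually sends ...
]
max_batch_size = 6  # must match the --max_batch_size passed to torchrun

assert len(prompts) <= max_batch_size, (len(prompts), max_batch_size)
```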
Same error; changing max_batch_size to any number did not help. Using Windows 11 with 32 GB RAM and an RTX 3090 with 24 GB VRAM. Trying different versions of CUDA and PyTorch also did not help. Any other ideas? Here is my error:
```
(llama2env) PS Y:\231125 LLAMA2\llama-main> torchrun --nproc_per_node 1 example_chat_completion.py --ckpt_dir ..\llama-2-7b-chat\ --tokenizer_path tokenizer.model --max_seq_len 512 --max_batch_size 4
[2023-11-27 20:35:09,370] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
[W socket.cpp:663] [c10d] The client socket has failed to connect to [TROG2020]:29500 (system error: 10049 - Die angeforderte Adresse ist in diesem Kontext ungültig.).
[W socket.cpp:663] [c10d] The client socket has failed to connect to [TROG2020]:29500 (system error: 10049 - Die angeforderte Adresse ist in diesem Kontext ungültig.).
initializing model parallel with size 1
initializing ddp with size 1
initializing pipeline with size 1
Traceback (most recent call last):
  File "Y:\231125 LLAMA2\llama-main\example_chat_completion.py", line 106, in <module>
    fire.Fire(main)
  File "Y:\231125 LLAMA2\llama2env\Lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "Y:\231125 LLAMA2\llama2env\Lib\site-packages\fire\core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "Y:\231125 LLAMA2\llama2env\Lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "Y:\231125 LLAMA2\llama-main\example_chat_completion.py", line 37, in main
    generator = Llama.build(
  File "Y:\231125 LLAMA2\llama-main\llama\generation.py", line 116, in build
    tokenizer = Tokenizer(model_path=tokenizer_path)
  File "Y:\231125 LLAMA2\llama-main\llama\tokenizer.py", line 24, in __init__
    assert os.path.isfile(model_path), model_path
AssertionError: tokenizer.model
[2023-11-27 20:35:19,398] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 20040) of binary: Y:\231125 LLAMA2\llama2env\Scripts\python.exe
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "Y:\231125 LLAMA2\llama2env\Scripts\torchrun.exe\__main__.py", line 7, in <module>
  File "Y:\231125 LLAMA2\llama2env\Lib\site-packages\torch\distributed\elastic\multiprocessing\errors\__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "Y:\231125 LLAMA2\llama2env\Lib\site-packages\torch\distributed\run.py", line 806, in main
    run(args)
  File "Y:\231125 LLAMA2\llama2env\Lib\site-packages\torch\distributed\run.py", line 797, in run
    elastic_launch(
  File "Y:\231125 LLAMA2\llama2env\Lib\site-packages\torch\distributed\launcher\api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "Y:\231125 LLAMA2\llama2env\Lib\site-packages\torch\distributed\launcher\api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
example_chat_completion.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-11-27_20:35:19
  host      : XXX
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 20040)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
```
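As an aside, the root failure in that particular log appears not to be CUDA-related: it is `assert os.path.isfile(model_path), model_path` raising `AssertionError: tokenizer.model`, i.e. the tokenizer file is not found at the relative path passed via `--tokenizer_path`. A quick check from the same working directory (just a sketch; the second candidate path is only a guess based on the `--ckpt_dir` used above and may not match your layout):

```python
import os

# Hypothetical check: adjust these paths to wherever your download actually
# placed tokenizer.model.
candidates = [
    "tokenizer.model",
    r"..\llama-2-7b-chat\tokenizer.model",
]
for path in candidates:
    print(path, "->", "found" if os.path.isfile(path) else "missing")
```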
I've been running Llama and other models through ooba and haven't been using this anymore; ooba works fine.
Nice answer; I hit this error exactly because of CPU OOM.
I found that when I deleted all the '\' characters and ran everything on a single line, `torchrun --nproc_per_node 1 example_text_completion.py --ckpt_dir llama-2-7b/ --tokenizer_path tokenizer.model --max_seq_len 128 --max_batch_size 4`, the error went away. (The trailing `\` is a shell line continuation; if the command is pasted onto one line, the backslashes presumably end up escaping the spaces and the arguments are no longer parsed correctly.)
I downloaded llama-2-7b and ran the command as mentioned, but got this error.