Closed zenkraker closed 1 year ago
same error here
I think the Python error is just a warning. Regardless, here's my complete log. Says it starts up and then the container stops.
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
  warnings.warn(
llama.cpp: loading model from /models/llama-2-13b-chat.bin
/models/llama-2-13b-chat.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 24
Context window: 4096

> ai-chatbot-starter@0.1.0 start
> next start
ready - started server on 0.0.0.0:3000, url: http://localhost:3000
The model path in the unraid docker settings is still wrong:
Download URL: https://huggingface.co/TheBloke/Nous-Hermes-Llama-2-7B-GGML/resolve/main/nous-hermes-llama-2-7b.ggmlv3.q4_0.bin
Local Model Path: /models/llama-2-7b.bin
But even when these are correct, the docker still exits.
I'm getting cuda out of memory errors when trying to load the CUDA version:
CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory
I get this with the 16b and 7b files. Server has 128GB of RAM.
> I'm getting cuda out of memory errors when trying to load the CUDA version:
> CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory
> I get this with the 16b and 7b files. Server has 128GB of RAM.
Server memory makes no difference with CUDA. Look at the container log and you will see how much VRAM is needed vs. available. I only have a 2GB P400, which wasn't enough, but I'm also running CP.AI, which is using 1400MB.
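If anyone wants to double-check that before pulling the container again, plain `nvidia-smi` on the host (nothing specific to this template) shows how much VRAM is actually free:

```bash
# Per-GPU memory summary; compare the "free" column with the VRAM figure
# reported in the container log when the model loads.
nvidia-smi --query-gpu=name,memory.total,memory.used,memory.free --format=csv
```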
Same issue with it just exiting after model is downloaded.
Same here. Just installed on Unraid 6.12.4; the docker stops right after it starts.
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
warnings.warn(
llama.cpp: loading model from /models/llama-2-7b-chat.bin
/models/llama-2-7b-chat.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 12
Context window: 4096
> ai-chatbot-starter@0.1.0 start
> next start
ready - started server on 0.0.0.0:3000, url: http://localhost:3000
> I'm getting cuda out of memory errors when trying to load the CUDA version:
> CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory
> I get this with the 16b and 7b files. Server has 128GB of RAM.

> Server memory makes no difference with CUDA. Look at the container log and you will see how much VRAM is needed vs. available. I only have a 2GB P400, which wasn't enough, but I'm also running CP.AI, which is using 1400MB.
I'm running an RTX A2000 12GB, so if it's running completely on the GPU I would at least expect the 7B to start. I'll try switching to CPU and see if that makes a difference.
Do we need to set up the GPU settings in the docker template?
--runtime=nvidia
Container Variable: NVIDIA_VISIBLE_DEVICES
Container Variable: NVIDIA_DRIVER_CAPABILITIES
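For reference, outside of the Unraid template the same GPU passthrough settings map onto a plain `docker run` roughly like below. The image name, port and model path are placeholders I made up, not the project's published values:

```bash
# Hypothetical equivalent of the Unraid template's GPU settings:
# the NVIDIA runtime plus the two container variables mentioned above.
docker run -d \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  -v /mnt/user/appdata/models:/models \
  -p 3000:3000 \
  example/llama-chat:cuda   # placeholder image tag
```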
Container won't start, but no error logs are shown... 
CUDA Version 12.2.0
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License. By pulling and using the container, you accept the terms and conditions of this license: https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
/models/llama-2-7b-chat.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 24
Context window: 4096
Swapped for CPU version and still not starting. Same errors about the namespace and such.
I'm using the CPU version as well. It just ends after the image downloads.
What CPUs do you people have?
The error is `illegal instruction`, which to me seems like the CPU doesn't support specific instruction sets that this expects. I believe it currently expects AVX2 as a minimum.
> What CPUs do you people have?
> The error is `illegal instruction`, which to me seems like the CPU doesn't support specific instruction sets that this expects. I believe it currently expects AVX2 as a minimum.
I’m running on an AMD Ryzen 7 3700X, which is a Zen 2 architecture. Zen 2 supports AVX and AVX2.
> What CPUs do you people have?
i9 10900 here
> What CPUs do you people have?
AMD Ryzen 5 5600
i5-8500
> The model path in the unraid docker settings is still wrong:
> Download URL: https://huggingface.co/TheBloke/Nous-Hermes-Llama-2-7B-GGML/resolve/main/nous-hermes-llama-2-7b.ggmlv3.q4_0.bin
> Local Model Path: /models/llama-2-7b.bin
> But even when these are correct, the docker still exits.
Yes, I noticed that and fixed the URL path, but even after fixing it and downloading the file manually, the container still does not start. CPU: Ryzen 2700X
i7-7700K
My docker will run, but when it starts it times out and says I do not have an OpenAI API linked. It tells me to input it in the bottom left, but that is nowhere to be found.
> My docker will run, but when it starts it times out and says I do not have an OpenAI API linked. It tells me to input it in the bottom left, but that is nowhere to be found.
Never mind, I saw the logs and it was out of VRAM. I changed the GPU layers under More Settings in setup to 32 instead of 64 on a 2060.
So, I was able to get the CUDA version to work after cutting the number from 64 to 32. Looks like it’s using almost 4GB of vram in that configuration. Not sure why 64 doesn’t work on my card, but ok.
CPU version doesn’t work at all. It’s giving an error that complains about the CPU, but all of the ones listed here support AVX2. Can we confirm that the CPU version is for x86_64 or AMD64? I’m thinking it may be the wrong architecture.
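If the container simply wraps the stock llama-cpp-python server (an assumption on my part, not something I've verified against this image), the GPU layers setting corresponds to its `--n_gpu_layers` option, so the same 64 vs. 32 experiment can be reproduced outside the template:

```bash
# Sketch, assuming the stock llama-cpp-python OpenAI-compatible server:
# offloading fewer layers keeps more of the model in system RAM and less in VRAM.
python3 -m llama_cpp.server \
  --model /models/llama-2-7b-chat.bin \
  --n_gpu_layers 32 \
  --n_ctx 4096 \
  --n_batch 2096 \
  --host 0.0.0.0 --port 8000
```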
Hi guys, same problem here: the container cannot start after the model download. I also set it to 32 and nothing changed.
My CPU: Intel Xeon E5-2695 v2
I have an Intel i3-8100 CPU, which seems to have AVX / AVX2. I also use an Nvidia GT 1030 with 2GB VRAM on my NAS.
It also failed, but when I set the GPU layers from 64 to 16 it seems to work now.
Same issue here. Looks like it's an issue from the pydantic library. Seen here and here. @uogbuji then replies and says the newer release fixed the issue.
I have no idea how to do this, but comparing 0.1.69 to 0.1.70 for abetlen/llama-cpp-python might reveal a fix.
That is the issue around the model warning error that you see inside of the docker logs, but that isn't what causes the docker container to not start. The container isn't starting due to Exit Code 132 (you'll notice this on the right when the container shuts down). A Docker Exit Code 132 means that the docker container itself (not the app running inside of it) issued an Illegal Instruction to the CPU. This is normally caused when your CPU doesn't support the instruction sets that the docker is written for (usually an AVX, AVX2 or SSE instruction set). However, based on the CPUs that people have mentioned here, all of them would support AVX, AVX2, and SSE. My theory is that the docker container may be written for a different CPU architecture altogether (such as AMD64_AVX, which isn't necessarily the same as AMD64 with AVX instructions).
Or just simply requires AVX512.
> Or just simply requires AVX512.
That could be. Unfortunately it's difficult to see which at this point.
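One way to narrow that down is to check which instruction-set flags the host CPU actually advertises (run on the Unraid host itself, not inside the container):

```bash
# Print the SIMD-related CPU flags the kernel reports; if avx2 is missing,
# exit code 132 (SIGILL) is exactly what a binary built with -mavx2 produces.
grep -o -w -E 'avx|avx2|avx512f|fma|f16c|sse4_2' /proc/cpuinfo | sort -u
```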
Is this issue relevant?
When I try to start the Docker I get Exit Code 132 and this in the Unraid log: "traps: python3[15228] trap invalid opcode ip:14d65822c872 sp:7fffc70fbc50 error:0 in libllama.so[14d658211000+62000]"
CPU: i9-9900K
> I'm getting cuda out of memory errors when trying to load the CUDA version:
> CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory
> I get this with the 16b and 7b files. Server has 128GB of RAM.
Can you lower the number of GPU layers under Show More Settings? It's set to 64 by default.
> Same issue here. Looks like it's an issue from the pydantic library. Seen here and here. @uogbuji then replies and says the newer release fixed the issue. I have no idea how to do this, but comparing 0.1.69 to 0.1.70 for abetlen/llama-cpp-python might reveal a fix.

> That is the issue around the model warning error that you see inside of the docker logs, but that isn't what causes the docker container to not start. The container isn't starting due to Exit Code 132 (you'll notice this on the right when the container shuts down). A Docker Exit Code 132 means that the docker container itself (not the app running inside of it) issued an Illegal Instruction to the CPU. This is normally caused when your CPU doesn't support the instruction sets that the docker is written for (usually an AVX, AVX2 or SSE instruction set). However, based on the CPUs that people have mentioned here, all of them would support AVX, AVX2, and SSE. My theory is that the docker container may be written for a different CPU architecture altogether (such as AMD64_AVX, which isn't necessarily the same as AMD64 with AVX instructions).
The CPU version is having this issue. Everything works as expected using the Docker image I built locally on Ubuntu, but not the GitHub version. Here's my script:

```yaml
- run: docker buildx create --use
- run: docker buildx build --platform linux/amd64 -f Dockerfile-${{ matrix.type }} --tag $IMAGE_NAME:${{ github.ref_name }} --push .
- run: docker buildx build --platform linux/amd64 -f Dockerfile-${{ matrix.type }} --tag $IMAGE_NAME:latest --push .
```
> I'm getting cuda out of memory errors when trying to load the CUDA version:
> CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory
> I get this with the 16b and 7b files. Server has 128GB of RAM.

> Can you lower the number of GPU layers under Show More Settings? It's set to 64 by default.
I did, and when lowered it does work. The CPU version does not work.
> I'm getting cuda out of memory errors when trying to load the CUDA version:
> CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory
> I get this with the 16b and 7b files. Server has 128GB of RAM.

> Can you lower the number of GPU layers under Show More Settings? It's set to 64 by default.
Reducing it to 1 also does not work in my case.
RTX 2080 Ti
> Same issue here. Looks like it's an issue from the pydantic library. Seen here and here. @uogbuji then replies and says the newer release fixed the issue. I have no idea how to do this, but comparing 0.1.69 to 0.1.70 for abetlen/llama-cpp-python might reveal a fix.

> That is the issue around the model warning error that you see inside of the docker logs, but that isn't what causes the docker container to not start. The container isn't starting due to Exit Code 132 (you'll notice this on the right when the container shuts down). A Docker Exit Code 132 means that the docker container itself (not the app running inside of it) issued an Illegal Instruction to the CPU. This is normally caused when your CPU doesn't support the instruction sets that the docker is written for (usually an AVX, AVX2 or SSE instruction set). However, based on the CPUs that people have mentioned here, all of them would support AVX, AVX2, and SSE. My theory is that the docker container may be written for a different CPU architecture altogether (such as AMD64_AVX, which isn't necessarily the same as AMD64 with AVX instructions).
The latest build fixes the Exit Code 132. I added libopenblas-dev and things appear to be working as expected.
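For anyone still stuck on an older CPU, another hypothetical workaround (not what the published image does) is to rebuild llama-cpp-python inside the container with the optional instruction-set optimizations turned off. `CMAKE_ARGS` and `FORCE_CMAKE` are the build knobs llama-cpp-python documents; the specific flag values here are just an example:

```bash
# Rebuild the wheel with AVX2/FMA/F16C disabled so it only uses baseline
# x86_64 instructions: slower, but it avoids SIGILL (exit code 132).
FORCE_CMAKE=1 \
CMAKE_ARGS="-DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF" \
  pip install --no-cache-dir --force-reinstall llama-cpp-python
```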
How can I get an error log? The container does not start and shows no indication of where the error could occur. Only the CUDA version shows information:
```
CUDA Version 12.2.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License. By pulling and using the container, you accept the terms and conditions of this license: https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

/models/llama-2-7b.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 24
Context window: 4096

Press ANY KEY to close this window
```
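Even after the container has exited you can usually still pull its output and exit code from the Docker CLI on the host; the container name below is a placeholder for whatever the Unraid template created:

```bash
# Everything the container printed before it stopped, plus its exit code
# (132 = illegal instruction / SIGILL, 137 = killed, often out of memory).
docker logs llama-gpt                                    # placeholder name
docker inspect --format '{{.State.ExitCode}}' llama-gpt
```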
> i7-7700K
The latest version of the image worked for me on CPU!
> i7-7700K
> The latest version of the image worked for me on CPU!
Same for me, very slow and the results were quite bad. Uninstalled again :/
> very slow
I'm not sure what you expected
Yeah, CPU is slow tbh.
I stumbled upon this tonight, and with the Unraid template for CUDA this is still an issue. Seeing the same as @horphi0815 up there. I don't see anywhere that error logs would be stored, but I did pull the repo and run it with docker compose, and I still don't see any logs with `find . -iname "*log*"`.
Hi, I get this error while starting the docker container in Unraid:
I uploaded the model that you passed as the URL into the model folder so I have it locally, but I always get this error:
/models/llama-2-13b-chat.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 16
Context window: 4096
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
  warnings.warn(
llama.cpp: loading model from /models/llama-2-13b-chat.bin

/models/llama-2-13b-chat.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 16
Context window: 4096
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.