edgar971 / open-chat

A self-hosted, offline, ChatGPT-like chatbot with support for different LLMs. 100% private, with no data leaving your device.
MIT License

Start in Unraid not working #3

Closed zenkraker closed 10 months ago

zenkraker commented 10 months ago

Hi, I get this error while starting the docker container in Unraid:

I downloaded the model you provided as the download URL into the models folder, so I have it locally, but I always get this error:

```
/models/llama-2-13b-chat.bin model found.
stdout 02/09/2023 12:08:53 Initializing server with:
stdout 02/09/2023 12:08:53 Batch size: 2096
stdout 02/09/2023 12:08:53 Number of CPU threads: 16
stdout 02/09/2023 12:08:53 Context window: 4096
stderr 02/09/2023 12:08:58 /usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
stderr 02/09/2023 12:08:58
stderr 02/09/2023 12:08:58 You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',).
stderr 02/09/2023 12:08:58   warnings.warn(
stderr 02/09/2023 12:08:58 llama.cpp: loading model from /models/llama-2-13b-chat.bin
stdout 02/09/2023 12:09:02 /models/llama-2-13b-chat.bin model found.
stdout 02/09/2023 12:09:02 Initializing server with:
stdout 02/09/2023 12:09:02 Batch size: 2096
stdout 02/09/2023 12:09:02 Number of CPU threads: 16
stdout 02/09/2023 12:09:02 Context window: 4096
stderr 02/09/2023 12:09:04 /usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
stderr 02/09/2023 12:09:04
stderr 02/09/2023 12:09:04 You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',).
```

johnny2678 commented 10 months ago

same error here

jimserio commented 10 months ago

I think the Python error is just a warning. Regardless, here's my complete log. Says it starts up and then the container stops.

```
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',).
  warnings.warn(
llama.cpp: loading model from /models/llama-2-13b-chat.bin
/models/llama-2-13b-chat.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 24
Context window: 4096

> ai-chatbot-starter@0.1.0 start
> next start

ready - started server on 0.0.0.0:3000, url: http://localhost:3000
```

Dustinhoefer commented 10 months ago

The model path in the unraid docker settings is still wrong:

Download URL: https://huggingface.co/TheBloke/Nous-Hermes-Llama-2-7B-GGML/resolve/main/nous-hermes-llama-2-7b.ggmlv3.q4_0.bin

Local Model Path: /models/llama-2-7b.bin

But even when these are correct, the container still exits.

tezgno commented 10 months ago

I'm getting cuda out of memory errors when trying to load the CUDA version:

CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory

I get this with the 16b and 7b files. Server has 128GB of RAM.

jimserio commented 10 months ago

> I'm getting cuda out of memory errors when trying to load the CUDA version:
>
> CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory
>
> I get this with the 16b and 7b files. Server has 128GB of RAM.

Server memory makes no difference with CUDA. Look at the container log and you will see how much VRAM is needed vs. available. I only have a 2 GB P400, which wasn't enough, but I'm also running CP.AI, which is using 1400 MB.
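
For anyone hitting the same thing, a quick way to compare what is free on the card before the container starts. This is just a sketch; it assumes `nvidia-smi` is available on the Unraid host and that the card is visible to the driver:

```bash
# Per-GPU memory totals (what the model load has to fit into)
nvidia-smi --query-gpu=name,memory.total,memory.used,memory.free --format=csv

# Processes currently holding VRAM (e.g. CP.AI or a display server)
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
```

If another service already holds most of the card, the llama.cpp CUDA allocation will fail no matter how much system RAM the server has.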

dgpugliese commented 10 months ago

Same issue here: it just exits after the model is downloaded.

rj-d2 commented 10 months ago

Same here. Just installed on Unraid 6.12.4; the container stops right after it starts.

```
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',).
  warnings.warn(
llama.cpp: loading model from /models/llama-2-7b-chat.bin
/models/llama-2-7b-chat.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 12
Context window: 4096

> ai-chatbot-starter@0.1.0 start
> next start

ready - started server on 0.0.0.0:3000, url: http://localhost:3000
```

tezgno commented 10 months ago

> I'm getting cuda out of memory errors when trying to load the CUDA version: CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory I get this with the 16b and 7b files. Server has 128GB of RAM.

> Server memory makes no difference with CUDA. Look at the container log and you will see how much VRAM is needed vs. available. I only have a 2 GB P400, which wasn't enough, but I'm also running CP.AI, which is using 1400 MB.

I'm running an RTX A2000 12GB, so if it's running entirely on the GPU I would expect at least the 7B to start. I'll try switching to CPU and see if that makes a difference.

horphi0815 commented 10 months ago

Do we need to set the GPU settings in the Docker template?

`--runtime=nvidia`

Container Variable: NVIDIA_VISIBLE_DEVICES
Container Variable: NVIDIA_DRIVER_CAPABILITIES
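
For reference, the usual NVIDIA passthrough setup translates to something like the sketch below when run by hand. The image name and host path are placeholders, not this project's published tags; in the Unraid template this corresponds to `--runtime=nvidia` in Extra Parameters plus the two container variables above:

```bash
# Placeholder image name and host path; adjust to your setup.
docker run -d \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  -p 3000:3000 \
  -v /mnt/user/appdata/open-chat/models:/models \
  <open-chat-cuda-image>
```

These flags only control GPU visibility, so they shouldn't be the reason the CPU image exits.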

horphi0815 commented 10 months ago

Container won't start, but no error logs are shown...

```
CUDA Version 12.2.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License. By pulling and using the container, you accept the terms and conditions of this license: https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

/models/llama-2-7b-chat.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 24
Context window: 4096
```

tezgno commented 10 months ago

Swapped for CPU version and still not starting. Same errors about the namespace and such.

dgpugliese commented 10 months ago

I'm using the CPU version as well. It just ends after the image downloads.

timocapa commented 10 months ago

What CPUs do you people have?

The error is illegal instruction, which to me seems like the CPU doesn't support specific instruction sets that this expects. I believe it currently expects AVX2 as a minimum.
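
A quick way to confirm what the host CPU actually reports (run on the Unraid host itself; these are standard `/proc/cpuinfo` queries, nothing specific to this project):

```bash
# List every AVX-related flag the CPU exposes
grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u

# Simple yes/no check for AVX2
grep -q avx2 /proc/cpuinfo && echo "AVX2 supported" || echo "no AVX2"
```

Keep in mind this only shows what the CPU supports, not what the image was compiled for; a binary built with newer extensions (e.g. AVX-512) will still trap with an illegal instruction on these CPUs.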

tezgno commented 10 months ago

> What CPUs do you people have?
>
> The error is illegal instruction, which to me seems like the CPU doesn't support specific instruction sets that this expects. I believe it currently expects AVX2 as a minimum.

I’m running on an AMD Ryzen 7 3700X, which is a Zen 2 architecture. Zen 2 supports AVX and AVX2.

johnny2678 commented 10 months ago

> What CPUs do you people have?

i9 10900 here

rj-d2 commented 10 months ago

> What CPUs do you people have?

AMD Ryzen 5 5600

Dustinhoefer commented 10 months ago

i5-8500

zenkraker commented 10 months ago

> The model path in the unraid docker settings is still wrong:
>
> Download URL: https://huggingface.co/TheBloke/Nous-Hermes-Llama-2-7B-GGML/resolve/main/nous-hermes-llama-2-7b.ggmlv3.q4_0.bin
>
> Local Model Path: /models/llama-2-7b.bin
>
> But even when these are correct, the container still exits.

Yes, I noticed that I had to fix the URL path, but even after fixing it and downloading the file manually, the container still does not start. CPU: Ryzen 2700X.

corndog2000 commented 10 months ago

i7-7700K

Wallayy commented 10 months ago

My Docker container will run, but when it starts it times out and says I do not have an OpenAI API key linked. It tells me to enter it in the bottom left, but it's nowhere to be found.

Wallayy commented 10 months ago

> My Docker container will run, but when it starts it times out and says I do not have an OpenAI API key linked. It tells me to enter it in the bottom left, but it's nowhere to be found.

Never mind, I checked the logs and saw it was out of VRAM. I changed the GPU layers under More Settings in the setup from 64 to 32 on a 2060.

tezgno commented 10 months ago

So, I was able to get the CUDA version to work after cutting the number of GPU layers from 64 to 32. Looks like it's using almost 4GB of VRAM in that configuration. Not sure why 64 doesn't work on my card, but OK.

The CPU version doesn't work at all. It gives an error that complains about the CPU, but all of the CPUs listed here support AVX2. Can we confirm that the CPU version is built for x86_64/AMD64? I'm thinking it may be the wrong architecture.

jppoeck commented 10 months ago

Hi guys, same problem here: it cannot start after the model download. I also set it to 32 and nothing.

My CPU: Intel Xeon E5-2695 v2

KrX3D commented 10 months ago

I have an Intel Core i3-8100, which seems to have AVX/AVX2. I also use an Nvidia GT 1030 with 2 GB of VRAM in my NAS.

It also failed, but when I set the GPU layers from 64 to 16 it seems to work now.

sharpsounds commented 10 months ago

Same issue here. Looks like it's an issue with the pydantic library. Seen here and here. @uogbuji then replies and says the newer release fixed the issue.

I have no idea how to do this, but comparing 0.1.69 to 0.1.70 of abetlen/llama-cpp-python might reveal a fix.

tezgno commented 10 months ago

> Same issue here. Looks like it's an issue with the pydantic library. Seen here and here. @uogbuji then replies and says the newer release fixed the issue.
>
> I have no idea how to do this, but comparing 0.1.69 to 0.1.70 of abetlen/llama-cpp-python might reveal a fix.

That is the issue behind the model warning you see in the Docker logs, but it isn't what causes the container to stop. The container isn't starting because of exit code 132 (you'll notice this on the right when the container shuts down). A Docker exit code 132 means the process inside the container received an illegal instruction (SIGILL). This is normally caused when your CPU doesn't support the instruction sets the image was built for (usually AVX, AVX2, or SSE). However, based on the CPUs people have mentioned here, all of them support AVX, AVX2, and SSE. My theory is that the container may be built for a different CPU architecture altogether (such as AMD64_AVX, which isn't necessarily the same as AMD64 with AVX instructions).
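
To verify the exit code and the signal outside the Unraid UI, standard Docker and kernel checks work (the container name below is a placeholder):

```bash
# 132 = 128 + 4, i.e. the process died from SIGILL (illegal instruction)
docker inspect --format '{{.State.ExitCode}}' <container-name>

# The kernel logs the trap on the host as well
dmesg | grep -i 'invalid opcode'
```

If the exit code is 132 and dmesg shows an invalid-opcode trap in libllama.so, that points at an instruction-set mismatch rather than anything in the app itself.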

timocapa commented 10 months ago

Or just simply requires AVX512.

tezgno commented 10 months ago

> Or just simply requires AVX512.

That could be. Unfortunately it's difficult to tell which at this point.

JDolven commented 10 months ago

Is this issue relevant?

When I try to start the container I get Docker exit code 132 and this in the Unraid log: "traps: python3[15228] trap invalid opcode ip:14d65822c872 sp:7fffc70fbc50 error:0 in libllama.so[14d658211000+62000]"

CPU: i9-9900K
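
That trap in libllama.so fits the SIGILL theory. One way to check whether the shipped library actually contains AVX-512 instructions is to disassemble it inside the image. This is only a sketch: it assumes binutils is present in the container, and it guesses the library path from the dist-packages path in the logs above; the image name is a placeholder.

```bash
# A non-zero count of zmm/vpternlog references suggests the build requires AVX-512.
docker run --rm --entrypoint bash <open-chat-cpu-image> -c \
  "objdump -d /usr/local/lib/python3.10/dist-packages/llama_cpp/libllama.so | grep -cE 'zmm|vpternlog'"
```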

edgar971 commented 10 months ago

> I'm getting cuda out of memory errors when trying to load the CUDA version:
>
> CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory
>
> I get this with the 16b and 7b files. Server has 128GB of RAM.

Can you lower the number of GPU layers under Show More Settings? It's set to 64 by default.
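
For anyone scripting this instead of using the template UI, a sketch of the same change. `N_GPU_LAYERS` is how llama-cpp-python's server names this setting, but whether this container reads that exact variable is an assumption, so check what the template's GPU-layers field actually sets:

```bash
# Hypothetical: offload 32 layers instead of the default 64 to fit smaller cards.
# Image name and host path are placeholders.
docker run -d --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all \
  -e N_GPU_LAYERS=32 \
  -p 3000:3000 -v /mnt/user/appdata/open-chat/models:/models <open-chat-cuda-image>
```

Every offloaded layer costs VRAM, so 2-12 GB cards generally need far fewer than 64.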

edgar971 commented 10 months ago

> Same issue here. Looks like it's an issue with the pydantic library. Seen here and here. @uogbuji then replies and says the newer release fixed the issue. I have no idea how to do this, but comparing 0.1.69 to 0.1.70 of abetlen/llama-cpp-python might reveal a fix.

> That is the issue behind the model warning you see in the Docker logs, but it isn't what causes the container to stop. The container isn't starting because of exit code 132 (you'll notice this on the right when the container shuts down). A Docker exit code 132 means the process inside the container received an illegal instruction (SIGILL). This is normally caused when your CPU doesn't support the instruction sets the image was built for (usually AVX, AVX2, or SSE). However, based on the CPUs people have mentioned here, all of them support AVX, AVX2, and SSE. My theory is that the container may be built for a different CPU architecture altogether (such as AMD64_AVX, which isn't necessarily the same as AMD64 with AVX instructions).

The CPU version is having this issue. Everything works as expected using the Docker image I built locally on Ubuntu, but not with the GitHub-built version. Here's my script:

```yaml
- run: docker buildx create --use
- run: docker buildx build --platform linux/amd64 -f Dockerfile-${{ matrix.type }} --tag $IMAGE_NAME:${{ github.ref_name }} --push .
- run: docker buildx build --platform linux/amd64 -f Dockerfile-${{ matrix.type }} --tag $IMAGE_NAME:latest --push .
```
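
One way to test the architecture theory would be to pin llama.cpp's instruction-set flags at build time instead of letting the CI runner's CPU decide. A sketch of the idea, assuming the Dockerfile installs llama-cpp-python from source (the exact install line in this repo may differ):

```bash
# Keep the AVX/AVX2 baseline but disable native tuning and AVX-512 so the
# resulting wheel runs on the CPUs reported in this thread.
CMAKE_ARGS="-DLLAMA_NATIVE=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=ON -DLLAMA_AVX=ON" \
  FORCE_CMAKE=1 pip install --no-cache-dir llama-cpp-python
```

If the GitHub runner builds with native tuning on an AVX-512 machine, the published image would SIGILL on any CPU without those extensions, which would match the exit code 132 reports above.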

tezgno commented 10 months ago

> I'm getting cuda out of memory errors when trying to load the CUDA version: CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory I get this with the 16b and 7b files. Server has 128GB of RAM.

> Can you lower the number of GPU layers under Show More Settings? It's set to 64 by default.

I did, and when lowered it does work. The CPU version does not work.

horphi0815 commented 10 months ago

> I'm getting cuda out of memory errors when trying to load the CUDA version: CUDA error 2 at /tmp/pip-install-p6rdknxd/llama-cpp-python_6cf58cabdb694a60bef40fdd1b9f4b5f/vendor/llama.cpp/ggml-cuda.cu:6301: out of memory I get this with the 16b and 7b files. Server has 128GB of RAM.

> Can you lower the number of GPU layers under Show More Settings? It's set to 64 by default.

Reducing it to 1 also does not work in my case.

RTX 2080 Ti

edgar971 commented 10 months ago

> Same issue here. Looks like it's an issue with the pydantic library. Seen here and here. @uogbuji then replies and says the newer release fixed the issue. I have no idea how to do this, but comparing 0.1.69 to 0.1.70 of abetlen/llama-cpp-python might reveal a fix.

> That is the issue behind the model warning you see in the Docker logs, but it isn't what causes the container to stop. The container isn't starting because of exit code 132 (you'll notice this on the right when the container shuts down). A Docker exit code 132 means the process inside the container received an illegal instruction (SIGILL). This is normally caused when your CPU doesn't support the instruction sets the image was built for (usually AVX, AVX2, or SSE). However, based on the CPUs people have mentioned here, all of them support AVX, AVX2, and SSE. My theory is that the container may be built for a different CPU architecture altogether (such as AMD64_AVX, which isn't necessarily the same as AMD64 with AVX instructions).

The latest build fixes the exit code 132 issue. I added libopenblas-dev and things appear to be working as expected.

horphi0815 commented 10 months ago

How can I get an error log? The container does not start and gives no indication of where the error could occur.

Only the CUDA version shows any information:

```
CUDA Version 12.2.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License. By pulling and using the container, you accept the terms and conditions of this license: https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

/models/llama-2-7b.bin model found.
Initializing server with:
Batch size: 2096
Number of CPU threads: 24
Context window: 4096

Press ANY KEY to close this window
```
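
The container's stdout/stderr usually survive the crash, so the log can still be pulled after it exits. The container name below is a placeholder; on Unraid the same output is behind the container's log icon:

```bash
# Last output lines from the stopped container ("Press ANY KEY" comes from the
# Unraid log window, not the container itself)
docker logs --tail 200 <container-name> 2>&1
```

If it stops right after "Context window: 4096" with nothing else, checking the exit code as described earlier in the thread is the next step.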

corndog2000 commented 10 months ago

> i7-7700K

The latest version of the image worked for me on CPU!

Dustinhoefer commented 10 months ago

> i7-7700K
>
> The latest version of the image worked for me on CPU!

Same for me; very slow and the results were quite bad. Uninstalled again :/

timocapa commented 10 months ago

> very slow

I'm not sure what you expected.

edgar971 commented 10 months ago

Yeah, CPU is slow tbh.

jamescochran commented 9 months ago

I stumbled upon this tonight, and with the Unraid template for CUDA this is still an issue. I'm seeing the same as @horphi0815 up there. I don't see anywhere that error logs would be stored, but I did pull the repo and run it with docker compose and still don't see any logs with `find . -iname "*log*"`.