cernyjan opened 1 month ago
The problem was in the ending and format of the file: after deleting the blank trailing newline and converting from CRLF to LF, it started to work. :)
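For reference, this is roughly the cleanup that did it; a minimal sketch assuming GNU sed, with start-ollama.sh as a hypothetical stand-in for whichever file was affected:

FILE=start-ollama.sh   # hypothetical name; substitute the affected file

# Convert CRLF line endings to LF in place (GNU sed)
sed -i 's/\r$//' "$FILE"

# Strip trailing blank lines, leaving exactly one final newline
# ("$(cat ...)" drops all trailing newlines; printf re-adds one)
printf '%s\n' "$(cat "$FILE")" > "$FILE"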
Anyway, chat with ollama still does not work at the very end. :(
Unfortunately I also see that Intel integrated graphics for 13th Gen processors and older ("Intel(R) Iris(R) Xe Graphics") is not supported. It does appear to work with 14th Gen processors and their integrated graphics. I can update the README.md to clarify these findings.
The integrated Iris Xe GPU on a 12th Gen Intel Core i7-12700H (Alder Lake-P GT2 [Iris Xe Graphics], driver: i915) works fine.
If you run it on Linux and have two GPUs like me, only one GPU can be provided to the ollama service at a time to get it working.
$ inxi -G
Graphics:
Device-1: Intel Alder Lake-P GT2 [Iris Xe Graphics] driver: i915 v: kernel
Device-2: Intel DG2 [Arc A770M] driver: i915 v: kernel
The device should be mapped explicitly in docker-compose.yml, instead of mapping the full /dev/dri.
List system devices:
$ lsgpu
card1 Intel Dg2 (Gen12) drm:/dev/dri/card1
└─renderD129 drm:/dev/dri/renderD129
card0 Intel Alderlake_p (Gen12) drm:/dev/dri/card0
└─renderD128 drm:/dev/dri/renderD128
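If lsgpu is not available, the same card-to-render-node mapping can be read from sysfs/udev; the device names below match my listing and may differ on your machine:

# Render nodes appear next to their card under the same PCI path
ls -l /dev/dri/by-path/

# Or ask udev which PCI device backs a specific render node
udevadm info --query=path --name=/dev/dri/renderD129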
Docker Compose configuration:
services:
  ollama-intel-gpu:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: ollama-intel-gpu
    image: ollama-intel-gpu:latest
    restart: always
    devices:
      - /dev/dri/renderD129:/dev/dri/renderD129
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix
      - ollama-intel-gpu:/root/.ollama
volumes:
  # declared here so the excerpt works standalone
  ollama-intel-gpu: {}
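To sanity-check that the container really sees only the mapped device, something like this should work (assuming the image ships oneAPI's sycl-ls, which the Dockerfile here appears to install):

docker compose up -d
docker exec ollama-intel-gpu ls -l /dev/dri

# List the SYCL devices visible inside the container
# (if sycl-ls is not on PATH, source /opt/intel/oneapi/setvars.sh first)
docker exec ollama-intel-gpu sycl-ls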
I would like to note that the inference speed through the integrated GPU, judged visually, is less than two times slower, which is still an excellent result compared to the speed on a bare CPU.
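To put numbers on that visual estimate, ollama can report token rates itself; for example (llama3.2 is just a placeholder model name):

# --verbose prints timing stats, including the eval rate in tokens/s
docker exec -it ollama-intel-gpu ollama run --verbose llama3.2 "Say hello."

Running the same prompt against each setup makes the iGPU/dGPU/CPU comparison concrete.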
If you want to use both GPUs at once, this is how to run them in parallel:
services:
  ollama-intel-arc-gpu:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: ollama-intel-arc-gpu
    image: ollama-intel-gpu:latest
    restart: always
    shm_size: "32gb" # <-- not sure if it helps
    #privileged: true # <-- don't do this, otherwise all of /dev/dri is exposed to the container
    devices:
      #- /dev/dri:/dev/dri
      - /dev/dri/renderD129:/dev/dri/renderD129
      #- /dev/dri/renderD128:/dev/dri/renderD128
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix
      - ollama-intel-gpu:/root/.ollama
    environment:
      - DISPLAY=${DISPLAY}
    env_file:
      - .env
  ollama-intel-cpu-gpu:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: ollama-intel-cpu-gpu
    image: ollama-intel-gpu:latest
    restart: always
    shm_size: "32gb" # <-- not sure if it helps
    #privileged: true # <-- don't do this, otherwise all of /dev/dri is exposed to the container
    devices:
      - /dev/dri/renderD128:/dev/dri/renderD128
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix
      - ollama-intel-gpu:/root/.ollama
    environment:
      - DISPLAY=${DISPLAY}
    env_file:
      - .env
  ollama-webui:
    image: ghcr.io/open-webui/open-webui:v0.3.10
    container_name: ollama-webui
    volumes:
      - ollama-webui:/app/backend/data
    depends_on:
      - ollama-intel-arc-gpu
      - ollama-intel-cpu-gpu
    ports:
      - ${OLLAMA_WEBUI_PORT-3000}:8080
    environment:
      - OLLAMA_BASE_URL=http://ollama-intel-arc-gpu:11434;http://ollama-intel-cpu-gpu:11434
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped
volumes:
  ollama-webui: {}
  ollama-intel-gpu: {}
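With both services up, each backend can be checked independently from the host; the commands below assume the ollama binary is on PATH inside the image:

# Each container runs its own server on its own localhost:11434
docker exec ollama-intel-arc-gpu ollama list
docker exec ollama-intel-cpu-gpu ollama list

Open WebUI should then spread requests across the two URLs given in OLLAMA_BASE_URL, and both instances share the same ollama-intel-gpu volume, so models are stored only once.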
I installed (downloaded) llama3.2 1b and I get the Ollama 500 error. I would install another model if I knew for sure that it would work; my data cap is coming close, so all these extra models will literally add up. Any suggestions? Running an Intel Arc A770 16 GB, driver is good, etc. Thanks in advance.
@BDDwaCT, yesterday I updated libraries in main. Could you rebuild?
What time UTC did you do this? I freshly downloaded at approximately 11:00 PM UTC on 10/24/24.
Just curious if you had already completed the update. Also, should I try a different library, or is it something else in your opinion?
I know I said libraries in my last comment; what I meant was a different LLM model. Thanks
> I installed (downloaded) llama3.2 1b and I get the Ollama 500 error. I would install another model if I knew for sure that it would work; my data cap is coming close, so all these extra models will literally add up. Any suggestions? Running an Intel Arc A770 16 GB, driver is good, etc. Thanks in advance.
I'm reproducing the same issue with a fresh pull of this repo. In the log I see this error: "ollama_llama_server: error while loading shared libraries: libmkl_sycl_blas.so.4: cannot open shared object file: No such file or directory". But in the container I see:
root@10637c90384e:/opt/intel/oneapi# find . -name libmkl_sycl_blas.so*
./2025.0/lib/libmkl_sycl_blas.so
./2025.0/lib/libmkl_sycl_blas.so.5
./mkl/2025.0/lib/libmkl_sycl_blas.so
./mkl/2025.0/lib/libmkl_sycl_blas.so.5
This seems to be a library version mismatch between ipex-llm and oneAPI.
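If the versions cannot be aligned right away, one unsupported stopgap is a compatibility symlink inside the container, using the paths from the find output above; this assumes the .so.5 ABI is close enough to .so.4, which is not guaranteed, so treat it as a diagnostic hack rather than a fix:

# Hack only: point the missing .so.4 name at the installed .so.5
# (the real fix is matching ipex-llm and oneAPI versions)
ln -s /opt/intel/oneapi/mkl/2025.0/lib/libmkl_sycl_blas.so.5 \
      /opt/intel/oneapi/mkl/2025.0/lib/libmkl_sycl_blas.so.4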
@BDDwaCT can you try the fix in https://github.com/mattcurf/ollama-intel-gpu/pull/6 and report back if that resolves your issue?
EDIT: Please see my comment in #6. Thanks
I will try here in just a minute. However, I just wanted to share that inside my /opt/intel/oneapi, as of right this minute, it shows this:
root@af5042ac6a4d:/opt/intel/oneapi# find . -name libmkl_sycl_blas.so*
./2024.2/lib/libmkl_sycl_blas.so
./2024.2/lib/libmkl_sycl_blas.so.4
./mkl/2024.2/lib/libmkl_sycl_blas.so
./mkl/2024.2/lib/libmkl_sycl_blas.so.4
Just FYI. Now I will go and try the fix in #6 mentioned above and will report back. Thanks
Hi, I am facing this problem after running 'docker-compose -f docker-compose-wsl2.yml up' on a Windows notebook with an Intel GPU:
Please help, thank you in advance.
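Without the error text, only generic first steps apply; on WSL2 the GPU is exposed to Linux as /dev/dxg, and the container log usually shows why ollama failed (service name assumed to match the Linux compose file above):

# Capture the failing service's log output
docker-compose -f docker-compose-wsl2.yml logs ollama-intel-gpu

# Confirm WSL2 exposes the GPU paravirtualization device
ls -l /dev/dxg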