immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

immich-machine-learning openvino not working (transcoding works) #9523

Closed Str1atum closed 5 months ago

Str1atum commented 5 months ago

The bug

Hi all,

I set up Immich in Docker on Ubuntu 24.04, running in a Proxmox VM with PCIe passthrough for my Intel 13th-gen Iris Xe. So far intel_gpu_top and hardware transcoding work fine in Immich (I can see the utilization in intel_gpu_top and the speed is fine), but I cannot get the ML part to use the GPU with OpenVINO. The only error in the logs is: WARNING No GPU device found in OpenVINO. Falling back to CPU

The OS that Immich Server is running on

Ubuntu 24.04

Version of Immich Server

v1.105.1

Version of Immich Mobile App

v1.105.1

Platform with the issue

Your docker-compose.yml content

immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: openvino # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: unless-stopped

Your .env content

standard .env. just with folders / passwords changed

Reproduction steps

see above

Relevant log output

WARNING No GPU device found in OpenVINO. Falling back to CPU

Additional information

No response

mertalev commented 5 months ago

I'm guessing you're either missing a driver, or 24.04 is too new for the OpenVINO version in the container. Also, what model is the CPU? I last tested it on a 13700H.

Str1atum commented 5 months ago

It's i9-13900H

I tried with bookworm Debian 6.1 kernel but had no success at all.

mertalev commented 5 months ago

Hmm, you can try looking at some of the drivers mentioned in this page. This page also suggests using --group-add. For Compose files, this would mean first running stat -c "%g" /dev/dri/render*, then adding the output as:

...
group_add:
  - <output>

Also, for what it's worth, I tested on Fedora.
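
The group_add step described above can be scripted. What follows is a minimal sketch, not part of Immich: the function names device_gids and group_add_yaml are hypothetical, and it assumes the device nodes live under /dev/dri on the host. It collects the owning group IDs of the render and card nodes and prints them as a Compose fragment.

```python
import glob
import os


def device_gids(patterns=("/dev/dri/render*", "/dev/dri/card*")):
    """Return the sorted, de-duplicated group IDs owning the matching device nodes."""
    gids = set()
    for pattern in patterns:
        for path in glob.glob(pattern):
            gids.add(os.stat(path).st_gid)
    return sorted(gids)


def group_add_yaml(gids):
    """Format group IDs as a Compose `group_add` fragment (hypothetical helper)."""
    return "\n".join(["group_add:"] + [f"  - {gid}" for gid in gids])


if __name__ == "__main__":
    gids = device_gids()
    if gids:
        print(group_add_yaml(gids))
    else:
        print("No device nodes found under /dev/dri")
```

Run it on the Docker host rather than inside the container, since the numeric IDs must match the host's device ownership.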

Str1atum commented 5 months ago

No success. The driver installation went smoothly, but nothing changed. I also added this to my hwaccel.ml.yml:

openvino: device_cgroup_rules:

mertalev commented 5 months ago

What's the output of ls -l /dev/dri? Also, can you set LOG_LEVEL=debug, try again and find the line Available OpenVINO devices: in the ML logs?
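
Besides reading the debug log, the device list can be queried directly through OpenVINO's Python API. This is a rough sketch under assumptions: it is not Immich's own code, the helper names are hypothetical, and it presumes the script is run with a Python interpreter that has OpenVINO installed (for example, the one inside the ML container).

```python
def available_devices():
    """Return OpenVINO's device list, or None if OpenVINO is not importable here."""
    try:
        from openvino import Core  # top-level import in newer releases
    except ImportError:
        try:
            from openvino.runtime import Core  # layout used by older releases
        except ImportError:
            return None
    return list(Core().available_devices)


def has_gpu(devices):
    """True if any device name is 'GPU' or an indexed variant such as 'GPU.0'."""
    return any(name == "GPU" or name.startswith("GPU.") for name in devices)


if __name__ == "__main__":
    devices = available_devices()
    if devices is None:
        print("OpenVINO is not installed in this environment")
    else:
        print("Available OpenVINO devices:", devices)
        print("GPU visible:", has_gpu(devices))
```

If only CPU is listed, the runtime inside the container cannot see the GPU, which matches the "No GPU device found" warning.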

Str1atum commented 5 months ago

ls -l /dev/dri:

total 0
drwxr-xr-x 2 root root        100 May 16 14:49 by-path
crw-rw---- 1 root video  226,   0 May 16 14:49 card0
crw-rw---- 1 root video  226,   1 May 16 14:49 card1
crw-rw---- 1 root render 226, 128 May 16 14:49 renderD128

Screenshot 16 05 2024 um 18 39 04 PM

mertalev commented 5 months ago

Can you also add the output of stat -c "%g" /dev/dri/card* to group_add? I think the OpenVINO image is non-root, so it may not be able to access the devices in /dev/dri.
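
Whether a non-root user can actually open the device nodes can be checked from inside the container. The sketch below is an illustration only: the helper name inaccessible is hypothetical, and it assumes the nodes appear under /dev/dri in the container. It reports which nodes the current user cannot open for both reading and writing.

```python
import glob
import os


def inaccessible(paths):
    """Return the paths the current user cannot open for both reading and writing."""
    return [p for p in paths if not os.access(p, os.R_OK | os.W_OK)]


if __name__ == "__main__":
    nodes = sorted(glob.glob("/dev/dri/card*") + glob.glob("/dev/dri/renderD*"))
    groups = sorted(set(os.getgroups()) | {os.getgid()})
    print("uid:", os.getuid(), "groups:", groups)
    if not nodes:
        print("No device nodes found under /dev/dri")
    for path in inaccessible(nodes):
        print("No read/write access:", path)
```

Any node it lists is a candidate for a missing group_add entry, since the groups shown must include the group that owns the node.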

Str1atum commented 5 months ago

Screenshot 16 05 2024 um 21 50 06 PM

Screenshot 16 05 2024 um 21 50 20 PM

but still the same error

Screenshot 16 05 2024 um 21 50 57 PM

mertalev commented 5 months ago

Try adding this:

    environment:
      - NEOReadDebugKeys=1
      - OverrideGpuAddressSpace=48

Str1atum commented 5 months ago

that seems to have worked

Screenshot 16 05 2024 um 23 55 22 PM

mertalev commented 5 months ago

Sweet! That means it's a bug in intel-compute-runtime.

Str1atum commented 5 months ago

So anything we can do about that or do I just need to keep that compose extension forever?

mertalev commented 5 months ago

We would need the latest patched release for intel-compute-runtime in the image, which will take some time to be available in package managers. You can just keep those envs around until then.

mertalev commented 5 months ago

I decided to add these to the hwaccel.ml.yml file in the meantime: #9541.

vuongtt92 commented 5 months ago

I added this to the openvino section in hwaccel.ml.yml but it still doesn't work. My setup is an Intel N5015 running Xpenology 7.2. Could you please help? The log says "no gpu device found in openvino".

    environment:
      - NEOReadDebugKeys=1
      - OverrideGpuAddressSpace=48

mertalev commented 5 months ago

That fix only applies to kernels 6.7.5 or newer. I don't imagine a Synology server would have such a new kernel. It could be that the kernel is too old instead and/or that you're missing a driver.
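
To see which side of that 6.7.5 threshold a machine falls on, the kernel release string can be compared numerically. This is a small sketch with hypothetical helper names; it assumes the release string starts with a dotted version, as in 6.8.0-31-generic.

```python
import platform
import re


def parse_kernel(release):
    """Extract (major, minor, patch) from a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def kernel_at_least(release, minimum=(6, 7, 5)):
    """True if the release is at or above the minimum version."""
    return parse_kernel(release) >= minimum


if __name__ == "__main__":
    release = platform.release()
    print(release, "is 6.7.5 or newer:", kernel_at_least(release))
```

Containers share the kernel of the machine that runs Docker, so the result is the same inside and outside the container. In a VM it reflects the guest kernel, not the hypervisor's.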

vuongtt92 commented 5 months ago

So there's nothing I can do about it? Hardware acceleration works well with Plex while transcoding, but I've tried many fixes and can't get it to work with Immich ML.

mertalev commented 5 months ago

I can't say for certain, but I wouldn't be surprised if it's impossible without a newer host environment.

But on the other hand, your processor would likely run into other issues with OpenVINO, so you may have had to use CPU instead anyway. I can only recommend OpenVINO for Iris Xe and Arc graphics - anything else is too unreliable from what I've seen.