Open xemxx opened 6 days ago
Is this running in a VM? Can you give some more detail about that?
I have the same problem. PVE with SR-IOV.
[10/25/24 10:18:12] INFO Starting gunicorn 23.0.0
[10/25/24 10:18:12] INFO Listening at: http://[::]:3003 (9)
[10/25/24 10:18:12] INFO Using worker: app.config.CustomUvicornWorker
[10/25/24 10:18:12] INFO Booting worker with pid: 10
[10/25/24 10:18:16] INFO Started server process [10]
[10/25/24 10:18:16] INFO Waiting for application startup.
[10/25/24 10:18:16] INFO Created in-memory cache with unloading after 300s of inactivity.
[10/25/24 10:18:16] INFO Initialized request thread pool with 4 threads.
[10/25/24 10:18:16] INFO Application startup complete.
[10/25/24 10:18:46] INFO Loading detection model 'antelopev2' to memory
[10/25/24 10:18:46] INFO Setting execution providers to ['OpenVINOExecutionProvider', 'CPUExecutionProvider'], in descending order of preference
2024-10-25 10:19:04.571404921 [E:onnxruntime:, inference_session.cc:2045 operator()] Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:85 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(std::shared_ptr<const ov::Model>&, std::__cxx11::string&, ov::AnyMap&, std::__cxx11::string) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0Exception from src/inference/src/cpp/core.cpp:109:
Exception from src/inference/src/dev/plugin.cpp:54:
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_stream.cpp:433:
[GPU] clWaitForEvents, error code: -14
[10/25/24 10:19:04] ERROR Exception in ASGI application
╭─────── Traceback (most recent call last) ───────╮
│ /usr/src/app/main.py:150 in predict │
│ │
│ 147 │ │ inputs = text │
│ 148 │ else: │
│ 149 │ │ raise HTTPException(400, "Either │
│ ❱ 150 │ response = await run_inference(inputs │
│ 151 │ return ORJSONResponse(response) │
│ 152 │
│ 153 │
│ │
│ /usr/src/app/main.py:173 in run_inference │
│ │
│ 170 │ │ response[entry["task"]] = output │
│ 171 │ │
│ 172 │ without_deps, with_deps = entries │
│ ❱ 173 │ await asyncio.gather(*[_run_inference │
│ 174 │ if with_deps: │
│ 175 │ │ await asyncio.gather(*[_run_infer │
│ 176 │ if isinstance(payload, Image): │
│ │
│ /usr/src/app/main.py:167 in _run_inference │
│ │
│ 164 │ │ │ except KeyError: │
│ 165 │ │ │ │ message = f"Task {entry[' │
│ output of {dep}" │
│ 166 │ │ │ │ raise HTTPException(400, │
│ ❱ 167 │ │ model = await load(model) │
│ 168 │ │ output = await run(model.predict, │
│ 169 │ │ outputs[model.identity] = output │
│ 170 │ │ response[entry["task"]] = output │
│ │
│ /usr/src/app/main.py:211 in load │
│ │
│ 208 │ │ return model │
│ 209 │ │
│ 210 │ try: │
│ ❱ 211 │ │ return await run(_load, model) │
│ 212 │ except (OSError, InvalidProtobuf, Bad │
│ 213 │ │ log.warning(f"Failed to load {mod │
│ '{model.model_name}'. Clearing cache.") │
│ 214 │ │ model.clear_cache() │
│ │
│ /usr/src/app/main.py:186 in run │
│ │
│ 183 │ if thread_pool is None: │
│ 184 │ │ return func(*args, **kwargs) │
│ 185 │ partial_func = partial(func, *args, * │
│ ❱ 186 │ return await asyncio.get_running_loop │
│ 187 │
│ 188 │
│ 189 async def load(model: InferenceModel) -> │
│ │
│ /usr/local/lib/python3.11/concurrent/futures/th │
│ read.py:58 in run │
│ │
│ /usr/src/app/main.py:198 in _load │
│ │
│ 195 │ │ │ raise HTTPException(500, f"Fa │
│ 196 │ │ with lock: │
│ 197 │ │ │ try: │
│ ❱ 198 │ │ │ │ model.load() │
│ 199 │ │ │ except FileNotFoundError as e │
│ 200 │ │ │ │ if model.model_format == │
│ 201 │ │ │ │ │ raise e │
│ │
│ /usr/src/app/models/base.py:53 in load │
│ │
│ 50 │ │ self.download() │
│ 51 │ │ attempt = f"Attempt #{self.load_a │
│ else "Loading" │
│ 52 │ │ log.info(f"{attempt} {self.model_ │
│ '{self.model_name}' to memory") │
│ ❱ 53 │ │ self.session = self._load() │
│ 54 │ │ self.loaded = True │
│ 55 │ │
│ 56 │ def predict(self, *inputs: Any, **mod │
│ │
│ /usr/src/app/models/facial_recognition/detectio │
│ n.py:21 in _load │
│ │
│ 18 │ │ super().__init__(model_name, **mod │
│ 19 │ │
│ 20 │ def _load(self) -> ModelSession: │
│ ❱ 21 │ │ session = self._make_session(self. │
│ 22 │ │ self.model = RetinaFace(session=se │
│ 23 │ │ self.model.prepare(ctx_id=0, det_t │
│ 24 │
│ │
│ /usr/src/app/models/base.py:110 in │
│ _make_session │
│ │
│ 107 │ │ │ case ".armnn": │
│ 108 │ │ │ │ session: ModelSession = A │
│ 109 │ │ │ case ".onnx": │
│ ❱ 110 │ │ │ │ session = OrtSession(mode │
│ 111 │ │ │ case _: │
│ 112 │ │ │ │ raise ValueError(f"Unsupp │
│ 113 │ │ return session │
│ │
│ /usr/src/app/sessions/ort.py:28 in __init__ │
│ │
│ 25 │ │ self.providers = providers if pro │
│ 26 │ │ self.provider_options = provider_ │
│ self._provider_options_default │
│ 27 │ │ self.sess_options = sess_options │
│ self._sess_options_default │
│ ❱ 28 │ │ self.session = ort.InferenceSessi │
│ 29 │ │ │ self.model_path.as_posix(), │
│ 30 │ │ │ providers=self.providers, │
│ 31 │ │ │ provider_options=self.provide │
│ │
│ /opt/venv/lib/python3.11/site-packages/onnxrunt │
│ ime/capi/onnxruntime_inference_collection.py:41 │
│ 9 in __init__ │
│ │
│ 416 │ │ disabled_optimizers = kwargs.get │
│ 417 │ │ │
│ 418 │ │ try: │
│ ❱ 419 │ │ │ self._create_inference_sessi │
│ disabled_optimizers) │
│ 420 │ │ except (ValueError, RuntimeError │
│ 421 │ │ │ if self._enable_fallback: │
│ 422 │ │ │ │ try: │
│ │
│ /opt/venv/lib/python3.11/site-packages/onnxrunt │
│ ime/capi/onnxruntime_inference_collection.py:48 │
│ 3 in _create_inference_session │
│ │
│ 480 │ │ │ disabled_optimizers = set(di │
│ 481 │ │ │
│ 482 │ │ # initialize the C++ InferenceSe │
│ ❱ 483 │ │ sess.initialize_session(provider │
│ 484 │ │ │
│ 485 │ │ self._sess = sess │
│ 486 │ │ self._sess_options = self._sess. │
╰─────────────────────────────────────────────────╯
RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:85 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(std::shared_ptr<const ov::Model>&, std::__cxx11::string&, ov::AnyMap&, std::__cxx11::string) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0
Exception from src/inference/src/cpp/core.cpp:109:
Exception from src/inference/src/dev/plugin.cpp:54:
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_stream.cpp:433:
[GPU] clWaitForEvents, error code: -14
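For reference, OpenCL error -14 is CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST, and it is raised while the OpenVINO EP compiles the detection subgraph for the intel_gpu plugin, which suggests the OpenCL compute runtime inside the container cannot drive the SR-IOV virtual function, rather than a problem with the model file itself. A quick way to separate the two is to probe the runtimes directly inside the immich_machine_learning container; a minimal sketch, assuming the OpenVINO Python bindings bundled with the -openvino image are importable:

```python
# Minimal probe, run inside the machine-learning container
# (e.g. `docker exec -it immich_machine_learning python3`).
import onnxruntime as ort

# The OpenVINO EP should be listed if this build supports it.
print("ORT providers:", ort.get_available_providers())

try:
    from openvino import Core          # OpenVINO >= 2023.1
except ImportError:
    from openvino.runtime import Core  # older releases

core = Core()
print("OpenVINO devices:", core.available_devices)  # expect e.g. ['CPU', 'GPU']
for dev in core.available_devices:
    print(dev, "->", core.get_property(dev, "FULL_DEVICE_NAME"))
```

If 'GPU' never shows up in available_devices, the container (or the virtual function exposed to the VM) is not usable by the compute runtime at all; if it does show up but compilation still dies with -14, the SR-IOV guest driver is the more likely culprit.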
Is this running in a VM? Can you give some more detail about that?
Yes, my physical machine runs PVE and is configured with SR-IOV to share the UHD 730 iGPU. I then run a VM with Docker installed to run Immich. This VM also runs Jellyfin, which works fine with hardware acceleration!
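Since Jellyfin transcodes fine in the same VM, the virtual function clearly reaches the guest; what is still worth confirming is that the machine-learning container itself receives the render node and can open it. A small check you could run inside that container (a sketch only; device paths and permissions may differ on your setup):

```python
# List the DRI nodes visible inside the container and check whether the
# current user can open them read/write. A missing /dev/dri, or a render
# node the container user cannot access, are common causes of GPU init
# failures even when the host/VM side looks healthy.
import os
import stat

dri = "/dev/dri"
if not os.path.isdir(dri):
    print("no /dev/dri inside the container - device not passed through")
else:
    for node in sorted(os.listdir(dri)):
        path = os.path.join(dri, node)
        st = os.stat(path)
        ok = os.access(path, os.R_OK | os.W_OK)
        print(f"{path}: mode={oct(stat.S_IMODE(st.st_mode))} "
              f"gid={st.st_gid} {'rw ok' if ok else 'NO ACCESS'}")
```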
cc @mertalev
Hardware transcoding is fine, but hardware-accelerated machine learning with SR-IOV does not work.
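That split matches the difference in code paths: transcoding goes through the media driver (VA-API/QSV), while OpenVINO's GPU plugin goes through the OpenCL compute runtime, so one working on an SR-IOV virtual function does not guarantee the other. The compile step that fails in the log can also be reproduced outside Immich with a throwaway one-op model; a hedged sketch, assuming the onnx and onnxruntime-openvino packages from the image and that "GPU" is an accepted device_type for this EP build:

```python
# Build a trivial Add model and ask the OpenVINO EP to compile it for the GPU,
# the same step that throws "clWaitForEvents, error code: -14" in the log.
import numpy as np
from onnx import TensorProto, helper
import onnxruntime as ort

x = helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 4])
y = helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 4])
graph = helper.make_graph([helper.make_node("Add", ["x", "x"], ["y"])],
                          "gpu_compile_test", [x], [y])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
model.ir_version = 8  # conservative IR version so older ORT builds can load it

sess = ort.InferenceSession(
    model.SerializeToString(),
    providers=["OpenVINOExecutionProvider"],
    provider_options=[{"device_type": "GPU"}],
)
print(sess.run(None, {"x": np.ones((1, 4), dtype=np.float32)}))
```

If this tiny model fails with the same OpenCL error, the problem is in the GPU/OpenCL stack under SR-IOV rather than in the antelopev2 model or Immich itself.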
The bug
I'm trying to enable machine learning for smart search, but the machine-learning container is reporting an error and doesn't work.
The OS that Immich Server is running on
DSM with SR-IOV on PVE 8.2, i3-12300T, 16 GB RAM
Version of Immich Server
v1.118.2
Version of Immich Mobile App
v1.118.2
Platform with the issue
Your docker-compose.yml content
Your .env content
Reproduction steps
Relevant log output
Additional information
My PVE host has 16 GB of RAM and allocates 8 GB to the DSM VM.
I should have enough memory at runtime; here is my Grafana graph.
Before the error in the log output, my GPU was under load.