fwmone commented 6 months ago

The bug

When enabling the immich-machine-learning openvino image, search is not working anymore but throws an HTTP 500 error. The immich-machine-learning log throws:

`[03/29/24 09:48:06] INFO Loading clip model 'ViT-B-32__openai' to memory

2024-03-29 09:48:09.988640928 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()

[03/29/24 09:48:09] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮

                         │ /usr/src/app/main.py:116 in predict             │

                         │                                                 │

                         │   113 │   except orjson.JSONDecodeError:        │

                         │   114 │   │   raise HTTPException(400, f"Invali │

                         │   115 │                                         │

                         │ ❱ 116 │   model = await load(await model_cache. │

                         │       ttl=settings.model_ttl, **kwargs))        │

                         │   117 │   model.configure(**kwargs)             │

                         │   118 │   outputs = await run(model.predict, in │

                         │   119 │   return ORJSONResponse(outputs)        │

                         │                                                 │

                         │ /usr/src/app/main.py:137 in load                │

                         │                                                 │

                         │   134 │   │   │   model.load()                  │

                         │   135 │                                         │

                         │   136 │   try:                                  │

                         │ ❱ 137 │   │   await run(_load, model)           │

                         │   138 │   │   return model                      │

                         │   139 │   except (OSError, InvalidProtobuf, Bad │

                         │   140 │   │   log.warning(                      │

                         │                                                 │

                         │ /usr/src/app/main.py:125 in run                 │

                         │                                                 │

                         │   122 async def run(func: Callable[..., Any], i │

                         │   123 │   if thread_pool is None:               │

                         │   124 │   │   return func(inputs)               │

                         │ ❱ 125 │   return await asyncio.get_running_loop │

                         │   126                                           │

                         │   127                                           │

                         │   128 async def load(model: InferenceModel) ->  │

                         │                                                 │

                         │ /usr/lib/python3.10/concurrent/futures/thread.p │

                         │ y:58 in run                                     │

                         │                                                 │

                         │ /usr/src/app/main.py:134 in _load               │

                         │                                                 │

                         │   131 │                                         │

                         │   132 │   def _load(model: InferenceModel) -> N │

                         │   133 │   │   with lock:                        │

                         │ ❱ 134 │   │   │   model.load()                  │

                         │   135 │                                         │

                         │   136 │   try:                                  │

                         │   137 │   │   await run(_load, model)           │

                         │                                                 │

                         │ /usr/src/app/models/base.py:52 in load          │

                         │                                                 │

                         │    49 │   │   │   return                        │

                         │    50 │   │   self.download()                   │

                         │    51 │   │   log.info(f"Loading {self.model_ty │

                         │       to memory")                               │

                         │ ❱  52 │   │   self._load()                      │

                         │    53 │   │   self.loaded = True                │

                         │    54 │                                         │

                         │    55 │   def predict(self, inputs: Any, **mode │

                         │                                                 │

                         │ /usr/src/app/models/clip.py:146 in _load        │

                         │                                                 │

                         │   143 │   │   super().__init__(clean_name(model │

                         │   144 │                                         │

                         │   145 │   def _load(self) -> None:              │

                         │ ❱ 146 │   │   super()._load()                   │

                         │   147 │   │   self._load_tokenizer()            │

                         │   148 │   │                                     │

                         │   149 │   │   size: list[int] | int = self.prep │

                         │                                                 │

                         │ /usr/src/app/models/clip.py:36 in _load         │

                         │                                                 │

                         │    33 │   def _load(self) -> None:              │

                         │    34 │   │   if self.mode == "text" or self.mo │

                         │    35 │   │   │   log.debug(f"Loading clip text │

                         │ ❱  36 │   │   │   self.text_model = self._make_ │

                         │    37 │   │   │   log.debug(f"Loaded clip text  │

                         │    38 │   │                                     │

                         │    39 │   │   if self.mode == "vision" or self. │

                         │                                                 │

                         │ /usr/src/app/models/base.py:117 in              │

                         │ _make_session                                   │

                         │                                                 │

                         │   114 │   │   │   case ".armnn":                │

                         │   115 │   │   │   │   session = AnnSession(mode │

                         │   116 │   │   │   case ".onnx":                 │

                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │

                         │   118 │   │   │   │   │   model_path.as_posix() │

                         │   119 │   │   │   │   │   sess_options=self.ses │

                         │   120 │   │   │   │   │   providers=self.provid │

                         │                                                 │

                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │

                         │ ime/capi/onnxruntime_inference_collection.py:41 │

                         │ 9 in __init__                                   │

                         │                                                 │

                         │    416 │   │   disabled_optimizers = kwargs["di │

                         │        kwargs else None                         │

                         │    417 │   │                                    │

                         │    418 │   │   try:                             │

                         │ ❱  419 │   │   │   self._create_inference_sessi │

                         │        disabled_optimizers)                     │

                         │    420 │   │   except (ValueError, RuntimeError │

                         │    421 │   │   │   if self._enable_fallback:    │

                         │    422 │   │   │   │   try:                     │

                         │                                                 │

                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │

                         │ ime/capi/onnxruntime_inference_collection.py:48 │

                         │ 3 in _create_inference_session                  │

                         │                                                 │

                         │    480 │   │   │   disabled_optimizers = set(di │

                         │    481 │   │                                    │

                         │    482 │   │   # initialize the C++ InferenceSe │

                         │ ❱  483 │   │   sess.initialize_session(provider │

                         │    484 │   │                                    │

                         │    485 │   │   self._sess = sess                │

                         │    486 │   │   self._sess_options = self._sess. │

                         ╰─────────────────────────────────────────────────╯

                         RuntimeException: [ONNXRuntimeError] : 6 :         

                         RUNTIME_EXCEPTION : Encountered unknown exception  

                         in Initialize()                                    `

Search works well without openvino enabled.

My hwaccel.ml.yml:

`version: "3.8"

Configurations for hardware-accelerated machine learning

If using Unraid or another platform that doesn't allow multiple Compose files,

you can inline the config for a backend by copying its contents

into the immich-machine-learning service in the docker-compose.yml file.

See https://immich.app/docs/features/ml-hardware-acceleration for info on usage.

services: armnn: devices:

/dev/mali0:/dev/mali0 volumes:
/lib/firmware/mali_csffw.bin:/lib/firmware/mali_csffw.bin:ro # Mali firmware for your chipset (not always required depending on the driver)
/usr/lib/libmali.so:/usr/lib/libmali.so:ro # Mali driver for your chipset (always required)

cpu:

cuda: deploy: resources: reservations: devices:
- driver: nvidia count: 1 capabilities:
  - gpu
  - compute
  - video
openvino: device_cgroup_rules:
"c 189:* rmw" devices:
/dev/dri:/dev/dri volumes:
/dev/bus/usb:/dev/bus/usb

openvino-wsl: devices:
/dev/dri:/dev/dri
/dev/dxg:/dev/dxg volumes:
/dev/bus/usb:/dev/bus/usb
/usr/lib/wsl:/usr/lib/wsl `

I run an Asustor AS6702T NAS with an Intel Celeron N5105 CPU. This problem started with v1.99.0.

The OS that Immich Server is running on

ADM 4.2.6RPI1 (Asustor NAS)

Version of Immich Server

v1.100.0

Version of Immich Mobile App

(not used)

Platform with the issue

[X] Server
[X] Web
[ ] Mobile

Your docker-compose.yml content

version: "3.8"

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "immich" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - ${EXTERNAL_LIBRARY}://mnt/media/external_library:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends:
      file: hwaccel.yml
      service: hwaccel
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - ${EXTERNAL_LIBRARY}://mnt/media/external_library:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino
    # image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: openvino # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:c5a607fb6e1bb15d32bbcf14db22787d19e428d59e31a5da67511b49bb0f1ccc
    restart: always

  database:
    container_name: immich_postgres
    image: tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

volumes:
  pgdata:
  model-cache:

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=/volume1/Docker/immich/upload_location

EXTERNAL_LIBRARY=/volume2/...

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

REDIS_HOSTNAME=immich_redis

Reproduction steps

1. Enable the openvino machine learning image / container
2. Try to search something using the web interface

Additional information

No response

aviv926 commented 6 months ago

It seems that the hwaccel.ml.yml file version Out of date, try downloading the newest one from the release page and check accordingly.

fwmone commented 6 months ago

Thanks! Yes, you were right, my config files were outdated - sorry for that. I updated all of them, but the problem still stays.

docker-compose.yml:

`version: "3.8"

#

WARNING: Make sure to use the docker-compose.yml of the current release:

#

https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml

#

The compose file on main may not be compatible with the latest release.

#

name: immich

services: immich-server: container_name: immich_server image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release} command: ['start.sh', 'immich'] volumes:

${UPLOAD_LOCATION}:/usr/src/app/upload
/etc/localtime:/etc/localtime:ro
${EXTERNAL_LIBRARY}://mnt/media/external_library:ro env_file:
.env ports:
2283:3001 depends_on:
redis
database restart: always

immich-microservices: container_name: immich_microservices image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release} extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/hardware-transcoding file: hwaccel.transcoding.yml service: quicksync # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding command: ['start.sh', 'microservices'] volumes:
${UPLOAD_LOCATION}:/usr/src/app/upload
/etc/localtime:/etc/localtime:ro
${EXTERNAL_LIBRARY}://mnt/media/external_library:ro env_file:
.env depends_on:
redis
database restart: always

immich-machine-learning: container_name: immich_machine_learning

For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.

Example tag: ${IMMICH_VERSION:-release}-cuda

image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino

image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}

extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration file: hwaccel.ml.yml service: openvino # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the -wsl version for WSL2 where applicable volumes:
model-cache:/cache env_file:
.env restart: always

redis: container_name: immich_redis image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:51d6c56749a4243096327e3fb964a48ed92254357108449cb6e23999c37773c5 restart: always

database: container_name: immich_postgres image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0 env_file:
.env environment: POSTGRES_PASSWORD: ${DB_PASSWORD} POSTGRES_USER: ${DB_USERNAME} POSTGRES_DB: ${DB_DATABASE_NAME} volumes:
pgdata:/var/lib/postgresql/data restart: always

volumes: pgdata: model-cache: `

hwaccel.ml.yml:

`version: "3.8"

Configurations for hardware-accelerated machine learning

If using Unraid or another platform that doesn't allow multiple Compose files,

you can inline the config for a backend by copying its contents

into the immich-machine-learning service in the docker-compose.yml file.

See https://immich.app/docs/features/ml-hardware-acceleration for info on usage.

services: armnn: devices:

/dev/mali0:/dev/mali0 volumes:
/lib/firmware/mali_csffw.bin:/lib/firmware/mali_csffw.bin:ro # Mali firmware for your chipset (not always required depending on the driver)
/usr/lib/libmali.so:/usr/lib/libmali.so:ro # Mali driver for your chipset (always required)

cpu: {}

cuda: deploy: resources: reservations: devices:
- driver: nvidia count: 1 capabilities:
  - gpu
  - compute
  - video
openvino: device_cgroup_rules:
"c 189:* rmw" devices:
/dev/dri:/dev/dri volumes:
/dev/bus/usb:/dev/bus/usb

openvino-wsl: devices:
/dev/dri:/dev/dri
/dev/dxg:/dev/dxg volumes:
/dev/bus/usb:/dev/bus/usb
/usr/lib/wsl:/usr/lib/wsl `

Log:

`[03/29/24 16:57:34] INFO Starting gunicorn 21.2.0

[03/29/24 16:57:34] INFO Using worker: app.config.CustomUvicornWorker
[03/29/24 16:57:34] INFO Booting worker with pid: 13
[03/29/24 16:57:41] INFO Started server process [13]
[03/29/24 16:57:41] INFO Waiting for application startup.
[03/29/24 16:57:41] INFO Created in-memory cache with unloading after 300s
of inactivity.
[03/29/24 16:57:41] INFO Initialized request thread pool with 4 threads.
[03/29/24 16:57:41] INFO Application startup complete.
[03/29/24 16:59:38] INFO Setting 'ViT-B-32openai' execution providers to
['OpenVINOExecutionProvider',
'CPUExecutionProvider'], in descending order of
preference
[03/29/24 16:59:38] INFO Loading clip model 'ViT-B-32openai' to memory
2024-03-29 16:59:42.034430019 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize() [03/29/24 16:59:42] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮
                         │ /usr/src/app/main.py:116 in predict             │
                         │                                                 │
                         │   113 │   except orjson.JSONDecodeError:        │
                         │   114 │   │   raise HTTPException(400, f"Invali │
                         │   115 │                                         │
                         │ ❱ 116 │   model = await load(await model_cache. │
                         │       ttl=settings.model_ttl, **kwargs))        │
                         │   117 │   model.configure(**kwargs)             │
                         │   118 │   outputs = await run(model.predict, in │
                         │   119 │   return ORJSONResponse(outputs)        │
                         │                                                 │
                         │ /usr/src/app/main.py:137 in load                │
                         │                                                 │
                         │   134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │ ❱ 137 │   │   await run(_load, model)           │
                         │   138 │   │   return model                      │
                         │   139 │   except (OSError, InvalidProtobuf, Bad │
                         │   140 │   │   log.warning(                      │
                         │                                                 │
                         │ /usr/src/app/main.py:125 in run                 │
                         │                                                 │
                         │   122 async def run(func: Callable[..., Any], i │
                         │   123 │   if thread_pool is None:               │
                         │   124 │   │   return func(inputs)               │
                         │ ❱ 125 │   return await asyncio.get_running_loop │
                         │   126                                           │
                         │   127                                           │
                         │   128 async def load(model: InferenceModel) ->  │
                         │                                                 │
                         │ /usr/lib/python3.10/concurrent/futures/thread.p │
                         │ y:58 in run                                     │
                         │                                                 │
                         │ /usr/src/app/main.py:134 in _load               │
                         │                                                 │
                         │   131 │                                         │
                         │   132 │   def _load(model: InferenceModel) -> N │
                         │   133 │   │   with lock:                        │
                         │ ❱ 134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │   137 │   │   await run(_load, model)           │
                         │                                                 │
                         │ /usr/src/app/models/base.py:52 in load          │
                         │                                                 │
                         │    49 │   │   │   return                        │
                         │    50 │   │   self.download()                   │
                         │    51 │   │   log.info(f"Loading {self.model_ty │
                         │       to memory")                               │
                         │ ❱  52 │   │   self._load()                      │
                         │    53 │   │   self.loaded = True                │
                         │    54 │                                         │
                         │    55 │   def predict(self, inputs: Any, **mode │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:146 in _load        │
                         │                                                 │
                         │   143 │   │   super().__init__(clean_name(model │
                         │   144 │                                         │
                         │   145 │   def _load(self) -> None:              │
                         │ ❱ 146 │   │   super()._load()                   │
                         │   147 │   │   self._load_tokenizer()            │
                         │   148 │   │                                     │
                         │   149 │   │   size: list[int] | int = self.prep │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:36 in _load         │
                         │                                                 │
                         │    33 │   def _load(self) -> None:              │
                         │    34 │   │   if self.mode == "text" or self.mo │
                         │    35 │   │   │   log.debug(f"Loading clip text │
                         │ ❱  36 │   │   │   self.text_model = self._make_ │
                         │    37 │   │   │   log.debug(f"Loaded clip text  │
                         │    38 │   │                                     │
                         │    39 │   │   if self.mode == "vision" or self. │
                         │                                                 │
                         │ /usr/src/app/models/base.py:117 in              │
                         │ _make_session                                   │
                         │                                                 │
                         │   114 │   │   │   case ".armnn":                │
                         │   115 │   │   │   │   session = AnnSession(mode │
                         │   116 │   │   │   case ".onnx":                 │
                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                         │   118 │   │   │   │   │   model_path.as_posix() │
                         │   119 │   │   │   │   │   sess_options=self.ses │
                         │   120 │   │   │   │   │   providers=self.provid │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:41 │
                         │ 9 in __init__                                   │
                         │                                                 │
                         │    416 │   │   disabled_optimizers = kwargs["di │
                         │        kwargs else None                         │
                         │    417 │   │                                    │
                         │    418 │   │   try:                             │
                         │ ❱  419 │   │   │   self._create_inference_sessi │
                         │        disabled_optimizers)                     │
                         │    420 │   │   except (ValueError, RuntimeError │
                         │    421 │   │   │   if self._enable_fallback:    │
                         │    422 │   │   │   │   try:                     │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:48 │
                         │ 3 in _create_inference_session                  │
                         │                                                 │
                         │    480 │   │   │   disabled_optimizers = set(di │
                         │    481 │   │                                    │
                         │    482 │   │   # initialize the C++ InferenceSe │
                         │ ❱  483 │   │   sess.initialize_session(provider │
                         │    484 │   │                                    │
                         │    485 │   │   self._sess = sess                │
                         │    486 │   │   self._sess_options = self._sess. │
                         ╰─────────────────────────────────────────────────╯
                         RuntimeException: [ONNXRuntimeError] : 6 :         
                         RUNTIME_EXCEPTION : Encountered unknown exception  
                         in Initialize()

[03/29/24 17:00:25] INFO Loading clip model 'ViT-B-32__openai' to memory
2024-03-29 17:00:28.560632989 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize() [03/29/24 17:00:28] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮
                         │ /usr/src/app/main.py:116 in predict             │
                         │                                                 │
                         │   113 │   except orjson.JSONDecodeError:        │
                         │   114 │   │   raise HTTPException(400, f"Invali │
                         │   115 │                                         │
                         │ ❱ 116 │   model = await load(await model_cache. │
                         │       ttl=settings.model_ttl, **kwargs))        │
                         │   117 │   model.configure(**kwargs)             │
                         │   118 │   outputs = await run(model.predict, in │
                         │   119 │   return ORJSONResponse(outputs)        │
                         │                                                 │
                         │ /usr/src/app/main.py:137 in load                │
                         │                                                 │
                         │   134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │ ❱ 137 │   │   await run(_load, model)           │
                         │   138 │   │   return model                      │
                         │   139 │   except (OSError, InvalidProtobuf, Bad │
                         │   140 │   │   log.warning(                      │
                         │                                                 │
                         │ /usr/src/app/main.py:125 in run                 │
                         │                                                 │
                         │   122 async def run(func: Callable[..., Any], i │
                         │   123 │   if thread_pool is None:               │
                         │   124 │   │   return func(inputs)               │
                         │ ❱ 125 │   return await asyncio.get_running_loop │
                         │   126                                           │
                         │   127                                           │
                         │   128 async def load(model: InferenceModel) ->  │
                         │                                                 │
                         │ /usr/lib/python3.10/concurrent/futures/thread.p │
                         │ y:58 in run                                     │
                         │                                                 │
                         │ /usr/src/app/main.py:134 in _load               │
                         │                                                 │
                         │   131 │                                         │
                         │   132 │   def _load(model: InferenceModel) -> N │
                         │   133 │   │   with lock:                        │
                         │ ❱ 134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │   137 │   │   await run(_load, model)           │
                         │                                                 │
                         │ /usr/src/app/models/base.py:52 in load          │
                         │                                                 │
                         │    49 │   │   │   return                        │
                         │    50 │   │   self.download()                   │
                         │    51 │   │   log.info(f"Loading {self.model_ty │
                         │       to memory")                               │
                         │ ❱  52 │   │   self._load()                      │
                         │    53 │   │   self.loaded = True                │
                         │    54 │                                         │
                         │    55 │   def predict(self, inputs: Any, **mode │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:146 in _load        │
                         │                                                 │
                         │   143 │   │   super().__init__(clean_name(model │
                         │   144 │                                         │
                         │   145 │   def _load(self) -> None:              │
                         │ ❱ 146 │   │   super()._load()                   │
                         │   147 │   │   self._load_tokenizer()            │
                         │   148 │   │                                     │
                         │   149 │   │   size: list[int] | int = self.prep │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:36 in _load         │
                         │                                                 │
                         │    33 │   def _load(self) -> None:              │
                         │    34 │   │   if self.mode == "text" or self.mo │
                         │    35 │   │   │   log.debug(f"Loading clip text │
                         │ ❱  36 │   │   │   self.text_model = self._make_ │
                         │    37 │   │   │   log.debug(f"Loaded clip text  │
                         │    38 │   │                                     │
                         │    39 │   │   if self.mode == "vision" or self. │
                         │                                                 │
                         │ /usr/src/app/models/base.py:117 in              │
                         │ _make_session                                   │
                         │                                                 │
                         │   114 │   │   │   case ".armnn":                │
                         │   115 │   │   │   │   session = AnnSession(mode │
                         │   116 │   │   │   case ".onnx":                 │
                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                         │   118 │   │   │   │   │   model_path.as_posix() │
                         │   119 │   │   │   │   │   sess_options=self.ses │
                         │   120 │   │   │   │   │   providers=self.provid │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:41 │
                         │ 9 in __init__                                   │
                         │                                                 │
                         │    416 │   │   disabled_optimizers = kwargs["di │
                         │        kwargs else None                         │
                         │    417 │   │                                    │
                         │    418 │   │   try:                             │
                         │ ❱  419 │   │   │   self._create_inference_sessi │
                         │        disabled_optimizers)                     │
                         │    420 │   │   except (ValueError, RuntimeError │
                         │    421 │   │   │   if self._enable_fallback:    │
                         │    422 │   │   │   │   try:                     │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:48 │
                         │ 3 in _create_inference_session                  │
                         │                                                 │
                         │    480 │   │   │   disabled_optimizers = set(di │
                         │    481 │   │                                    │
                         │    482 │   │   # initialize the C++ InferenceSe │
                         │ ❱  483 │   │   sess.initialize_session(provider │
                         │    484 │   │                                    │
                         │    485 │   │   self._sess = sess                │
                         │    486 │   │   self._sess_options = self._sess. │
                         ╰─────────────────────────────────────────────────╯
                         RuntimeException: [ONNXRuntimeError] : 6 :         
                         RUNTIME_EXCEPTION : Encountered unknown exception  
                         in Initialize()                                    `

mertalev commented 6 months ago

1.99 updated to a newer version of OpenVINO. This actually fixed smart search for most users from what I've seen, so interesting that it broke it for you. Unfortunately, there isn't much I can do here. It's surprisingly difficult to make OpenVINO work for everyone.

reef-actor commented 5 months ago

I am seeing this issue on my humble Intel J5005 (Gemini Lake). This is a fresh install as of yesterday evening

Installed and configured following the docs
Enabled OpenVino
Added a single RO external library

I have over 10GB unallocated gunicorn appears in intel_gpu_top but no usage

OpenVino inference is working nicely for Frigate NVR, but perhaps this model is too heavy for the iGPU? I'm happy to wipe/experiment with my installation if I can help in any way.

The log below is for a single face detection attempt (concurrency = 1)

immich_machine_learning logs

[04/23/24 10:13:26] INFO Starting gunicorn 22.0.0 [04/23/24 10:13:26] INFO Listening at: http://[::]:3003 (9) [04/23/24 10:13:26] INFO Using worker: app.config.CustomUvicornWorker [04/23/24 10:13:26] INFO Booting worker with pid: 13 [04/23/24 10:13:27] DEBUG Could not load ANN shared libraries, using ONNX: libmali.so: cannot open shared object file: No such file or directory [04/23/24 10:13:34] INFO Started server process [13] [04/23/24 10:13:34] INFO Waiting for application startup. [04/23/24 10:13:34] INFO Created in-memory cache with unloading after 300s of inactivity. [04/23/24 10:13:34] INFO Initialized request thread pool with 4 threads. [04/23/24 10:13:34] DEBUG Checking for inactivity... [04/23/24 10:13:34] INFO Application startup complete. [04/23/24 10:13:44] DEBUG Checking for inactivity... [04/23/24 10:13:52] DEBUG Available ORT providers: {'CPUExecutionProvider', 'OpenVINOExecutionProvider'} [04/23/24 10:13:52] DEBUG Available OpenVINO devices: ['CPU', 'GPU'] [04/23/24 10:13:52] INFO Setting 'buffalo_l' execution providers to ['OpenVINOExecutionProvider', 'CPUExecutionProvider'], in descending order of preference [04/23/24 10:13:52] DEBUG Setting execution provider options to [{'device_type': 'GPU_FP32', 'cache_dir': '/cache/facial-recognition/buffalo_l/openvino'}, {'arena_extend_strategy': 'kSameAsRequested'}] [04/23/24 10:13:52] DEBUG Setting execution_mode to ORT_SEQUENTIAL [04/23/24 10:13:52] DEBUG Setting inter_op_num_threads to 0 [04/23/24 10:13:52] DEBUG Setting intra_op_num_threads to 0 [04/23/24 10:13:52] DEBUG Setting preferred runtime to onnx [04/23/24 10:13:52] INFO Loading facial recognition model 'buffalo_l' to memory 2024-04-23 10:13:53.433384806 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize() [04/23/24 10:13:53] ERROR Exception in ASGI application ╭─────── Traceback (most recent call last) ───────╮ │ /usr/src/app/main.py:116 in predict │ │ │ │ 113 │ except orjson.JSONDecodeError: │ │ 114 │ │ raise HTTPException(400, f"Invali │ │ 115 │ │ │ ❱ 116 │ model = await load(await model_cache. │ │ ttl=settings.model_ttl, **kwargs)) │ │ 117 │ model.configure(**kwargs) │ │ 118 │ outputs = await run(model.predict, in │ │ 119 │ return ORJSONResponse(outputs) │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ image = UploadFile(filename='blob', │ │ │ │ size=491485, │ │ │ │ headers=Headers({'content-dis… │ │ │ │ 'form-data; name="image"; │ │ │ │ filename="blob"', │ │ │ │ 'content-type': │ │ │ │ 'application/octet-stream'})) │ │ │ │ inputs = b'\xff\xd8\xff\xe2\x01\xf0ICC… │ │ │ │ \x00\x00mntrRGB XYZ │ │ │ │ \x07\xe2\x00\x03\x00\x14\x00\… │ │ │ │ kwargs = { │ │ │ │ │ 'minScore': 0.7, │ │ │ │ │ 'maxDistance': 0.5, │ │ │ │ │ 'minFaces': 3 │ │ │ │ } │ │ │ │ model_name = 'buffalo_l' │ │ │ │ model_type = │ │ │ │ options = '{"minScore":0.7,"maxDistance… │ │ │ │ text = None │ │ │ ╰─────────────────────────────────────────────╯ │ │ │ │ /usr/src/app/main.py:137 in load │ │ │ │ 134 │ │ │ model.load() │ │ 135 │ │ │ 136 │ try: │ │ ❱ 137 │ │ await run(_load, model) │ │ 138 │ │ return model │ │ 139 │ except (OSError, InvalidProtobuf, Bad │ │ 140 │ │ log.warning( │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ _load = ._load at │ │ │ │ 0x7f86807c0af0> │ │ │ │ model = │ │ │ ╰─────────────────────────────────────────────╯ │ │ │ │ /usr/src/app/main.py:125 in run │ │ │ │ 122 async def run(func: Callable[..., Any], i │ │ 123 │ if thread_pool is None: │ │ 124 │ │ return func(inputs) │ │ ❱ 125 │ return await asyncio.get_running_loop │ │ 126 │ │ 127 │ │ 128 async def load(model: InferenceModel) -> │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ func = ._load at │ │ │ │ 0x7f86807c0af0> │ │ │ │ inputs = │ │ │ ╰─────────────────────────────────────────────╯ │ │ │ │ /usr/lib/python3.10/concurrent/futures/thread.p │ │ y:58 in run │ │ │ │ /usr/src/app/main.py:134 in _load │ │ │ │ 131 │ │ │ 132 │ def _load(model: InferenceModel) -> N │ │ 133 │ │ with lock: │ │ ❱ 134 │ │ │ model.load() │ │ 135 │ │ │ 136 │ try: │ │ 137 │ │ await run(_load, model) │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ model = │ │ │ ╰─────────────────────────────────────────────╯ │ │ │ │ /usr/src/app/models/base.py:52 in load │ │ │ │ 49 │ │ │ return │ │ 50 │ │ self.download() │ │ 51 │ │ log.info(f"Loading {self.model_ty │ │ to memory") │ │ ❱ 52 │ │ self._load() │ │ 53 │ │ self.loaded = True │ │ 54 │ │ │ 55 │ def predict(self, inputs: Any, **mode │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ self = │ │ │ ╰─────────────────────────────────────────────╯ │ │ │ │ /usr/src/app/models/facial_recognition.py:30 in │ │ _load │ │ │ │ 27 │ │ super().__init__(clean_name(model_ │ │ 28 │ │ │ 29 │ def _load(self) -> None: │ │ ❱ 30 │ │ self.det_model = RetinaFace(sessio │ │ 31 │ │ self.rec_model = ArcFaceONNX( │ │ 32 │ │ │ self.rec_file.with_suffix(".on │ │ 33 │ │ │ session=self._make_session(sel │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ self = │ │ │ ╰─────────────────────────────────────────────╯ │ │ │ │ /usr/src/app/models/base.py:117 in │ │ _make_session │ │ │ │ 114 │ │ │ case ".armnn": │ │ 115 │ │ │ │ session = AnnSession(mode │ │ 116 │ │ │ case ".onnx": │ │ ❱ 117 │ │ │ │ session = ort.InferenceSe │ │ 118 │ │ │ │ │ model_path.as_posix() │ │ 119 │ │ │ │ │ sess_options=self.ses │ │ 120 │ │ │ │ │ providers=self.provid │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ model_path = PosixPath('/cache/facial-reco… │ │ │ │ self = │ │ │ ╰─────────────────────────────────────────────╯ │ │ │ │ /opt/venv/lib/python3.10/site-packages/onnxrunt │ │ ime/capi/onnxruntime_inference_collection.py:41 │ │ 9 in __init__ │ │ │ │ 416 │ │ disabled_optimizers = kwargs["di │ │ kwargs else None │ │ 417 │ │ │ │ 418 │ │ try: │ │ ❱ 419 │ │ │ self._create_inference_sessi │ │ disabled_optimizers) │ │ 420 │ │ except (ValueError, RuntimeError │ │ 421 │ │ │ if self._enable_fallback: │ │ 422 │ │ │ │ try: │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ disabled_optimizers = None │ │ │ │ kwargs = {} │ │ │ │ path_or_bytes = '/cache/facial-recog… │ │ │ │ provider_options = [ │ │ │ │ │ { │ │ │ │ │ │ │ │ │ │ 'device_type': │ │ │ │ 'GPU_FP32', │ │ │ │ │ │ 'cache_dir': │ │ │ │ '/cache/facial-recog… │ │ │ │ │ }, │ │ │ │ │ { │ │ │ │ │ │ │ │ │ │ 'arena_extend_strate… │ │ │ │ 'kSameAsRequested' │ │ │ │ │ } │ │ │ │ ] │ │ │ │ providers = [ │ │ │ │ │ │ │ │ │ 'OpenVINOExecutionPr… │ │ │ │ │ │ │ │ │ 'CPUExecutionProvide… │ │ │ │ ] │ │ │ │ self = │ │ │ │ sess_options = │ │ │ ╰─────────────────────────────────────────────╯ │ │ │ │ /opt/venv/lib/python3.10/site-packages/onnxrunt │ │ ime/capi/onnxruntime_inference_collection.py:48 │ │ 3 in _create_inference_session │ │ │ │ 480 │ │ │ disabled_optimizers = set(di │ │ 481 │ │ │ │ 482 │ │ # initialize the C++ InferenceSe │ │ ❱ 483 │ │ sess.initialize_session(provider │ │ 484 │ │ │ │ 485 │ │ self._sess = sess │ │ 486 │ │ self._sess_options = self._sess. │ │ │ │ ╭────────────────── locals ───────────────────╮ │ │ │ available_providers = [ │ │ │ │ │ │ │ │ │ 'OpenVINOExecutionPr… │ │ │ │ │ │ │ │ │ 'CPUExecutionProvide… │ │ │ │ ] │ │ │ │ disabled_optimizers = set() │ │ │ │ provider_options = [ │ │ │ │ │ { │ │ │ │ │ │ │ │ │ │ 'device_type': │ │ │ │ 'GPU_FP32', │ │ │ │ │ │ 'cache_dir': │ │ │ │ '/cache/facial-recog… │ │ │ │ │ }, │ │ │ │ │ { │ │ │ │ │ │ │ │ │ │ 'arena_extend_strate… │ │ │ │ 'kSameAsRequested' │ │ │ │ │ } │ │ │ │ ] │ │ │ │ providers = [ │ │ │ │ │ │ │ │ │ 'OpenVINOExecutionPr… │ │ │ │ │ │ │ │ │ 'CPUExecutionProvide… │ │ │ │ ] │ │ │ │ self = │ │ │ │ sess = │ │ │ │ session_options = │ │ │ ╰─────────────────────────────────────────────╯ │ ╰─────────────────────────────────────────────────╯ RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Encountered unknown exception in Initialize()

omltcat commented 5 months ago

I am on N5105 as well, with 16GB RAM. Can confirm having the exact same error when running 1.99.0 above. If switching to immich-machine-learning:v1.98.2-openvino everything works.

With MACHINE_LEARNING_PRELOAD__CLIP=ViT-B-32__openai, the error shows immediately after start:

immich-machine-learning logs

```python [04/24/24 20:57:02] INFO Loading clip model 'ViT-B-32__openai' to memory 2024-04-24 20:57:03.939642959 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize() [04/24/24 20:57:04] ERROR Traceback (most recent call last): File "/opt/venv/lib/python3.10/site-packages/starlette/r outing.py", line 734, in lifespan async with self.lifespan_context(app) as maybe_state: File "/usr/lib/python3.10/contextlib.py", line 199, in __aenter__ return await anext(self.gen) File "/usr/src/app/main.py", line 55, in lifespan await preload_models(settings.preload) File "/usr/src/app/main.py", line 69, in preload_models await load(await model_cache.get(preload_models.clip, ModelType.CLIP)) File "/usr/src/app/main.py", line 137, in load await run(_load, model) File "/usr/src/app/main.py", line 125, in run return await asyncio.get_running_loop().run_in_executor(thread_p ool, func, inputs) File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) File "/usr/src/app/main.py", line 134, in _load model.load() File "/usr/src/app/models/base.py", line 52, in load self._load() File "/usr/src/app/models/clip.py", line 146, in _load super()._load() File "/usr/src/app/models/clip.py", line 36, in _load self.text_model = self._make_session(self.textual_path) File "/usr/src/app/models/base.py", line 117, in _make_session session = ort.InferenceSession( File "/opt/venv/lib/python3.10/site-packages/onnxruntime /capi/onnxruntime_inference_collection.py", line 419, in __init__ self._create_inference_session(providers, provider_options, disabled_optimizers) File "/opt/venv/lib/python3.10/site-packages/onnxruntime /capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session sess.initialize_session(providers, provider_options, disabled_optimizers) onnxruntime.capi.onnxruntime_pybind11_state.Runtime Exception: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Encountered unknown exception in Initialize() ```

stefano99 commented 3 months ago

I'm getting the same error on a i5 6200u (gpu hd520), with Immich version 1.104.6, and since V1.100 (the first time I tried using the gpu for hw acceleration) Transcoding works fine.

It's running in a privileged LXC on the latest version of proxmox, 8gb of ram dedicated.

I'm not able, though, to try the 1.98.2 version of the machine learning container because it gives me an error on the server container:

server container error with 1.98.2 machine learning container

[Nest] 7 - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Unable to run job handler (faceDetection/face-detection): Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"buffalo_l","options":{"minScore":0.7}},"recognition":{"modelName":"buffalo_l"}}}' failed with status 422: Unprocessable Entity [Nest] 7 - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"buffalo_l","options":{"minScore":0.7}},"recognition":{"modelName":"buffalo_l"}}}' failed with status 422: Unprocessable Entity at MachineLearningRepository.predict (/usr/src/app/dist/repositories/machine-learning.repository.js:22:19) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async MachineLearningRepository.detectFaces (/usr/src/app/dist/repositories/machine-learning.repository.js:33:26) at async PersonService.handleDetectFaces (/usr/src/app/dist/services/person.service.js:274:52) at async /usr/src/app/dist/services/job.service.js:148:36 at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:394:28) at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24) [Nest] 7 - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Object: { "id": "2c44089e-a5ab-4b47-894e-21ad138371b0" } [Nest] 17 - 06/14/2024, 11:33:53 AM LOG [Api:EventRepository] Websocket Disconnect: yaBQqD1_Clm1qZRGAAAJ

mertalev commented 2 months ago

This should be fixed as of the current release. Be sure to delete the model cache volume so it downloads the updated models.

fwmone commented 2 months ago

Looks good to me - thanks a bunch!

immich-app / immich

Search not working when using OpenVINO #8353

The bug

Configurations for hardware-accelerated machine learning

If using Unraid or another platform that doesn't allow multiple Compose files,

you can inline the config for a backend by copying its contents

into the immich-machine-learning service in the docker-compose.yml file.

See https://immich.app/docs/features/ml-hardware-acceleration for info on usage.

The OS that Immich Server is running on

Version of Immich Server

Version of Immich Mobile App

Platform with the issue

Your docker-compose.yml content

Your .env content

Reproduction steps

Additional information

WARNING: Make sure to use the docker-compose.yml of the current release:

https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml

The compose file on main may not be compatible with the latest release.

For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.

Example tag: ${IMMICH_VERSION:-release}-cuda

image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}

Configurations for hardware-accelerated machine learning

If using Unraid or another platform that doesn't allow multiple Compose files,

you can inline the config for a backend by copying its contents

into the immich-machine-learning service in the docker-compose.yml file.

See https://immich.app/docs/features/ml-hardware-acceleration for info on usage.