immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

Search not working when using OpenVINO #8353

Closed · fwmone closed this issue 2 months ago

fwmone commented 6 months ago

The bug

When the immich-machine-learning OpenVINO image is enabled, search no longer works and instead returns an HTTP 500 error. The immich-machine-learning log shows:

```
[03/29/24 09:48:06] INFO Loading clip model 'ViT-B-32__openai' to memory
2024-03-29 09:48:09.988640928 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[03/29/24 09:48:09] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮
                         │ /usr/src/app/main.py:116 in predict             │
                         │                                                 │
                         │   113 │   except orjson.JSONDecodeError:        │
                         │   114 │   │   raise HTTPException(400, f"Invali │
                         │   115 │                                         │
                         │ ❱ 116 │   model = await load(await model_cache. │
                         │       ttl=settings.model_ttl, **kwargs))        │
                         │   117 │   model.configure(**kwargs)             │
                         │   118 │   outputs = await run(model.predict, in │
                         │   119 │   return ORJSONResponse(outputs)        │
                         │                                                 │
                         │ /usr/src/app/main.py:137 in load                │
                         │                                                 │
                         │   134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │ ❱ 137 │   │   await run(_load, model)           │
                         │   138 │   │   return model                      │
                         │   139 │   except (OSError, InvalidProtobuf, Bad │
                         │   140 │   │   log.warning(                      │
                         │                                                 │
                         │ /usr/src/app/main.py:125 in run                 │
                         │                                                 │
                         │   122 async def run(func: Callable[..., Any], i │
                         │   123 │   if thread_pool is None:               │
                         │   124 │   │   return func(inputs)               │
                         │ ❱ 125 │   return await asyncio.get_running_loop │
                         │   126                                           │
                         │   127                                           │
                         │   128 async def load(model: InferenceModel) ->  │
                         │                                                 │
                         │ /usr/lib/python3.10/concurrent/futures/thread.p │
                         │ y:58 in run                                     │
                         │                                                 │
                         │ /usr/src/app/main.py:134 in _load               │
                         │                                                 │
                         │   131 │                                         │
                         │   132 │   def _load(model: InferenceModel) -> N │
                         │   133 │   │   with lock:                        │
                         │ ❱ 134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │   137 │   │   await run(_load, model)           │
                         │                                                 │
                         │ /usr/src/app/models/base.py:52 in load          │
                         │                                                 │
                         │    49 │   │   │   return                        │
                         │    50 │   │   self.download()                   │
                         │    51 │   │   log.info(f"Loading {self.model_ty │
                         │       to memory")                               │
                         │ ❱  52 │   │   self._load()                      │
                         │    53 │   │   self.loaded = True                │
                         │    54 │                                         │
                         │    55 │   def predict(self, inputs: Any, **mode │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:146 in _load        │
                         │                                                 │
                         │   143 │   │   super().__init__(clean_name(model │
                         │   144 │                                         │
                         │   145 │   def _load(self) -> None:              │
                         │ ❱ 146 │   │   super()._load()                   │
                         │   147 │   │   self._load_tokenizer()            │
                         │   148 │   │                                     │
                         │   149 │   │   size: list[int] | int = self.prep │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:36 in _load         │
                         │                                                 │
                         │    33 │   def _load(self) -> None:              │
                         │    34 │   │   if self.mode == "text" or self.mo │
                         │    35 │   │   │   log.debug(f"Loading clip text │
                         │ ❱  36 │   │   │   self.text_model = self._make_ │
                         │    37 │   │   │   log.debug(f"Loaded clip text  │
                         │    38 │   │                                     │
                         │    39 │   │   if self.mode == "vision" or self. │
                         │                                                 │
                         │ /usr/src/app/models/base.py:117 in              │
                         │ _make_session                                   │
                         │                                                 │
                         │   114 │   │   │   case ".armnn":                │
                         │   115 │   │   │   │   session = AnnSession(mode │
                         │   116 │   │   │   case ".onnx":                 │
                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                         │   118 │   │   │   │   │   model_path.as_posix() │
                         │   119 │   │   │   │   │   sess_options=self.ses │
                         │   120 │   │   │   │   │   providers=self.provid │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:41 │
                         │ 9 in __init__                                   │
                         │                                                 │
                         │    416 │   │   disabled_optimizers = kwargs["di │
                         │        kwargs else None                         │
                         │    417 │   │                                    │
                         │    418 │   │   try:                             │
                         │ ❱  419 │   │   │   self._create_inference_sessi │
                         │        disabled_optimizers)                     │
                         │    420 │   │   except (ValueError, RuntimeError │
                         │    421 │   │   │   if self._enable_fallback:    │
                         │    422 │   │   │   │   try:                     │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:48 │
                         │ 3 in _create_inference_session                  │
                         │                                                 │
                         │    480 │   │   │   disabled_optimizers = set(di │
                         │    481 │   │                                    │
                         │    482 │   │   # initialize the C++ InferenceSe │
                         │ ❱  483 │   │   sess.initialize_session(provider │
                         │    484 │   │                                    │
                         │    485 │   │   self._sess = sess                │
                         │    486 │   │   self._sess_options = self._sess. │
                         ╰─────────────────────────────────────────────────╯
                         RuntimeException: [ONNXRuntimeError] : 6 :
                         RUNTIME_EXCEPTION : Encountered unknown exception
                         in Initialize()
```

Search works well when OpenVINO is not enabled.
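
For reference, "without OpenVINO" means running the plain CPU image and dropping the hwaccel `extends` block; a minimal sketch, assuming the same service names as in the compose file below:

```yaml
  immich-machine-learning:
    container_name: immich_machine_learning
    # plain CPU image: no -openvino suffix and no hwaccel.ml.yml extends
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
```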

My hwaccel.ml.yml:

`version: "3.8"

Configurations for hardware-accelerated machine learning

If using Unraid or another platform that doesn't allow multiple Compose files,

you can inline the config for a backend by copying its contents

into the immich-machine-learning service in the docker-compose.yml file.

See https://immich.app/docs/features/ml-hardware-acceleration for info on usage.

services: armnn: devices:

I run an Asustor AS6702T NAS with an Intel Celeron N5105 CPU. This problem started with v1.99.0.

The OS that Immich Server is running on

ADM 4.2.6RPI1 (Asustor NAS)

Version of Immich Server

v1.100.0

Version of Immich Mobile App

(not used)

Platform with the issue

Your docker-compose.yml content

version: "3.8"

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "immich" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - ${EXTERNAL_LIBRARY}://mnt/media/external_library:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends:
      file: hwaccel.yml
      service: hwaccel
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - ${EXTERNAL_LIBRARY}://mnt/media/external_library:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino
    # image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: openvino # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:c5a607fb6e1bb15d32bbcf14db22787d19e428d59e31a5da67511b49bb0f1ccc
    restart: always

  database:
    container_name: immich_postgres
    image: tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

volumes:
  pgdata:
  model-cache:
```

Your .env content

```
# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=/volume1/Docker/immich/upload_location

EXTERNAL_LIBRARY=/volume2/...

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

REDIS_HOSTNAME=immich_redis
```

Reproduction steps

1. Enable the openvino machine learning image / container
2. Try to search something using the web interface

Additional information

No response

aviv926 commented 6 months ago

It seems that your hwaccel.ml.yml file is out of date. Try downloading the newest one from the release page and check again.
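
For reference, the `openvino` service in a current hwaccel.ml.yml looks roughly like the sketch below (reconstructed from memory of the release file, so verify against the downloaded copy):

```yaml
services:
  openvino:
    device_cgroup_rules:
      - 'c 189:* rmw'
    devices:
      - /dev/dri:/dev/dri
    volumes:
      - /dev/bus/usb:/dev/bus/usb
```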

fwmone commented 6 months ago

Thanks! Yes, you were right, my config files were outdated - sorry for that. I updated all of them, but the problem persists.

docker-compose.yml:

`version: "3.8"

#

WARNING: Make sure to use the docker-compose.yml of the current release:

#

https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml

#

The compose file on main may not be compatible with the latest release.

#

name: immich

services: immich-server: container_name: immich_server image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release} command: ['start.sh', 'immich'] volumes:

volumes: pgdata: model-cache: `

hwaccel.ml.yml:

`version: "3.8"

Configurations for hardware-accelerated machine learning

If using Unraid or another platform that doesn't allow multiple Compose files,

you can inline the config for a backend by copying its contents

into the immich-machine-learning service in the docker-compose.yml file.

See https://immich.app/docs/features/ml-hardware-acceleration for info on usage.

services: armnn: devices:

Log:

```
[03/29/24 16:57:34] INFO Starting gunicorn 21.2.0
[03/29/24 16:57:34] INFO Using worker: app.config.CustomUvicornWorker
[03/29/24 16:57:34] INFO Booting worker with pid: 13
[03/29/24 16:57:41] INFO Started server process [13]
[03/29/24 16:57:41] INFO Waiting for application startup.
[03/29/24 16:57:41] INFO Created in-memory cache with unloading after 300s of inactivity.
[03/29/24 16:57:41] INFO Initialized request thread pool with 4 threads.
[03/29/24 16:57:41] INFO Application startup complete.
[03/29/24 16:59:38] INFO Setting 'ViT-B-32__openai' execution providers to ['OpenVINOExecutionProvider', 'CPUExecutionProvider'], in descending order of preference
[03/29/24 16:59:38] INFO Loading clip model 'ViT-B-32__openai' to memory
2024-03-29 16:59:42.034430019 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[03/29/24 16:59:42] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮
                         │ /usr/src/app/main.py:116 in predict             │
                         │                                                 │
                         │   113 │   except orjson.JSONDecodeError:        │
                         │   114 │   │   raise HTTPException(400, f"Invali │
                         │   115 │                                         │
                         │ ❱ 116 │   model = await load(await model_cache. │
                         │       ttl=settings.model_ttl, **kwargs))        │
                         │   117 │   model.configure(**kwargs)             │
                         │   118 │   outputs = await run(model.predict, in │
                         │   119 │   return ORJSONResponse(outputs)        │
                         │                                                 │
                         │ /usr/src/app/main.py:137 in load                │
                         │                                                 │
                         │   134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │ ❱ 137 │   │   await run(_load, model)           │
                         │   138 │   │   return model                      │
                         │   139 │   except (OSError, InvalidProtobuf, Bad │
                         │   140 │   │   log.warning(                      │
                         │                                                 │
                         │ /usr/src/app/main.py:125 in run                 │
                         │                                                 │
                         │   122 async def run(func: Callable[..., Any], i │
                         │   123 │   if thread_pool is None:               │
                         │   124 │   │   return func(inputs)               │
                         │ ❱ 125 │   return await asyncio.get_running_loop │
                         │   126                                           │
                         │   127                                           │
                         │   128 async def load(model: InferenceModel) ->  │
                         │                                                 │
                         │ /usr/lib/python3.10/concurrent/futures/thread.p │
                         │ y:58 in run                                     │
                         │                                                 │
                         │ /usr/src/app/main.py:134 in _load               │
                         │                                                 │
                         │   131 │                                         │
                         │   132 │   def _load(model: InferenceModel) -> N │
                         │   133 │   │   with lock:                        │
                         │ ❱ 134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │   137 │   │   await run(_load, model)           │
                         │                                                 │
                         │ /usr/src/app/models/base.py:52 in load          │
                         │                                                 │
                         │    49 │   │   │   return                        │
                         │    50 │   │   self.download()                   │
                         │    51 │   │   log.info(f"Loading {self.model_ty │
                         │       to memory")                               │
                         │ ❱  52 │   │   self._load()                      │
                         │    53 │   │   self.loaded = True                │
                         │    54 │                                         │
                         │    55 │   def predict(self, inputs: Any, **mode │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:146 in _load        │
                         │                                                 │
                         │   143 │   │   super().__init__(clean_name(model │
                         │   144 │                                         │
                         │   145 │   def _load(self) -> None:              │
                         │ ❱ 146 │   │   super()._load()                   │
                         │   147 │   │   self._load_tokenizer()            │
                         │   148 │   │                                     │
                         │   149 │   │   size: list[int] | int = self.prep │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:36 in _load         │
                         │                                                 │
                         │    33 │   def _load(self) -> None:              │
                         │    34 │   │   if self.mode == "text" or self.mo │
                         │    35 │   │   │   log.debug(f"Loading clip text │
                         │ ❱  36 │   │   │   self.text_model = self._make_ │
                         │    37 │   │   │   log.debug(f"Loaded clip text  │
                         │    38 │   │                                     │
                         │    39 │   │   if self.mode == "vision" or self. │
                         │                                                 │
                         │ /usr/src/app/models/base.py:117 in              │
                         │ _make_session                                   │
                         │                                                 │
                         │   114 │   │   │   case ".armnn":                │
                         │   115 │   │   │   │   session = AnnSession(mode │
                         │   116 │   │   │   case ".onnx":                 │
                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                         │   118 │   │   │   │   │   model_path.as_posix() │
                         │   119 │   │   │   │   │   sess_options=self.ses │
                         │   120 │   │   │   │   │   providers=self.provid │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:41 │
                         │ 9 in __init__                                   │
                         │                                                 │
                         │    416 │   │   disabled_optimizers = kwargs["di │
                         │        kwargs else None                         │
                         │    417 │   │                                    │
                         │    418 │   │   try:                             │
                         │ ❱  419 │   │   │   self._create_inference_sessi │
                         │        disabled_optimizers)                     │
                         │    420 │   │   except (ValueError, RuntimeError │
                         │    421 │   │   │   if self._enable_fallback:    │
                         │    422 │   │   │   │   try:                     │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:48 │
                         │ 3 in _create_inference_session                  │
                         │                                                 │
                         │    480 │   │   │   disabled_optimizers = set(di │
                         │    481 │   │                                    │
                         │    482 │   │   # initialize the C++ InferenceSe │
                         │ ❱  483 │   │   sess.initialize_session(provider │
                         │    484 │   │                                    │
                         │    485 │   │   self._sess = sess                │
                         │    486 │   │   self._sess_options = self._sess. │
                         ╰─────────────────────────────────────────────────╯
                         RuntimeException: [ONNXRuntimeError] : 6 :         
                         RUNTIME_EXCEPTION : Encountered unknown exception  
                         in Initialize()                                    

[03/29/24 17:00:25] INFO Loading clip model 'ViT-B-32__openai' to memory
2024-03-29 17:00:28.560632989 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[03/29/24 17:00:28] ERROR Exception in ASGI application
                         (traceback identical to the one above)
```

mertalev commented 6 months ago

v1.99 updated to a newer version of OpenVINO. From what I've seen, this actually fixed smart search for most users, so it's interesting that it broke it for you. Unfortunately, there isn't much I can do here. It's surprisingly difficult to make OpenVINO work for everyone.

reef-actor commented 5 months ago

I am seeing this issue on my humble Intel J5005 (Gemini Lake). This is a fresh install as of yesterday evening.

- I have over 10GB unallocated
- gunicorn appears in intel_gpu_top, but with no usage

OpenVINO inference is working nicely for Frigate NVR, but perhaps this model is too heavy for the iGPU? I'm happy to wipe/experiment with my installation if I can help in any way.

The log below is for a single face detection attempt (concurrency = 1).

immich_machine_learning logs:

```
[04/23/24 10:13:26] INFO Starting gunicorn 22.0.0
[04/23/24 10:13:26] INFO Listening at: http://[::]:3003 (9)
[04/23/24 10:13:26] INFO Using worker: app.config.CustomUvicornWorker
[04/23/24 10:13:26] INFO Booting worker with pid: 13
[04/23/24 10:13:27] DEBUG Could not load ANN shared libraries, using ONNX: libmali.so: cannot open shared object file: No such file or directory
[04/23/24 10:13:34] INFO Started server process [13]
[04/23/24 10:13:34] INFO Waiting for application startup.
[04/23/24 10:13:34] INFO Created in-memory cache with unloading after 300s of inactivity.
[04/23/24 10:13:34] INFO Initialized request thread pool with 4 threads.
[04/23/24 10:13:34] DEBUG Checking for inactivity...
[04/23/24 10:13:34] INFO Application startup complete.
[04/23/24 10:13:44] DEBUG Checking for inactivity...
[04/23/24 10:13:52] DEBUG Available ORT providers: {'CPUExecutionProvider', 'OpenVINOExecutionProvider'}
[04/23/24 10:13:52] DEBUG Available OpenVINO devices: ['CPU', 'GPU']
[04/23/24 10:13:52] INFO Setting 'buffalo_l' execution providers to ['OpenVINOExecutionProvider', 'CPUExecutionProvider'], in descending order of preference
[04/23/24 10:13:52] DEBUG Setting execution provider options to [{'device_type': 'GPU_FP32', 'cache_dir': '/cache/facial-recognition/buffalo_l/openvino'}, {'arena_extend_strategy': 'kSameAsRequested'}]
[04/23/24 10:13:52] DEBUG Setting execution_mode to ORT_SEQUENTIAL
[04/23/24 10:13:52] DEBUG Setting inter_op_num_threads to 0
[04/23/24 10:13:52] DEBUG Setting intra_op_num_threads to 0
[04/23/24 10:13:52] DEBUG Setting preferred runtime to onnx
[04/23/24 10:13:52] INFO Loading facial recognition model 'buffalo_l' to memory
2024-04-23 10:13:53.433384806 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[04/23/24 10:13:53] ERROR Exception in ASGI application
                         Traceback (most recent call last):
                           /usr/src/app/main.py:116 in predict
                           /usr/src/app/main.py:137 in load
                           /usr/src/app/main.py:125 in run
                           /usr/lib/python3.10/concurrent/futures/thread.py:58 in run
                           /usr/src/app/main.py:134 in _load
                           /usr/src/app/models/base.py:52 in load
                           /usr/src/app/models/facial_recognition.py:30 in _load
                           /usr/src/app/models/base.py:117 in _make_session
                           /opt/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:419 in __init__
                           /opt/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:483 in _create_inference_session
                         RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Encountered unknown exception in Initialize()
```
omltcat commented 5 months ago

I am on an N5105 as well, with 16GB RAM. I can confirm the exact same error when running v1.99.0 and above. If I switch to immich-machine-learning:v1.98.2-openvino, everything works.
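
A minimal sketch of that pin, assuming the `immich-machine-learning` service layout shown earlier in this thread (only the image tag changes):

```yaml
  immich-machine-learning:
    # pin only the ML container to the last known-good OpenVINO build
    image: ghcr.io/immich-app/immich-machine-learning:v1.98.2-openvino
```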

With MACHINE_LEARNING_PRELOAD__CLIP=ViT-B-32__openai, the error appears immediately after startup:

immich-machine-learning logs:

```python
[04/24/24 20:57:02] INFO Loading clip model 'ViT-B-32__openai' to memory
2024-04-24 20:57:03.939642959 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[04/24/24 20:57:04] ERROR Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/starlette/routing.py", line 734, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/usr/src/app/main.py", line 55, in lifespan
    await preload_models(settings.preload)
  File "/usr/src/app/main.py", line 69, in preload_models
    await load(await model_cache.get(preload_models.clip, ModelType.CLIP))
  File "/usr/src/app/main.py", line 137, in load
    await run(_load, model)
  File "/usr/src/app/main.py", line 125, in run
    return await asyncio.get_running_loop().run_in_executor(thread_pool, func, inputs)
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/app/main.py", line 134, in _load
    model.load()
  File "/usr/src/app/models/base.py", line 52, in load
    self._load()
  File "/usr/src/app/models/clip.py", line 146, in _load
    super()._load()
  File "/usr/src/app/models/clip.py", line 36, in _load
    self.text_model = self._make_session(self.textual_path)
  File "/usr/src/app/models/base.py", line 117, in _make_session
    session = ort.InferenceSession(
  File "/opt/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/opt/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Encountered unknown exception in Initialize()
```
stefano99 commented 3 months ago

I'm getting the same error on an i5-6200U (HD 520 GPU) with Immich v1.104.6, and I've seen it since v1.100 (the first time I tried using the GPU for hardware acceleration). Transcoding works fine.

It's running in a privileged LXC on the latest version of Proxmox, with 8GB of RAM dedicated.

I'm not able to try the v1.98.2 machine-learning container, though, because it produces an error in the server container:

Server container error with the 1.98.2 machine-learning container:

```
[Nest] 7  - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Unable to run job handler (faceDetection/face-detection): Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"buffalo_l","options":{"minScore":0.7}},"recognition":{"modelName":"buffalo_l"}}}' failed with status 422: Unprocessable Entity
[Nest] 7  - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"buffalo_l","options":{"minScore":0.7}},"recognition":{"modelName":"buffalo_l"}}}' failed with status 422: Unprocessable Entity
    at MachineLearningRepository.predict (/usr/src/app/dist/repositories/machine-learning.repository.js:22:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async MachineLearningRepository.detectFaces (/usr/src/app/dist/repositories/machine-learning.repository.js:33:26)
    at async PersonService.handleDetectFaces (/usr/src/app/dist/services/person.service.js:274:52)
    at async /usr/src/app/dist/services/job.service.js:148:36
    at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:394:28)
    at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24)
[Nest] 7  - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Object:
{
  "id": "2c44089e-a5ab-4b47-894e-21ad138371b0"
}
[Nest] 17 - 06/14/2024, 11:33:53 AM LOG [Api:EventRepository] Websocket Disconnect: yaBQqD1_Clm1qZRGAAAJ
```
mertalev commented 2 months ago

This should be fixed as of the current release. Be sure to delete the model cache volume so it downloads the updated models.
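
For example, something like the following should clear it, assuming the compose project is named `immich` so the volume is `immich_model-cache` (check `docker volume ls` for the exact name):

```bash
docker compose rm -sf immich-machine-learning   # stop and remove the ML container
docker volume rm immich_model-cache             # delete the cached models
docker compose up -d immich-machine-learning    # fresh models download on the next job
```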

fwmone commented 2 months ago

Looks good to me - thanks a bunch!