immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

Visual Model for Smart Search Job fails to load on the v1.117.0 tag. #13227

Closed · imohitkr closed this issue 1 week ago

imohitkr commented 2 weeks ago

The bug

Hi,

Thank you for the update, but it seems that the smart search job is broken.

I am running the OpenVINO image, and everything works fine with the image built with the tag "ghcr.io/immich-app/immich-machine-learning:pr-11453-openvino". However, with the latest image tagged "ghcr.io/immich-app/immich-machine-learning:v1.117.0-openvino", I am getting the errors listed below. When I pair the latest version with the old machine learning server image, everything works fine.

Interestingly, text-based search, which loads the textual model, works fine with the latest machine-learning server image.

Please help.

The OS that Immich Server is running on

Debian GNU/Linux 12 (bookworm)

Version of Immich Server

v1.117.0

Version of Immich Mobile App

v1.117.0

Platform with the issue

Your docker-compose.yml content

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    ulimits:
      nofile:
        soft: 200000
        hard: 200000
    container_name: immich_server

    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.transcoding.yml
    #   service: cpu # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    devices:
      - /dev/dri:/dev/dri
    # user: 999:10
    # security_opt:
    #   - no-new-privileges:true
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - stack.env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # user: 999:10
    # security_opt:
    #   - no-new-privileges:true
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    # image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino
    image: ghcr.io/immich-app/immich-machine-learning:pr-11453-openvino
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    device_cgroup_rules:
      - 'c 189:* rmw'
    devices:
      - /dev/dri:/dev/dri
    volumes:
      - /dev/bus/usb:/dev/bus/usb
      - /volume2/docker/immich/model_cache:/cache
    env_file:
      - stack.env
    restart: always

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine
    environment:
      - TZ=Europe/Berlin
    security_opt:
      - no-new-privileges:true
    volumes:
      - /volume2/docker/immich/redis:/data:rw
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always

  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0

    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
      TZ: Europe/Berlin
    user: 999:10
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' || exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo "checksum failure count is $$Chksum"; [ "$$Chksum" = '0' ] || exit 1
      interval: 5m
      start_interval: 30s
      start_period: 5m
    command: ["postgres", "-c" ,"shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: always

Your .env content

UPLOAD_LOCATION=/volume2/docker/immich/upload
DB_DATA_LOCATION=/volume2/docker/immich/db
IMMICH_VERSION=v1.117.0
DB_PASSWORD=<>
DB_USERNAME=<>
DB_DATABASE_NAME=immich
MACHINE_LEARNING_WORKERS=3

Reproduction steps

  1. Load the latest images on an Intel-based system with OpenVINO support.
  2. Run a smart search job. (A standalone repro sketch follows below.)
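
For reference, a minimal standalone sketch of the failing load path, assuming onnxruntime-openvino is installed and the model has already been downloaded to the cache directory mounted in the compose file above (the exact cache layout and file name are assumptions; adjust them to what you see under model_cache):

# Minimal repro sketch: open the cached visual model with the same provider
# order the ML service logs below. The path is an assumption based on the
# compose file's model_cache mount; adjust it to your cache layout.
import onnxruntime as ort

model_path = "/volume2/docker/immich/model_cache/clip/ViT-H-14-378-quickgelu__dfn5b/visual/model.onnx"

session = ort.InferenceSession(
    model_path,
    providers=["OpenVINOExecutionProvider", "CPUExecutionProvider"],
)
print([inp.name for inp in session.get_inputs()])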

Relevant log output

[10/06/24 15:11:46] INFO     Attempt #2 to load visual model
                             'ViT-H-14-378-quickgelu__dfn5b' to memory
[10/06/24 15:11:46] INFO     Setting execution providers to
                             ['OpenVINOExecutionProvider',
                             'CPUExecutionProvider'], in descending order of
                             preference
2024-10-06 15:11:46.370454587 [E:onnxruntime:, inference_session.cc:2105 operator()] Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/backend_manager.cc:126 onnxruntime::openvino_ep::BackendManager::BackendManager(const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::Node&, const onnxruntime::GraphViewer&, const onnxruntime::logging::Logger&, onnxruntime::openvino_ep::EPCtxHandler&) /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:106 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(const std::string&, std::string&, ov::AnyMap&, const std::string&) [OpenVINO-EP]  Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_2_0Exception from src/inference/src/cpp/core.cpp:142:
invalid external data: ExternalDataInfo(data_full_path: 51cc7752-47e9-11ef-91e8-00155d655292, offset: 9564160, data_length: 3737600)
[10/06/24 15:11:46] ERROR    Exception in ASGI application
                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /usr/src/app/main.py:152 in predict             │
                             │                                                 │
                             │   149 │   │   inputs = text                     │
                             │   150 │   else:                                 │
                             │   151 │   │   raise HTTPException(400, "Either  │
                             │ ❱ 152 │   response = await run_inference(inputs │
                             │   153 │   return ORJSONResponse(response)       │
                             │   154                                           │
                             │   155                                           │
                             │                                                 │
                             │ /usr/src/app/main.py:175 in run_inference       │
                             │                                                 │
                             │   172 │   │   response[entry["task"]] = output  │
                             │   173 │                                         │
                             │   174 │   without_deps, with_deps = entries     │
                             │ ❱ 175 │   await asyncio.gather(*[_run_inference │
                             │   176 │   if with_deps:                         │
                             │   177 │   │   await asyncio.gather(*[_run_infer │
                             │   178 │   if isinstance(payload, Image):        │
                             │                                                 │
                             │ /usr/src/app/main.py:169 in _run_inference      │
                             │                                                 │
                             │   166 │   │   │   except KeyError:              │
                             │   167 │   │   │   │   message = f"Task {entry[' │
                             │       output of {dep}"                          │
                             │   168 │   │   │   │   raise HTTPException(400,  │
                             │ ❱ 169 │   │   model = await load(model)         │
                             │   170 │   │   output = await run(model.predict, │
                             │   171 │   │   outputs[model.identity] = output  │
                             │   172 │   │   response[entry["task"]] = output  │
                             │                                                 │
                             │ /usr/src/app/main.py:213 in load                │
                             │                                                 │
                             │   210 │   │   return model                      │
                             │   211 │                                         │
                             │   212 │   try:                                  │
                             │ ❱ 213 │   │   return await run(_load, model)    │
                             │   214 │   except (OSError, InvalidProtobuf, Bad │
                             │   215 │   │   log.warning(f"Failed to load {mod │
                             │       '{model.model_name}'. Clearing cache.")   │
                             │   216 │   │   model.clear_cache()               │
                             │                                                 │
                             │ /usr/src/app/main.py:188 in run                 │
                             │                                                 │
                             │   185 │   if thread_pool is None:               │
                             │   186 │   │   return func(*args, **kwargs)      │
                             │   187 │   partial_func = partial(func, *args, * │
                             │ ❱ 188 │   return await asyncio.get_running_loop │
                             │   189                                           │
                             │   190                                           │
                             │   191 async def load(model: InferenceModel) ->  │
                             │                                                 │
                             │ /usr/local/lib/python3.11/concurrent/futures/th │
                             │ read.py:58 in run                               │
                             │                                                 │
                             │ /usr/src/app/main.py:200 in _load               │
                             │                                                 │
                             │   197 │   │   │   raise HTTPException(500, f"Fa │
                             │   198 │   │   with lock:                        │
                             │   199 │   │   │   try:                          │
                             │ ❱ 200 │   │   │   │   model.load()              │
                             │   201 │   │   │   except FileNotFoundError as e │
                             │   202 │   │   │   │   if model.model_format ==  │
                             │   203 │   │   │   │   │   raise e               │
                             │                                                 │
                             │ /usr/src/app/models/base.py:53 in load          │
                             │                                                 │
                             │    50 │   │   self.download()                   │
                             │    51 │   │   attempt = f"Attempt #{self.load_a │
                             │       else "Loading"                            │
                             │    52 │   │   log.info(f"{attempt} {self.model_ │
                             │       '{self.model_name}' to memory")           │
                             │ ❱  53 │   │   self.session = self._load()       │
                             │    54 │   │   self.loaded = True                │
                             │    55 │                                         │
                             │    56 │   def predict(self, *inputs: Any, **mod │
                             │                                                 │
                             │ /usr/src/app/models/clip/visual.py:62 in _load  │
                             │                                                 │
                             │   59 │   │   self.mean = np.array(self.preproce │
                             │   60 │   │   self.std = np.array(self.preproces │
                             │   61 │   │                                      │
                             │ ❱ 62 │   │   return super()._load()             │
                             │   63 │                                          │
                             │   64 │   def transform(self, image: Image.Image │
                             │   65 │   │   image = resize_pil(image, self.siz │
                             │                                                 │
                             │ /usr/src/app/models/base.py:78 in _load         │
                             │                                                 │
                             │    75 │   │   )                                 │
                             │    76 │                                         │
                             │    77 │   def _load(self) -> ModelSession:      │
                             │ ❱  78 │   │   return self._make_session(self.mo │
                             │    79 │                                         │
                             │    80 │   def clear_cache(self) -> None:        │
                             │    81 │   │   if not self.cache_dir.exists():   │
                             │                                                 │
                             │ /usr/src/app/models/base.py:110 in              │
                             │ _make_session                                   │
                             │                                                 │
                             │   107 │   │   │   case ".armnn":                │
                             │   108 │   │   │   │   session: ModelSession = A │
                             │   109 │   │   │   case ".onnx":                 │
                             │ ❱ 110 │   │   │   │   session = OrtSession(mode │
                             │   111 │   │   │   case _:                       │
                             │   112 │   │   │   │   raise ValueError(f"Unsupp │
                             │   113 │   │   return session                    │
                             │                                                 │
                             │ /usr/src/app/sessions/ort.py:28 in __init__     │
                             │                                                 │
                             │    25 │   │   self.providers = providers if pro │
                             │    26 │   │   self.provider_options = provider_ │
                             │       self._provider_options_default            │
                             │    27 │   │   self.sess_options = sess_options  │
                             │       self._sess_options_default                │
                             │ ❱  28 │   │   self.session = ort.InferenceSessi │
                             │    29 │   │   │   self.model_path.as_posix(),   │
                             │    30 │   │   │   providers=self.providers,     │
                             │    31 │   │   │   provider_options=self.provide │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:41 │
                             │ 9 in __init__                                   │
                             │                                                 │
                             │    416 │   │   disabled_optimizers = kwargs.get │
                             │    417 │   │                                    │
                             │    418 │   │   try:                             │
                             │ ❱  419 │   │   │   self._create_inference_sessi │
                             │        disabled_optimizers)                     │
                             │    420 │   │   except (ValueError, RuntimeError │
                             │    421 │   │   │   if self._enable_fallback:    │
                             │    422 │   │   │   │   try:                     │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:49 │
                             │ 1 in _create_inference_session                  │
                             │                                                 │
                             │    488 │   │   │   disabled_optimizers = set(di │
                             │    489 │   │                                    │
                             │    490 │   │   # initialize the C++ InferenceSe │
                             │ ❱  491 │   │   sess.initialize_session(provider │
                             │    492 │   │                                    │
                             │    493 │   │   self._sess = sess                │
                             │    494 │   │   self._sess_options = self._sess. │
                             ╰─────────────────────────────────────────────────╯
                             RuntimeException: [ONNXRuntimeError] : 6 :
                             RUNTIME_EXCEPTION : Exception during
                             initialization:
                             /onnxruntime/onnxruntime/core/providers/openvino/ba
                             ckend_manager.cc:126
                             onnxruntime::openvino_ep::BackendManager::BackendMa
                             nager(const
                             onnxruntime::openvino_ep::GlobalContext&, const
                             onnxruntime::Node&, const
                             onnxruntime::GraphViewer&, const
                             onnxruntime::logging::Logger&,
                             onnxruntime::openvino_ep::EPCtxHandler&)
                             /onnxruntime/onnxruntime/core/providers/openvino/ov
                             _interface.cc:106
                             onnxruntime::openvino_ep::OVExeNetwork
                             onnxruntime::openvino_ep::OVCore::CompileModel(cons
                             t std::string&, std::string&, ov::AnyMap&, const
                             std::string&) [OpenVINO-EP]  Exception while
                             Loading Network for graph:
                             OpenVINOExecutionProvider_OpenVINO-EP-subgraph_2_0E
                             xception from src/inference/src/cpp/core.cpp:142:
                             invalid external data:
                             ExternalDataInfo(data_full_path:
                             51cc7752-47e9-11ef-91e8-00155d655292, offset:
                             9564160, data_length: 3737600)

Additional information

lscpu output :

Architecture:            x86_64
CPU op-mode(s):          32-bit, 64-bit
Address sizes:           39 bits physical, 48 bits virtual
Byte Order:              Little Endian
CPU(s):                  6
On-line CPU(s) list:     0-5
Vendor ID:               GenuineIntel
Model name:              Intel(R) Pentium(R) Gold 8505
CPU family:              6
Model:                   154
Thread(s) per core:      2
Core(s) per socket:      5
Socket(s):               1
Stepping:                4
CPU(s) scaling MHz:      75%
CPU max MHz:             4400.0000
CPU min MHz:             400.0000
BogoMIPS:                4992.00
Flags:                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize arch_lbr ibt flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   176 KiB (5 instances)
  L1i:                   288 KiB (5 instances)
  L2:                    3.3 MiB (2 instances)
  L3:                    8 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-5
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

bo0tzz commented 2 weeks ago

Are you certain that's the image you've been using? Because it's from a PR that doesn't have any ML changes in it.

imohitkr commented 2 weeks ago

> Are you certain that's the image you've been using? Because it's from a PR that doesn't have any ML changes in it.

Yes, one of the releases failed to generate the machine-learning OpenVINO image tag. As a result, I had to use one of the PR-tagged versions instead of waiting for a re-release. Since there were no changes between the last release and this PR-tagged version, I decided to go with it.

mertalev commented 2 weeks ago

Can you try with MACHINE_LEARNING_WORKERS=3 commented out?

imohitkr commented 2 weeks ago

> Can you try with MACHINE_LEARNING_WORKERS=3 commented out?

Same issue when I comment it out.

Saschl commented 2 weeks ago

I am seeing the same issue. I cleared the model cache just to be sure, but no change. The normal (CPU) image runs just fine 👍 Funnily enough, I have the same CPU as the OP (Intel(R) Pentium(R) Gold 8505, Ugreen NAS).

With the tag v1.116.2-openvino it also works again.

[10/06/24 23:10:02] INFO     Setting execution providers to                     
                             ['OpenVINOExecutionProvider',                      
                             'CPUExecutionProvider'], in descending order of    
                             preference                                         
2024-10-06 23:10:02.909341676 [E:onnxruntime:, inference_session.cc:2105 operator()] Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/backend_manager.cc:126 onnxruntime::openvino_ep::BackendManager::BackendManager(const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::Node&, const onnxruntime::GraphViewer&, const onnxruntime::logging::Logger&, onnxruntime::openvino_ep::EPCtxHandler&) /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:106 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(const std::string&, std::string&, ov::AnyMap&, const std::string&) [OpenVINO-EP]  Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_2_0Exception from src/inference/src/cpp/core.cpp:142:
invalid external data: ExternalDataInfo(data_full_path: 51cc7752-47e9-11ef-91e8-00155d655292, offset: 9564160, data_length: 3737600)
[10/06/24 23:10:02] ERROR    Exception in ASGI application                      

                             ... (same stack trace as above)
                             RuntimeException: [ONNXRuntimeError] : 6 :         
                             RUNTIME_EXCEPTION : Exception during               
                             initialization:                                    
                             /onnxruntime/onnxruntime/core/providers/openvino/ba
                             ckend_manager.cc:126                               
                             onnxruntime::openvino_ep::BackendManager::BackendMa
                             nager(const                                        
                             onnxruntime::openvino_ep::GlobalContext&, const    
                             onnxruntime::Node&, const                          
                             onnxruntime::GraphViewer&, const                   
                             onnxruntime::logging::Logger&,                     
                             onnxruntime::openvino_ep::EPCtxHandler&)           
                             /onnxruntime/onnxruntime/core/providers/openvino/ov
                             _interface.cc:106                                  
                             onnxruntime::openvino_ep::OVExeNetwork             
                             onnxruntime::openvino_ep::OVCore::CompileModel(cons
                             t std::string&, std::string&, ov::AnyMap&, const   
                             std::string&) [OpenVINO-EP]  Exception while       
                             Loading Network for graph:                         
                             OpenVINOExecutionProvider_OpenVINO-EP-subgraph_2_0E
                             xception from src/inference/src/cpp/core.cpp:142:  
                             invalid external data:                             
                             ExternalDataInfo(data_full_path:                   
                             51cc7752-47e9-11ef-91e8-00155d655292, offset:      
                             9564160, data_length: 3737600)

imohitkr commented 2 weeks ago

I narrowed it down to this PR: https://github.com/immich-app/immich/pull/12883. The image built with its tag is where the problem started.

imohitkr commented 2 weeks ago

> I am seeing the same issue. I cleared the model cache just to be sure, but no change. The normal (CPU) image runs just fine 👍 Funnily enough, I have the same CPU as the OP (Intel(R) Pentium(R) Gold 8505, Ugreen NAS).

I have the same Ugreen NAS.

mertalev commented 2 weeks ago

I'm not sure if this is a general issue with OpenVINO's handling of external data, or if there's something particular to this environment. I can do some testing and make an upstream issue about it.
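
One way to narrow that down: compile the same ONNX file with OpenVINO directly, bypassing the onnxruntime EP entirely. A sketch, assuming the openvino Python package is available; the cache path is an assumption:

# Sketch: compile the cached ONNX model with OpenVINO's own API, bypassing
# onnxruntime. If this also fails on the external-data weights, the bug is
# likely in OpenVINO itself rather than in the onnxruntime EP.
import openvino as ov

core = ov.Core()
# Hypothetical cache path; point this at the cached model.onnx.
model = core.read_model("/cache/clip/ViT-H-14-378-quickgelu__dfn5b/visual/model.onnx")
compiled = core.compile_model(model, "CPU")  # try "GPU" for the iGPU as well
print("compiled OK:", compiled is not None)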

mertalev commented 2 weeks ago

For now, your options are to either continue using the 1.116.2 image or to switch to a model that doesn't use external data (ViT-SO400M-14-SigLIP-384__webli is the best option for this, with quality very similar to ViT-H-14-378-quickgelu__dfn5b).
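
If you're unsure whether your configured model uses external data, a quick check with the onnx package (a sketch; the cache path is an assumption):

# Sketch: report how many initializers in a cached ONNX model are stored as
# external data files. Models whose weights fit inline (for example
# ViT-B-32__openai, mentioned later in the thread) report zero here.
import onnx
from onnx.external_data_helper import uses_external_data

# Hypothetical cache path; adjust to your model.
model = onnx.load(
    "/cache/clip/ViT-H-14-378-quickgelu__dfn5b/visual/model.onnx",
    load_external_data=False,
)
external = [t.name for t in model.graph.initializer if uses_external_data(t)]
print(f"{len(external)} initializers stored as external data")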

imohitkr commented 2 weeks ago

> For now, your options are to either continue using the 1.116.2 image or to switch to a model that doesn't use external data (ViT-SO400M-14-SigLIP-384__webli is the best option for this, with quality very similar to ViT-H-14-378-quickgelu__dfn5b).

I've decided to stick with the 1.116.2 images for now, because it took my library 4 days to finish the smart search job with 'ViT-H-14-378-quickgelu__dfn5b' and I really don't want to have to run it again. I will keep a watch on this page for a resolution.

Thank you for taking a quick look.

Saschl commented 2 weeks ago

> ViT-SO400M-14-SigLIP-384__webli

Switching to that model works with the latest machine-learning image tag (v1.117.0 with openvino 1.19).

Soulplayer commented 1 week ago

I’m having the same issue. I’ll wait a bit before downgrading if it is not an easy fix. @imohitkr could you post the image you are using?

imohitkr commented 1 week ago

> @imohitkr could you post the image you are using?

I am using ghcr.io/immich-app/immich-machine-learning:v1.116.2-openvino with the latest immich server images.

imohitkr commented 1 week ago

> I’m having the same issue. I’ll wait a bit before downgrading if it is not an easy fix. @imohitkr could you post the image you are using?

Do you also have the same CPU model?

Soulplayer commented 1 week ago

No, but perhaps the same iGPU (770?). I have an i5-14500.

haiquand commented 1 week ago

Having the same issue when using search, with ghcr.io/immich-app/immich-machine-learning:v1.117.0-openvino and the multilingual CLIP model XLM-Roberta-Large-Vit-B-16Plus. My device is a QNAP TS-264C with an Intel(R) Celeron(R) N5095.

[10/13/24 22:09:11] INFO     Loading textual model                              
                             'XLM-Roberta-Large-Vit-B-16Plus' to memory         
[10/13/24 22:09:11] INFO     Setting execution providers to                     
                             ['OpenVINOExecutionProvider',                      
                             'CPUExecutionProvider'], in descending order of    
                             preference                                         
2024-10-13 22:09:11.417140061 [E:onnxruntime:, inference_session.cc:2105 operator()] Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/backend_manager.cc:126 onnxruntime::openvino_ep::BackendManager::BackendManager(const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::Node&, const onnxruntime::GraphViewer&, const onnxruntime::logging::Logger&, onnxruntime::openvino_ep::EPCtxHandler&) /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:106 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(const std::string&, std::string&, ov::AnyMap&, const std::string&) [OpenVINO-EP]  Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0Exception from src/inference/src/cpp/core.cpp:142:
invalid external data: ExternalDataInfo(data_full_path: c1f385d0-4888-11ef-a23b-00155d655292, offset: 1026113536, data_length: 2621440)
[10/13/24 22:09:11] ERROR    Exception in ASGI application                      

                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /usr/src/app/main.py:152 in predict             │
                             │                                                 │
                             │   149 │   │   inputs = text                     │
                             │   150 │   else:                                 │
                             │   151 │   │   raise HTTPException(400, "Either  │
                             │ ❱ 152 │   response = await run_inference(inputs │
                             │   153 │   return ORJSONResponse(response)       │
                             │   154                                           │
                             │   155                                           │
                             │                                                 │
                             │ /usr/src/app/main.py:175 in run_inference       │
                             │                                                 │
                             │   172 │   │   response[entry["task"]] = output  │
                             │   173 │                                         │
                             │   174 │   without_deps, with_deps = entries     │
                             │ ❱ 175 │   await asyncio.gather(*[_run_inference │
                             │   176 │   if with_deps:                         │
                             │   177 │   │   await asyncio.gather(*[_run_infer │
                             │   178 │   if isinstance(payload, Image):        │
                             │                                                 │
                             │ /usr/src/app/main.py:169 in _run_inference      │
                             │                                                 │
                             │   166 │   │   │   except KeyError:              │
                             │   167 │   │   │   │   message = f"Task {entry[' │
                             │       output of {dep}"                          │
                             │   168 │   │   │   │   raise HTTPException(400,  │
                             │ ❱ 169 │   │   model = await load(model)         │
                             │   170 │   │   output = await run(model.predict, │
                             │   171 │   │   outputs[model.identity] = output  │
                             │   172 │   │   response[entry["task"]] = output  │
                             │                                                 │
                             │ /usr/src/app/main.py:213 in load                │
                             │                                                 │
                             │   210 │   │   return model                      │
                             │   211 │                                         │
                             │   212 │   try:                                  │
                             │ ❱ 213 │   │   return await run(_load, model)    │
                             │   214 │   except (OSError, InvalidProtobuf, Bad │
                             │   215 │   │   log.warning(f"Failed to load {mod │
                             │       '{model.model_name}'. Clearing cache.")   │
                             │   216 │   │   model.clear_cache()               │
                             │                                                 │
                             │ /usr/src/app/main.py:188 in run                 │
                             │                                                 │
                             │   185 │   if thread_pool is None:               │
                             │   186 │   │   return func(*args, **kwargs)      │
                             │   187 │   partial_func = partial(func, *args, * │
                             │ ❱ 188 │   return await asyncio.get_running_loop │
                             │   189                                           │
                             │   190                                           │
                             │   191 async def load(model: InferenceModel) ->  │
                             │                                                 │
                             │ /usr/local/lib/python3.11/concurrent/futures/th │
                             │ read.py:58 in run                               │
                             │                                                 │
                             │ /usr/src/app/main.py:200 in _load               │
                             │                                                 │
                             │   197 │   │   │   raise HTTPException(500, f"Fa │
                             │   198 │   │   with lock:                        │
                             │   199 │   │   │   try:                          │
                             │ ❱ 200 │   │   │   │   model.load()              │
                             │   201 │   │   │   except FileNotFoundError as e │
                             │   202 │   │   │   │   if model.model_format ==  │
                             │   203 │   │   │   │   │   raise e               │
                             │                                                 │
                             │ /usr/src/app/models/base.py:53 in load          │
                             │                                                 │
                             │    50 │   │   self.download()                   │
                             │    51 │   │   attempt = f"Attempt #{self.load_a │
                             │       else "Loading"                            │
                             │    52 │   │   log.info(f"{attempt} {self.model_ │
                             │       '{self.model_name}' to memory")           │
                             │ ❱  53 │   │   self.session = self._load()       │
                             │    54 │   │   self.loaded = True                │
                             │    55 │                                         │
                             │    56 │   def predict(self, *inputs: Any, **mod │
                             │                                                 │
                             │ /usr/src/app/models/clip/textual.py:26 in _load │
                             │                                                 │
                             │    23 │   │   return res                        │
                             │    24 │                                         │
                             │    25 │   def _load(self) -> ModelSession:      │
                             │ ❱  26 │   │   session = super()._load()         │
                             │    27 │   │   log.debug(f"Loading tokenizer for │
                             │    28 │   │   self.tokenizer = self._load_token │
                             │    29 │   │   tokenizer_kwargs: dict[str, Any]  │
                             │                                                 │
                             │ /usr/src/app/models/base.py:78 in _load         │
                             │                                                 │
                             │    75 │   │   )                                 │
                             │    76 │                                         │
                             │    77 │   def _load(self) -> ModelSession:      │
                             │ ❱  78 │   │   return self._make_session(self.mo │
                             │    79 │                                         │
                             │    80 │   def clear_cache(self) -> None:        │
                             │    81 │   │   if not self.cache_dir.exists():   │
                             │                                                 │
                             │ /usr/src/app/models/base.py:110 in              │
                             │ _make_session                                   │
                             │                                                 │
                             │   107 │   │   │   case ".armnn":                │
                             │   108 │   │   │   │   session: ModelSession = A │
                             │   109 │   │   │   case ".onnx":                 │
                             │ ❱ 110 │   │   │   │   session = OrtSession(mode │
                             │   111 │   │   │   case _:                       │
                             │   112 │   │   │   │   raise ValueError(f"Unsupp │
                             │   113 │   │   return session                    │
                             │                                                 │
                             │ /usr/src/app/sessions/ort.py:28 in __init__     │
                             │                                                 │
                             │    25 │   │   self.providers = providers if pro │
                             │    26 │   │   self.provider_options = provider_ │
                             │       self._provider_options_default            │
                             │    27 │   │   self.sess_options = sess_options  │
                             │       self._sess_options_default                │
                             │ ❱  28 │   │   self.session = ort.InferenceSessi │
                             │    29 │   │   │   self.model_path.as_posix(),   │
                             │    30 │   │   │   providers=self.providers,     │
                             │    31 │   │   │   provider_options=self.provide │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:41 │
                             │ 9 in __init__                                   │
                             │                                                 │
                             │    416 │   │   disabled_optimizers = kwargs.get │
                             │    417 │   │                                    │
                             │    418 │   │   try:                             │
                             │ ❱  419 │   │   │   self._create_inference_sessi │
                             │        disabled_optimizers)                     │
                             │    420 │   │   except (ValueError, RuntimeError │
                             │    421 │   │   │   if self._enable_fallback:    │
                             │    422 │   │   │   │   try:                     │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:49 │
                             │ 1 in _create_inference_session                  │
                             │                                                 │
                             │    488 │   │   │   disabled_optimizers = set(di │
                             │    489 │   │                                    │
                             │    490 │   │   # initialize the C++ InferenceSe │
                             │ ❱  491 │   │   sess.initialize_session(provider │
                             │    492 │   │                                    │
                             │    493 │   │   self._sess = sess                │
                             │    494 │   │   self._sess_options = self._sess. │
                             ╰─────────────────────────────────────────────────╯
                             RuntimeException: [ONNXRuntimeError] : 6 :         
                             RUNTIME_EXCEPTION : Exception during               
                             initialization:                                    
                             /onnxruntime/onnxruntime/core/providers/openvino/ba
                             ckend_manager.cc:126                               
                             onnxruntime::openvino_ep::BackendManager::BackendMa
                             nager(const                                        
                             onnxruntime::openvino_ep::GlobalContext&, const    
                             onnxruntime::Node&, const                          
                             onnxruntime::GraphViewer&, const                   
                             onnxruntime::logging::Logger&,                     
                             onnxruntime::openvino_ep::EPCtxHandler&)           
                             /onnxruntime/onnxruntime/core/providers/openvino/ov
                             _interface.cc:106                                  
                             onnxruntime::openvino_ep::OVExeNetwork             
                             onnxruntime::openvino_ep::OVCore::CompileModel(cons
                             t std::string&, std::string&, ov::AnyMap&, const   
                             std::string&) [OpenVINO-EP]  Exception while       
                             Loading Network for graph:                         
                             OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0E
                             xception from src/inference/src/cpp/core.cpp:142:  
                             invalid external data:                             
                             ExternalDataInfo(data_full_path:                   
                             c1f385d0-4888-11ef-a23b-00155d655292, offset:      
                             1026113536, data_length: 2621440)                   

But there is no issue with the ViT-B-32__openai model.

Saschl commented 1 week ago

https://github.com/immich-app/immich/pull/13290 fixes this issue for now 👍 Next release will work again.

mertalev commented 1 week ago

It's just a temporary fix for now since we can't just not update onnxruntime-openvino. I hope I can reproduce this when I test OpenVINO locally, or it's going to be a pain to know if/when it's safe to update it 😅
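
For anyone tracking when it's safe to update, a quick way to see which runtime a given image ships is to run a small check inside the ML container (a sketch; the venv path matches the tracebacks above):

# Sketch: print the onnxruntime build and available execution providers, e.g.
#   docker exec immich_machine_learning /opt/venv/bin/python check.py
import onnxruntime as ort

print("onnxruntime:", ort.__version__)
print("providers:", ort.get_available_providers())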

Saschl commented 1 week ago

I'm happy to assist in testing if needed :)

emptinessboy commented 1 week ago

I met the same problem.

imohitkr commented 1 week ago

> It's just a temporary fix for now since we can't just not update onnxruntime-openvino. I hope I can reproduce this when I test OpenVINO locally, or it's going to be a pain to know if/when it's safe to update it 😅

Happy to help with testing too.