Closed: imohitkr closed this issue 1 week ago.
Are you certain that's the image you've been using? Because it's from a PR that doesn't have any ML changes in it.
Yes, one of the releases failed to generate the machine learning open vino image tag. As a result, I had to use one of the pr tagged versions instead of waiting for a re-release. Since there were no changes between the last release and this PR tagged version, I decided to go with it.
Can you try with MACHINE_LEARNING_WORKERS=3 commented out?
Same issue when I comment it out.
I am seeing the same issue. Cleared the model cache just to be sure, but no change. The normal (CPU) image runs just fine 👍 Funny enough I have the same CPU as the OP (Intel(R) Pentium(R) Gold 8505, Ugreen NAS).
With the tag v1.116.2-openvino it also works again.
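For anyone who wants to stay on that known-good tag, pinning it explicitly in docker-compose is the simplest route; a minimal sketch, assuming the default Immich compose service name (only the image line matters here, the rest of the service definition stays as in your existing file):

```yaml
services:
  immich-machine-learning:
    # Pin to the last known-good OpenVINO build instead of "release"
    image: ghcr.io/immich-app/immich-machine-learning:v1.116.2-openvino
```

After editing, `docker compose pull immich-machine-learning && docker compose up -d` picks up the pinned tag.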
[10/06/24 23:10:02] INFO Setting execution providers to ['OpenVINOExecutionProvider', 'CPUExecutionProvider'], in descending order of preference
2024-10-06 23:10:02.909341676 [E:onnxruntime:, inference_session.cc:2105 operator()] Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/backend_manager.cc:126 onnxruntime::openvino_ep::BackendManager::BackendManager(const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::Node&, const onnxruntime::GraphViewer&, const onnxruntime::logging::Logger&, onnxruntime::openvino_ep::EPCtxHandler&) /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:106 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(const std::string&, std::string&, ov::AnyMap&, const std::string&) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_2_0Exception from src/inference/src/cpp/core.cpp:142:
invalid external data: ExternalDataInfo(data_full_path: 51cc7752-47e9-11ef-91e8-00155d655292, offset: 9564160, data_length: 3737600)
[10/06/24 23:10:02] ERROR Exception in ASGI application
...same stack trace as above
RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/backend_manager.cc:126 onnxruntime::openvino_ep::BackendManager::BackendManager(const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::Node&, const onnxruntime::GraphViewer&, const onnxruntime::logging::Logger&, onnxruntime::openvino_ep::EPCtxHandler&) /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:106 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(const std::string&, std::string&, ov::AnyMap&, const std::string&) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_2_0
Exception from src/inference/src/cpp/core.cpp:142:
invalid external data: ExternalDataInfo(data_full_path: 51cc7752-47e9-11ef-91e8-00155d655292, offset: 9564160, data_length: 3737600)
I narrowed it down to this PR: https://github.com/immich-app/immich/pull/12883 ; the image built with its tag is where the problem started.
I have the same Ugreen NAS.
I'm not sure if this is a general issue with OpenVINO's handling of external data, or if there's something particular to this environment. I can do some testing and make an upstream issue about it.
For now, your options are to either continue using the 1.116.2 image or to switch to a model that doesn't use external data (ViT-SO400M-14-SigLIP-384__webli is the best option for this, with quality very similar to ViT-H-14-378-quickgelu__dfn5b).
I've decided to stick with the v1.116.2 (onnxruntime-openvino 1.18.0) image for now, because it took my library 4 days to finish the smart search job with 'ViT-H-14-378-quickgelu__dfn5b' and I really don't want to have to run it again. I will keep an eye on this issue for a resolution.
Thank you for taking a quick look.
With the ViT-SO400M-14-SigLIP-384__webli model it works with the latest machine-learning image tag (v1.117.0 with openvino 1.19).
I’m having the same issue. I’ll wait a bit before downgrading if it is not an easy fix. @imohitkr could you post the image you are using?
I am using ghcr.io/immich-app/immich-machine-learning:v1.116.2-openvino with the latest immich server images.
Do you also have the same CPU model?
No, but perhaps the same iGPU (770?). I have an i5-14500.
Having the same issue when I use Search, with ghcr.io/immich-app/immich-machine-learning:v1.117.0-openvino and the multilingual CLIP XLM-Roberta-Large-Vit-B-16Plus model. My device is an Intel(R) Celeron(R) N5095 (QNAP TS-264C).
[10/13/24 22:09:11] INFO Loading textual model 'XLM-Roberta-Large-Vit-B-16Plus' to memory
[10/13/24 22:09:11] INFO Setting execution providers to ['OpenVINOExecutionProvider', 'CPUExecutionProvider'], in descending order of preference
2024-10-13 22:09:11.417140061 [E:onnxruntime:, inference_session.cc:2105 operator()] Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/backend_manager.cc:126 onnxruntime::openvino_ep::BackendManager::BackendManager(const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::Node&, const onnxruntime::GraphViewer&, const onnxruntime::logging::Logger&, onnxruntime::openvino_ep::EPCtxHandler&) /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:106 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(const std::string&, std::string&, ov::AnyMap&, const std::string&) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0Exception from src/inference/src/cpp/core.cpp:142:
invalid external data: ExternalDataInfo(data_full_path: c1f385d0-4888-11ef-a23b-00155d655292, offset: 1026113536, data_length: 2621440)
[10/13/24 22:09:11] ERROR Exception in ASGI application
Traceback (most recent call last):
  /usr/src/app/main.py:152 in predict
  /usr/src/app/main.py:175 in run_inference
  /usr/src/app/main.py:169 in _run_inference
  /usr/src/app/main.py:213 in load
  /usr/src/app/main.py:188 in run
  /usr/local/lib/python3.11/concurrent/futures/thread.py:58 in run
  /usr/src/app/main.py:200 in _load
  /usr/src/app/models/base.py:53 in load
  /usr/src/app/models/clip/textual.py:26 in _load
  /usr/src/app/models/base.py:78 in _load
  /usr/src/app/models/base.py:110 in _make_session
  /usr/src/app/sessions/ort.py:28 in __init__
  /opt/venv/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:419 in __init__
  /opt/venv/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:491 in _create_inference_session
RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/backend_manager.cc:126 onnxruntime::openvino_ep::BackendManager::BackendManager(const onnxruntime::openvino_ep::GlobalContext&, const onnxruntime::Node&, const onnxruntime::GraphViewer&, const onnxruntime::logging::Logger&, onnxruntime::openvino_ep::EPCtxHandler&) /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:106 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(const std::string&, std::string&, ov::AnyMap&, const std::string&) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0
Exception from src/inference/src/cpp/core.cpp:142:
invalid external data: ExternalDataInfo(data_full_path: c1f385d0-4888-11ef-a23b-00155d655292, offset: 1026113536, data_length: 2621440)
But there is no issue with the ViT-B-32__openai model.
https://github.com/immich-app/immich/pull/13290 fixes this issue for now 👍 The next release will work again.
It's just a temporary fix for now since we can't just not update onnxruntime-openvino. I hope I can reproduce this when I test OpenVINO locally, or it's going to be a pain to know if/when it's safe to update it 😅
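The temporary fix described above amounts to holding the runtime back; for anyone building the image themselves, such a hold is a one-line constraint in the ML service's requirements. A sketch only — the exact version is an assumption (whatever release last worked):

```
# Hypothetical requirements pin: hold onnxruntime-openvino at the last
# release that loads external-data models correctly (version assumed).
onnxruntime-openvino==1.18.0
```

Once the upstream regression is understood, the pin can be dropped again.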
I'm happy to assist in testing if needed :)
I met the same problem.
Happy to help with testing too.
The bug
Hi,
Thank you for the update, but it seems that the smart search job is broken.
I am running the OpenVINO image, and everything works fine with the image built with the tag "ghcr.io/immich-app/immich-machine-learning:pr-11453-openvino". However, with the latest image tagged "ghcr.io/immich-app/immich-machine-learning:v1.117.0-openvino", I am getting the errors listed below. When I pair the latest version with the old machine learning server image, everything works fine.
Interestingly, the text-based search (where it loads the textual model) works with the latest machine-learning server image.
Please help.
The OS that Immich Server is running on
Debian GNU/Linux 12 (bookworm)
Version of Immich Server
v1.117.0
Version of Immich Mobile App
v1.117.0
Platform with the issue
Your docker-compose.yml content
Your .env content
Reproduction steps
Relevant log output
Additional information
lscpu output:
Architecture:            x86_64
CPU op-mode(s):          32-bit, 64-bit
Address sizes:           39 bits physical, 48 bits virtual
Byte Order:              Little Endian
CPU(s):                  6
On-line CPU(s) list:     0-5
Vendor ID:               GenuineIntel
Model name:              Intel(R) Pentium(R) Gold 8505
CPU family:              6
Model:                   154
Thread(s) per core:      2
Core(s) per socket:      5
Socket(s):               1
Stepping:                4
CPU(s) scaling MHz:      75%
CPU max MHz:             4400.0000
CPU min MHz:             400.0000
BogoMIPS:                4992.00
Flags:                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize arch_lbr ibt flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   176 KiB (5 instances)
  L1i:                   288 KiB (5 instances)
  L2:                    3.3 MiB (2 instances)
  L3:                    8 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-5
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected