Open xemxx opened 6 days ago
Is this running in a VM? Can you give some more detail about that?
I have the same problem. PVE with SR-IOV.
[10/25/24 10:18:12] INFO Starting gunicorn 23.0.0
[10/25/24 10:18:12] INFO Listening at: http://[::]:3003 (9)
[10/25/24 10:18:12] INFO Using worker: app.config.CustomUvicornWorker
[10/25/24 10:18:12] INFO Booting worker with pid: 10
[10/25/24 10:18:16] INFO Started server process [10]
[10/25/24 10:18:16] INFO Waiting for application startup.
[10/25/24 10:18:16] INFO Created in-memory cache with unloading after 300s of inactivity.
[10/25/24 10:18:16] INFO Initialized request thread pool with 4 threads.
[10/25/24 10:18:16] INFO Application startup complete.
[10/25/24 10:18:46] INFO Loading detection model 'antelopev2' to memory
[10/25/24 10:18:46] INFO Setting execution providers to ['OpenVINOExecutionProvider', 'CPUExecutionProvider'], in descending order of preference
2024-10-25 10:19:04.571404921 [E:onnxruntime:, inference_session.cc:2045 operator()] Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:85 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(std::shared_ptr<const ov::Model>&, std::__cxx11::string&, ov::AnyMap&, std::__cxx11::string) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0Exception from src/inference/src/cpp/core.cpp:109:
Exception from src/inference/src/dev/plugin.cpp:54:
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_stream.cpp:433:
[GPU] clWaitForEvents, error code: -14
[10/25/24 10:19:04] ERROR Exception in ASGI application
╭─────── Traceback (most recent call last) ───────╮
│ /usr/src/app/main.py:150 in predict │
│ │
│ 147 │ │ inputs = text │
│ 148 │ else: │
│ 149 │ │ raise HTTPException(400, "Either │
│ ❱ 150 │ response = await run_inference(inputs │
│ 151 │ return ORJSONResponse(response) │
│ 152 │
│ 153 │
│ │
│ /usr/src/app/main.py:173 in run_inference │
│ │
│ 170 │ │ response[entry["task"]] = output │
│ 171 │ │
│ 172 │ without_deps, with_deps = entries │
│ ❱ 173 │ await asyncio.gather(*[_run_inference │
│ 174 │ if with_deps: │
│ 175 │ │ await asyncio.gather(*[_run_infer │
│ 176 │ if isinstance(payload, Image): │
│ │
│ /usr/src/app/main.py:167 in _run_inference │
│ │
│ 164 │ │ │ except KeyError: │
│ 165 │ │ │ │ message = f"Task {entry[' │
│ output of {dep}" │
│ 166 │ │ │ │ raise HTTPException(400, │
│ ❱ 167 │ │ model = await load(model) │
│ 168 │ │ output = await run(model.predict, │
│ 169 │ │ outputs[model.identity] = output │
│ 170 │ │ response[entry["task"]] = output │
│ │
│ /usr/src/app/main.py:211 in load │
│ │
│ 208 │ │ return model │
│ 209 │ │
│ 210 │ try: │
│ ❱ 211 │ │ return await run(_load, model) │
│ 212 │ except (OSError, InvalidProtobuf, Bad │
│ 213 │ │ log.warning(f"Failed to load {mod │
│ '{model.model_name}'. Clearing cache.") │
│ 214 │ │ model.clear_cache() │
│ │
│ /usr/src/app/main.py:186 in run │
│ │
│ 183 │ if thread_pool is None: │
│ 184 │ │ return func(*args, **kwargs) │
│ 185 │ partial_func = partial(func, *args, * │
│ ❱ 186 │ return await asyncio.get_running_loop │
│ 187 │
│ 188 │
│ 189 async def load(model: InferenceModel) -> │
│ │
│ /usr/local/lib/python3.11/concurrent/futures/th │
│ read.py:58 in run │
│ │
│ /usr/src/app/main.py:198 in _load │
│ │
│ 195 │ │ │ raise HTTPException(500, f"Fa │
│ 196 │ │ with lock: │
│ 197 │ │ │ try: │
│ ❱ 198 │ │ │ │ model.load() │
│ 199 │ │ │ except FileNotFoundError as e │
│ 200 │ │ │ │ if model.model_format == │
│ 201 │ │ │ │ │ raise e │
│ │
│ /usr/src/app/models/base.py:53 in load │
│ │
│ 50 │ │ self.download() │
│ 51 │ │ attempt = f"Attempt #{self.load_a │
│ else "Loading" │
│ 52 │ │ log.info(f"{attempt} {self.model_ │
│ '{self.model_name}' to memory") │
│ ❱ 53 │ │ self.session = self._load() │
│ 54 │ │ self.loaded = True │
│ 55 │ │
│ 56 │ def predict(self, *inputs: Any, **mod │
│ │
│ /usr/src/app/models/facial_recognition/detectio │
│ n.py:21 in _load │
│ │
│ 18 │ │ super().__init__(model_name, **mod │
│ 19 │ │
│ 20 │ def _load(self) -> ModelSession: │
│ ❱ 21 │ │ session = self._make_session(self. │
│ 22 │ │ self.model = RetinaFace(session=se │
│ 23 │ │ self.model.prepare(ctx_id=0, det_t │
│ 24 │
│ │
│ /usr/src/app/models/base.py:110 in │
│ _make_session │
│ │
│ 107 │ │ │ case ".armnn": │
│ 108 │ │ │ │ session: ModelSession = A │
│ 109 │ │ │ case ".onnx": │
│ ❱ 110 │ │ │ │ session = OrtSession(mode │
│ 111 │ │ │ case _: │
│ 112 │ │ │ │ raise ValueError(f"Unsupp │
│ 113 │ │ return session │
│ │
│ /usr/src/app/sessions/ort.py:28 in __init__ │
│ │
│ 25 │ │ self.providers = providers if pro │
│ 26 │ │ self.provider_options = provider_ │
│ self._provider_options_default │
│ 27 │ │ self.sess_options = sess_options │
│ self._sess_options_default │
│ ❱ 28 │ │ self.session = ort.InferenceSessi │
│ 29 │ │ │ self.model_path.as_posix(), │
│ 30 │ │ │ providers=self.providers, │
│ 31 │ │ │ provider_options=self.provide │
│ │
│ /opt/venv/lib/python3.11/site-packages/onnxrunt │
│ ime/capi/onnxruntime_inference_collection.py:41 │
│ 9 in __init__ │
│ │
│ 416 │ │ disabled_optimizers = kwargs.get │
│ 417 │ │ │
│ 418 │ │ try: │
│ ❱ 419 │ │ │ self._create_inference_sessi │
│ disabled_optimizers) │
│ 420 │ │ except (ValueError, RuntimeError │
│ 421 │ │ │ if self._enable_fallback: │
│ 422 │ │ │ │ try: │
│ │
│ /opt/venv/lib/python3.11/site-packages/onnxrunt │
│ ime/capi/onnxruntime_inference_collection.py:48 │
│ 3 in _create_inference_session │
│ │
│ 480 │ │ │ disabled_optimizers = set(di │
│ 481 │ │ │
│ 482 │ │ # initialize the C++ InferenceSe │
│ ❱ 483 │ │ sess.initialize_session(provider │
│ 484 │ │ │
│ 485 │ │ self._sess = sess │
│ 486 │ │ self._sess_options = self._sess. │
╰─────────────────────────────────────────────────╯
RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:85 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::CompileModel(std::shared_ptr<const ov::Model>&, std::__cxx11::string&, ov::AnyMap&, std::__cxx11::string) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0
Exception from src/inference/src/cpp/core.cpp:109:
Exception from src/inference/src/dev/plugin.cpp:54:
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_stream.cpp:433:
[GPU] clWaitForEvents, error code: -14
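For reference, OpenCL error -14 is CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST, and it is raised while the OpenVINO EP compiles the detection subgraph for the intel_gpu plugin, which suggests the OpenCL compute runtime inside the container cannot drive the SR-IOV virtual function, rather than a problem with the model file itself. A quick way to separate the two is to probe the runtimes directly inside the immich_machine_learning container; a minimal sketch, assuming the OpenVINO Python bindings bundled with the -openvino image are importable:

```python
# Minimal probe, run inside the machine-learning container
# (e.g. `docker exec -it immich_machine_learning python3`).
import onnxruntime as ort

# The OpenVINO EP should be listed if this build supports it.
print("ORT providers:", ort.get_available_providers())

try:
    from openvino import Core          # OpenVINO >= 2023.1
except ImportError:
    from openvino.runtime import Core  # older releases

core = Core()
print("OpenVINO devices:", core.available_devices)  # expect e.g. ['CPU', 'GPU']
for dev in core.available_devices:
    print(dev, "->", core.get_property(dev, "FULL_DEVICE_NAME"))
```

If 'GPU' never shows up in available_devices, the container (or the virtual function exposed to the VM) is not usable by the compute runtime at all; if it does show up but compilation still dies with -14, the SR-IOV guest driver is the more likely culprit.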
Is this running in a VM? Can you give some more detail about that?
Yes, my physical machine runs PVE and is configured with SR-IOV to share the UHD 730 iGPU. I then run a VM with Docker installed to run Immich. This VM also runs Jellyfin, which works fine with hardware acceleration!
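Since Jellyfin transcodes fine in the same VM, the virtual function clearly reaches the guest; what is still worth confirming is that the machine-learning container itself receives the render node and can open it. A small check you could run inside that container (a sketch only; device paths and permissions may differ on your setup):

```python
# List the DRI nodes visible inside the container and check whether the
# current user can open them read/write. A missing /dev/dri, or a render
# node the container user cannot access, are common causes of GPU init
# failures even when the host/VM side looks healthy.
import os
import stat

dri = "/dev/dri"
if not os.path.isdir(dri):
    print("no /dev/dri inside the container - device not passed through")
else:
    for node in sorted(os.listdir(dri)):
        path = os.path.join(dri, node)
        st = os.stat(path)
        ok = os.access(path, os.R_OK | os.W_OK)
        print(f"{path}: mode={oct(stat.S_IMODE(st.st_mode))} "
              f"gid={st.st_gid} {'rw ok' if ok else 'NO ACCESS'}")
```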
cc @mertalev
Hardware transcoding is fine, but hardware-accelerated machine learning with SR-IOV does not work.
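That split matches the difference in code paths: transcoding goes through the media driver (VA-API/QSV), while OpenVINO's GPU plugin goes through the OpenCL compute runtime, so one working on an SR-IOV virtual function does not guarantee the other. The compile step that fails in the log can also be reproduced outside Immich with a throwaway one-op model; a hedged sketch, assuming the onnx and onnxruntime-openvino packages from the image and that "GPU" is an accepted device_type for this EP build:

```python
# Build a trivial Add model and ask the OpenVINO EP to compile it for the GPU,
# the same step that throws "clWaitForEvents, error code: -14" in the log.
import numpy as np
from onnx import TensorProto, helper
import onnxruntime as ort

x = helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 4])
y = helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 4])
graph = helper.make_graph([helper.make_node("Add", ["x", "x"], ["y"])],
                          "gpu_compile_test", [x], [y])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
model.ir_version = 8  # conservative IR version so older ORT builds can load it

sess = ort.InferenceSession(
    model.SerializeToString(),
    providers=["OpenVINOExecutionProvider"],
    provider_options=[{"device_type": "GPU"}],
)
print(sess.run(None, {"x": np.ones((1, 4), dtype=np.float32)}))
```

If this tiny model fails with the same OpenCL error, the problem is in the GPU/OpenCL stack under SR-IOV rather than in the antelopev2 model or Immich itself.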
The bug
I'm trying to enable machine learning for smart search, but the machine-learning container is reporting an error and doesn't work.
The OS that Immich Server is running on
DSM with SR-IOV on PVE 8.2, i3-12300T, 16 GB RAM
Version of Immich Server
v1.118.2
Version of Immich Mobile App
v1.118.2
Platform with the issue
Your docker-compose.yml content
Your .env content
Reproduction steps
Relevant log output
Additional information
My PVE host has 16 GB of RAM and allocates 8 GB to the DSM VM.
I should have enough memory at runtime; here is my Grafana graph.
Before the error in the log output, my GPU was under load.