immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

OpenVino: facial detection is broken with 1.99.0 #8226

Closed LeoAdL closed 4 months ago

LeoAdL commented 8 months ago

The bug

When I try launching the face detection, whatever model I use, I get the following error:

immich_machine_learning  | [03/23/24 22:46:05] INFO     Setting 'buffalo_l' execution providers to         
immich_machine_learning  |                              ['OpenVINOExecutionProvider',                      
immich_machine_learning  |                              'CPUExecutionProvider'], in descending order of    
immich_machine_learning  |                              preference                                         
immich_machine_learning  | [03/23/24 22:46:05] INFO     Loading facial recognition model 'buffalo_l' to    
immich_machine_learning  |                              memory                                             
immich_machine_learning  | [03/23/24 22:46:08] ERROR    Exception in ASGI application                      
immich_machine_learning  |                                                                                 
immich_machine_learning  |                              ╭─────── Traceback (most recent call last) ───────╮
immich_machine_learning  |                              │ /usr/src/app/main.py:118 in predict             │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │   115 │                                         │
immich_machine_learning  |                              │   116 │   model = await load(await model_cache. │
immich_machine_learning  |                              │       ttl=settings.model_ttl, **kwargs))        │
immich_machine_learning  |                              │   117 │   model.configure(**kwargs)             │
immich_machine_learning  |                              │ ❱ 118 │   outputs = await run(model.predict, in │
immich_machine_learning  |                              │   119 │   return ORJSONResponse(outputs)        │
immich_machine_learning  |                              │   120                                           │
immich_machine_learning  |                              │   121                                           │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │ /usr/src/app/main.py:125 in run                 │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │   122 async def run(func: Callable[..., Any], i │
immich_machine_learning  |                              │   123 │   if thread_pool is None:               │
immich_machine_learning  |                              │   124 │   │   return func(inputs)               │
immich_machine_learning  |                              │ ❱ 125 │   return await asyncio.get_running_loop │
immich_machine_learning  |                              │   126                                           │
immich_machine_learning  |                              │   127                                           │
immich_machine_learning  |                              │   128 async def load(model: InferenceModel) ->  │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │ /usr/lib/python3.10/concurrent/futures/thread.p │
immich_machine_learning  |                              │ y:58 in run                                     │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │ /usr/src/app/models/base.py:59 in predict       │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │    56 │   │   self.load()                       │
immich_machine_learning  |                              │    57 │   │   if model_kwargs:                  │
immich_machine_learning  |                              │    58 │   │   │   self.configure(**model_kwargs │
immich_machine_learning  |                              │ ❱  59 │   │   return self._predict(inputs)      │
immich_machine_learning  |                              │    60 │                                         │
immich_machine_learning  |                              │    61 │   @abstractmethod                       │
immich_machine_learning  |                              │    62 │   def _predict(self, inputs: Any) -> An │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │ /usr/src/app/models/facial_recognition.py:49 in │
immich_machine_learning  |                              │ _predict                                        │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │   46 │   │   else:                              │
immich_machine_learning  |                              │   47 │   │   │   decoded_image = image          │
immich_machine_learning  |                              │   48 │   │   assert is_ndarray(decoded_image, n │
immich_machine_learning  |                              │ ❱ 49 │   │   bboxes, kpss = self.det_model.dete │
immich_machine_learning  |                              │   50 │   │   if bboxes.size == 0:               │
immich_machine_learning  |                              │   51 │   │   │   return []                      │
immich_machine_learning  |                              │   52 │   │   assert is_ndarray(kpss, np.float32 │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │ /opt/venv/lib/python3.10/site-packages/insightf │
immich_machine_learning  |                              │ ace/model_zoo/retinaface.py:224 in detect       │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │   221 │   │   det_img = np.zeros( (input_size[1 │
immich_machine_learning  |                              │   222 │   │   det_img[:new_height, :new_width,  │
immich_machine_learning  |                              │   223 │   │                                     │
immich_machine_learning  |                              │ ❱ 224 │   │   scores_list, bboxes_list, kpss_li │
immich_machine_learning  |                              │   225 │   │                                     │
immich_machine_learning  |                              │   226 │   │   scores = np.vstack(scores_list)   │
immich_machine_learning  |                              │   227 │   │   scores_ravel = scores.ravel()     │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │ /opt/venv/lib/python3.10/site-packages/insightf │
immich_machine_learning  |                              │ ace/model_zoo/retinaface.py:152 in forward      │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │   149 │   │   kpss_list = []                    │
immich_machine_learning  |                              │   150 │   │   input_size = tuple(img.shape[0:2] │
immich_machine_learning  |                              │   151 │   │   blob = cv2.dnn.blobFromImage(img, │
immich_machine_learning  |                              │       (self.input_mean, self.input_mean, self.i │
immich_machine_learning  |                              │ ❱ 152 │   │   net_outs = self.session.run(self. │
immich_machine_learning  |                              │   153 │   │                                     │
immich_machine_learning  |                              │   154 │   │   input_height = blob.shape[2]      │
immich_machine_learning  |                              │   155 │   │   input_width = blob.shape[3]       │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
immich_machine_learning  |                              │ ime/capi/onnxruntime_inference_collection.py:22 │
immich_machine_learning  |                              │ 0 in run                                        │
immich_machine_learning  |                              │                                                 │
immich_machine_learning  |                              │    217 │   │   if not output_names:             │
immich_machine_learning  |                              │    218 │   │   │   output_names = [output.name  │
immich_machine_learning  |                              │    219 │   │   try:                             │
immich_machine_learning  |                              │ ❱  220 │   │   │   return self._sess.run(output │
immich_machine_learning  |                              │    221 │   │   except C.EPFail as err:          │
immich_machine_learning  |                              │    222 │   │   │   if self._enable_fallback:    │
immich_machine_learning  |                              │    223 │   │   │   │   print(f"EP Error: {err!s │
immich_machine_learning  |                              ╰─────────────────────────────────────────────────╯
immich_machine_learning  |                              RuntimeException: [ONNXRuntimeError] : 6 :         
immich_machine_learning  |                              RUNTIME_EXCEPTION : Encountered unknown exception  
immich_machine_learning  |                              in Run()                                           
immich_microservices     | [Nest] 7  - 03/23/2024, 10:46:08 PM   ERROR [JobService] Unable to run job handler (faceDetection/face-detection): Error: Machine learning request for facial recognition failed with status 500: Internal Server Error
immich_microservices     | [Nest] 7  - 03/23/2024, 10:46:08 PM   ERROR [JobService] Error: Machine learning request for facial recognition failed with status 500: Internal Server Error
immich_microservices     |     at MachineLearningRepository.predict (/usr/src/app/dist/infra/repositories/machine-learning.repository.js:23:19)
immich_microservices     |     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
immich_microservices     |     at async PersonService.handleDetectFaces (/usr/src/app/dist/domain/person/person.service.js:248:23)
immich_microservices     |     at async /usr/src/app/dist/domain/job/job.service.js:137:36
immich_microservices     |     at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:394:28)
immich_microservices     |     at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24)
immich_microservices     | [Nest] 7  - 03/23/2024, 10:46:08 PM   ERROR [JobService] Object:
immich_microservices     | {
immich_microservices     |   "id": "9e1d4bbf-84c2-40dd-9aec-c913e5a1a662"
immich_microservices     | }
immich_microservices     | 

Regular Smart Search proceeds without issue.

The OS that Immich Server is running on

Proxmox 8.1 (6.5 Linux Kernel)

Version of Immich Server

1.99.0

Version of Immich Mobile App

1.99.0

Platform with the issue

Your docker-compose.yml content

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "immich" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /mount/op11:/usr/src/app/external
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends: 
     file: hwaccel.transcoding.yml 
     service: quicksync # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /mount/op11:/usr/src/app/external
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:main-openvino
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: openvino # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:51d6c56749a4243096327e3fb964a48ed92254357108449cb6e23999c37773c5
    restart: always

  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

volumes:
  pgdata:
  model-cache:

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=./library
# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release
# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=(redacted)

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

REDIS_HOSTNAME=immich_redis

Reproduction steps

1. docker compose up
2. --> Jobs 
3. --> Click Face Detection: All
...

Additional information

My processor is an Intel N100. Prior to 1.99.0, face detection was working, but I had issues with smart search, so I guess it's hard to get all of it working with OpenVINO, haha.

Thank you for the gigantic work up to now! Leo

shummo commented 6 months ago

Hi! I run Immich on Proxmox > Debian 12 VM > Docker. I use the ghcr.io/snuupy/immich-machine-learning:v1.105.1-openvino image, which is working!!! BUT my system crashes randomly, and I can't find the real problem.

Proxmox reports an internal ERROR, so I have to reset the VM, but afterwards there is no log from before the reboot.

If I change back to the official image, the random crashes are gone. So the problem must be with this image (I think so), but I don't know what the problem could be.

memesalot commented 6 months ago

Same issue here.

Fresh install of 24.04 LTS


                             ['OpenVINOExecutionProvider',                      
                             'CPUExecutionProvider'], in descending order of    
                             preference                                         
[05/27/24 23:00:25] INFO     Loading facial recognition model 'buffalo_l' to    
                             memory                                             
[05/27/24 23:00:42] ERROR    Exception in ASGI application                      

                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /usr/src/app/main.py:118 in predict             │
                             │                                                 │
                             │   115 │                                         │
                             │   116 │   model = await load(await model_cache. │
                             │       ttl=settings.model_ttl, **kwargs))        │
                             │   117 │   model.configure(**kwargs)             │
                             │ ❱ 118 │   outputs = await run(model.predict, in │
                             │   119 │   return ORJSONResponse(outputs)        │
                             │   120                                           │
                             │   121                                           │
                             │                                                 │
                             │ /usr/src/app/main.py:125 in run                 │
                             │                                                 │
                             │   122 async def run(func: Callable[..., Any], i │
                             │   123 │   if thread_pool is None:               │
                             │   124 │   │   return func(inputs)               │
                             │ ❱ 125 │   return await asyncio.get_running_loop │
                             │   126                                           │
                             │   127                                           │
                             │   128 async def load(model: InferenceModel) ->  │
                             │                                                 │
                             │ /usr/lib/python3.10/concurrent/futures/thread.p │
                             │ y:58 in run                                     │
                             │                                                 │
                             │ /usr/src/app/models/base.py:59 in predict       │
                             │                                                 │
                             │    56 │   │   self.load()                       │
                             │    57 │   │   if model_kwargs:                  │
                             │    58 │   │   │   self.configure(**model_kwargs │
                             │ ❱  59 │   │   return self._predict(inputs)      │
                             │    60 │                                         │
                             │    61 │   @abstractmethod                       │
                             │    62 │   def _predict(self, inputs: Any) -> An │
                             │                                                 │
                             │ /usr/src/app/models/facial_recognition.py:49 in │
                             │ _predict                                        │
                             │                                                 │
                             │   46 │   │   else:                              │
                             │   47 │   │   │   decoded_image = image          │
                             │   48 │   │   assert is_ndarray(decoded_image, n │
mertalev commented 6 months ago

I took another look at this.

The face detection model is what's causing the error. It has unknown shapes in its operations, so it may be that there's a mistake during compilation that's only realized during inference. By contrast, the facial recognition and smart search models have fully known shapes for all operations (excluding the dynamic axis).

Another thing I noticed is that the smart search models (at least the default one) are broken up into a bunch of subgraphs. This adds a ton of overhead to inference since it has to repeatedly transfer data to the GPU. I imagine this is what's causing it to be slower than expected.

```
immich_machine_learning  | [05/29/24 01:43:38] DEBUG    Loading clip text model 'ViT-B-32__openai'
immich_machine_learning  | In the OpenVINO EP
immich_machine_learning  | CreateNgraphFunc                      (repeated 25 times)
immich_machine_learning  | [05/29/24 01:43:41] DEBUG    Loaded clip text model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:41] DEBUG    Loading clip vision model 'ViT-B-32__openai'
immich_machine_learning  | In the OpenVINO EP
immich_machine_learning  | CreateNgraphFunc                      (repeated 25 times)
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loaded clip vision model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loading tokenizer for CLIP model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loading model config for CLIP model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loaded model config for CLIP model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loading tokenizer config for CLIP model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loaded tokenizer config for CLIP model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loaded tokenizer for CLIP model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loading visual preprocessing config for CLIP model 'ViT-B-32__openai'
immich_machine_learning  | [05/29/24 01:43:45] DEBUG    Loaded visual preprocessing config for CLIP model 'ViT-B-32__openai'
immich_machine_learning  | Inference successful                  (repeated many times)
immich_machine_learning  | [05/29/24 01:43:46] DEBUG    Checking for inactivity...
immich_machine_learning  | Inference successful                  (repeated many times)
```

The face detection and recognition models only have one log each for CreateNgraphFunc and Inference successful, meaning the entire inference is done in a single graph. They're considerably faster and show higher GPU usage than the smart search model, despite normally being the slower models.

I don't know what operation in the CLIP model it struggles with yet, but addressing that should give a big performance boost.
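For anyone who wants to poke at this themselves, here is a minimal sketch for listing tensors whose shapes are still unknown after ONNX shape inference (the situation described above for the face detection model). It assumes the `onnx` package is installed and a locally cached model; the path below is a placeholder for wherever the model cache volume is mounted.

```python
import onnx
from onnx import shape_inference

# Placeholder path: point this at a model from the immich model-cache volume.
model = onnx.load("/cache/facial-recognition/buffalo_l/detection/model.onnx")

# Run shape inference, then report every intermediate/output tensor that still
# has a symbolic or missing dimension (these are what OpenVINO has to guess at).
inferred = shape_inference.infer_shapes(model)
for vi in list(inferred.graph.value_info) + list(inferred.graph.output):
    dims = vi.type.tensor_type.shape.dim
    if len(dims) == 0 or any(not d.HasField("dim_value") for d in dims):
        print("dynamic/unknown shape:", vi.name)
```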

krishne35 commented 5 months ago

@Snuupy can you push a new version of the image? 1.105.1 is not compatible with the new 1.106.3; it's throwing an API error.

Snuupy commented 5 months ago

@Snuupy can you push a new version of the image? 1.105.1 is not compatible with the new 1.106.3; it's throwing an API error.

@krishne35

I've been waiting for the hotfixes to land before doing a manual build; I'll give it some breathing room first.

if you want to do a build right now you can follow https://github.com/immich-app/immich/issues/8226#issuecomment-2111024809

krishne35 commented 5 months ago

@Snuupy can you push a new version of the image? 1.105.1 is not compatible with the new 1.106.3; it's throwing an API error.

@krishne35

I've been waiting for the hotfixes to land before doing a manual build; I'll give it some breathing room first.

if you want to do a build right now you can follow #8226 (comment)

Thank you, will keep an eye out. Will give it a try.

tanyewei commented 5 months ago

Want to use my image?

In your docker-compose.yml, replace

image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino with image: ghcr.io/snuupy/immich-machine-learning:v1.105.0-openvino

https://github.com/Snuupy/immich/pkgs/container/immich-machine-learning

Want to build your own image? Here's how I did it:

install poetry, change onnxruntime-openvino to target 1.15 or 1.16 in the pyproject.toml file, then run poetry lock --no-update

[tool.poetry.group.openvino.dependencies]
# onnxruntime-openvino = "^1.17.1"
onnxruntime-openvino = ">=1.15.0,<1.16.0"

In the machine-learning/Dockerfile, change the line:

FROM openvino/ubuntu22_runtime:2023.3.0@sha256:176646df619032ea6c10faf842867119c393e7497b7f88b5e307e932a0fd5aa8 as builder-openvino (v1.105.0)

to FROM openvino/ubuntu22_runtime:2023.1.0@sha256:002842a9005ba01543b7169ff6f14ecbec82287f09c4d1dd37717f0a8e8754a7 as builder-openvino (v1.98.2)

then do a docker build --build-arg="DEVICE=openvino" -t NAMESPACE/immich-machine-learning:v1.105.0-openvino .

My processor is an N100. Following your method, it is now working properly. Thank you for sharing.

krishne35 commented 5 months ago

Want to use my image?

In your docker-compose.yml, replace

image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino with image: ghcr.io/snuupy/immich-machine-learning:v1.105.0-openvino

https://github.com/Snuupy/immich/pkgs/container/immich-machine-learning

Want to build your own image? Here's how I did it:

install poetry, change onnxruntime-openvino to target 1.15 or 1.16 in the pyproject.toml file, then run poetry lock --no-update

[tool.poetry.group.openvino.dependencies]
# onnxruntime-openvino = "^1.17.1"
onnxruntime-openvino = ">=1.15.0,<1.16.0"

In the machine-learning/Dockerfile, change the line:

FROM openvino/ubuntu22_runtime:2023.3.0@sha256:176646df619032ea6c10faf842867119c393e7497b7f88b5e307e932a0fd5aa8 as builder-openvino (v1.105.0)

to FROM openvino/ubuntu22_runtime:2023.1.0@sha256:002842a9005ba01543b7169ff6f14ecbec82287f09c4d1dd37717f0a8e8754a7 as builder-openvino (v1.98.2)

then do a docker build --build-arg="DEVICE=openvino" -t NAMESPACE/immich-machine-learning:v1.105.0-openvino .

My processor is an N100. Following your method, it is now working properly. Thank you for sharing.

Quick question: what model are you using, and what's the RAM usage? Because when I try to use ViT-L-14-quickgelu__dfn2b it eats up my entire 16GB of RAM, but when I use the CPU it barely uses around 4GB (with other services running). @Snuupy any tips?

jsapede commented 5 months ago

Switching to 1.106.4 completely broke OpenVINO once again :'( even when using Snuupy's image. Hope there will be an update soon. [edit]: seems to work after rebuilding Snuupy's image according to the instructions.

krishne35 commented 5 months ago

Switching to 1.106.4 completely broke OpenVINO once again :'( even when using Snuupy's image. Hope there will be an update soon. [edit]: seems to work after rebuilding Snuupy's image according to the instructions.

Try my latest build based on @Snuupy's edits: ghcr.io/krishne35/immich-machine-learning:main-openvino

Snuupy commented 5 months ago

Hi all, this is untested (as I'm currently making the changes necessary for my immich install to update 😄)

ghcr.io/snuupy/immich-machine-learning:v1.106.4-openvino

I forgot to mention, there are 2 places to make the Dockerfile image change:

  1. https://github.com/immich-app/immich/blob/v1.98.2/machine-learning/Dockerfile#L5
  2. https://github.com/immich-app/immich/blob/v1.98.2/machine-learning/Dockerfile#L39

I bet that's why smart search was broken on the 1.105.0 image (I changed it for the 1.105.1 image and the latest one as well)

jsapede commented 5 months ago

seems to work

engels0n commented 5 months ago

Seems to work for me too on an i5 7500T.

Are these entries in the .env still necessary?

NEOReadDebugKeys=1
OverrideGpuAddressSpace=48

// EDIT: Does not seem to be necessary anymore! Yeah, nice @Snuupy ! Thank you very much ! :)

Snuupy commented 5 months ago

Actually, I don't think mine is working; it seems like smart search is broken. I forced Python 3.12 and the project default requires Python < 3.12, idk why

I'm installing Python-3.11.9 now to test another build

edit: https://docs.openvino.ai/archive/2023.1/system_requirements.html because this says 3.11 lol

no wonder 3.12 is broken

please wait while my pyenv installs 3.11.9 😴

mertalev commented 5 months ago

Seems to work for me too on an i5 7500T.

Are these entries in the .env still necessary?


NEOReadDebugKeys=1

OverrideGpuAddressSpace=48

// EDIT: Does not seem to be necessary anymore! Yeah, nice @Snuupy ! Thank you very much ! :)

It's still needed for kernels 6.7.5 or newer.

Snuupy commented 5 months ago

It looks like v1.106 changed something so that OpenVINO doesn't work on low-powered Intel CPUs, even if you patch the OpenVINO versions to use 2023.1. I'm not able to get smart search working after waiting several hours with one CPU core pegged at 100%.

I have reverted to the cpu only version for now.

jsapede commented 5 months ago

Seems to work for me too on an i5 7500T. Are these entries in the .env still necessary?


NEOReadDebugKeys=1

OverrideGpuAddressSpace=48

// EDIT: Does not seem to be necessary anymore! Yeah, nice @Snuupy ! Thank you very much ! :)

It's still needed for kernels 6.7.5 or newer.

My install is on Proxmox 8.2.2 with PVE kernel 6.8.4-3. At this time I haven't set these flags in my Immich docker composes, but it seems to work without problems with the Snuupy update for OpenVINO:

NEOReadDebugKeys=1
OverrideGpuAddressSpace=48

Strangely, I have to set these flags in the Frigate compose to get it to work with OpenVINO.

XiaoranQingxue commented 5 months ago

Switching to 1.106.4 completely broke OpenVINO once again :'( even when using Snuupy's image. Hope there will be an update soon. [edit]: seems to work after rebuilding Snuupy's image according to the instructions.

Try my latest build based on @Snuupy's edits: ghcr.io/krishne35/immich-machine-learning:main-openvino

I use this image and can use OpenVINO (UHD730) normally, but when searching, I get an error

2024-06-21 01:35:59.912150839 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running OpenVINO-EP-subgraph_1 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0' Status Message: /home/onnxruntimedev/onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:53 onnxruntime::openvino_ep::OVExeNetwork onnxruntime::openvino_ep::OVCore::LoadNetwork(const string&, std::string&, ov::AnyMap&, std::string) [OpenVINO-EP] Exception while Loading Network for graph: OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1_0
Check 'false' failed at src/inference/src/core.cpp:149:
invalid external data: ExternalDataInfo(data_full_path: LinearTransformation.weight, offset: 0, data_length: 0)

Snuupy commented 5 months ago

Yes, it is broken right now; will have to see if onnxruntime-openvino 1.18 fixes this.

mertalev commented 5 months ago

The 1.18.0 release of onnxruntime-openvino just came out, so it should be interesting to see if upgrading to that helps.

rui-nar commented 5 months ago

How can I help test 1.18.0 to see if it solves my problems, @mertalev?

Snuupy commented 4 months ago

Simply upgrading to onnxruntime-openvino 1.18 does not solve the issue.

I made the changes here https://github.com/Snuupy/immich/commit/69b2aae791e8cfe20287f6fe4b687de4235702fe and built an image here https://github.com/Snuupy/immich/pkgs/container/immich-machine-learning/240363346?tag=v1.107.2-openvino

Errors out unfortunately:

immich_machine_learning  | [07/08/24 04:54:21] INFO     Attempt #2 to load detection model 'buffalo_l' to  
immich_machine_learning  |                              memory                                             
immich_machine_learning  | [07/08/24 04:54:21] INFO     Setting execution providers to                     
immich_machine_learning  |                              ['OpenVINOExecutionProvider',                      
immich_machine_learning  |                              'CPUExecutionProvider'], in descending order of    
immich_machine_learning  |                              preference                                         
immich_machine_learning  | 2024-07-08 04:54:21.759958499 [W:onnxruntime:Default, openvino_provider_factory.cc:111 CreateExecutionProviderFactory] [OpenVINO] Selected 'device_type' GPU_FP32 is deprecated. 
immich_machine_learning  | Update the 'device_type' to specified types 'CPU', 'GPU', 'GPU.0', 'GPU.1', 'NPU' or from HETERO/MULTI/AUTO options and set 'precision' separately. 
immich_machine_learning  | 
immich_machine_learning  | In the OpenVINO EP
immich_machine_learning  | Model is fully supported on OpenVINO
immich_machine_learning  | CreateNgraphFunc
immich_server            | [Nest] 7  - 07/08/2024, 4:59:22 AM   ERROR [Microservices:JobService] Unable to run job handler (faceDetection/face-detection): Error: Machine learning request to "http://immich-machine-learning:3003" failed with HeadersTimeoutError: Headers Timeout Error
immich_server            | [Nest] 7  - 07/08/2024, 4:59:22 AM   ERROR [Microservices:JobService] Error: Machine learning request to "http://immich-machine-learning:3003" failed with HeadersTimeoutError: Headers Timeout Error
immich_server            |     at /usr/src/app/dist/repositories/machine-learning.repository.js:19:19
immich_server            |     at async MachineLearningRepository.predict (/usr/src/app/dist/repositories/machine-learning.repository.js:18:21)
immich_server            |     at async MachineLearningRepository.detectFaces (/usr/src/app/dist/repositories/machine-learning.repository.js:33:26)
immich_server            |     at async PersonService.handleDetectFaces (/usr/src/app/dist/services/person.service.js:275:52)
immich_server            |     at async /usr/src/app/dist/services/job.service.js:148:36
immich_server            |     at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:394:28)
immich_server            |     at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24)
immich_server            | [Nest] 7  - 07/08/2024, 4:59:22 AM   ERROR [Microservices:JobService] Object:
immich_server            | {
immich_server            |   "id": "e119eadf-9afa-48d9-8b96-ba139ba85694"
immich_server            | }
immich_server            | 
mertalev commented 4 months ago

There's no ML error there: it's just compiling the model.

Snuupy commented 4 months ago

I think you're right, I still have 1 core pegged at 100%. Let me give it some time and see what happens.

I previously set MACHINE_LEARNING_WORKER_TIMEOUT=6000, and right now htop shows TIME 14:51.88.

Snuupy commented 4 months ago

@mertalev I can count this as not working now, right? (screenshot attached)

mertalev commented 4 months ago

You had this issue with it getting stuck during compilation before the upgrade too, right? I wonder if the temporary envs I added to the image are causing an issue for your environment. If your kernel is below 6.7.5, try removing those envs.

Snuupy commented 4 months ago

Yes.

root@snuminipc:~/immich# uname -r
6.8.0-36-generic

for me:

105: working
105.1: working
106.4: not working

let me look at what commits were between 105.1 and 106.4...

I'll try again with env vars removed, maybe new onnxruntime -> no need for env vars

Edit: I removed the env vars and it no longer compiles forever; I get these in the logs and smart search is broken now.

immich_machine_learning  | [07/08/24 17:45:10] INFO     Downloading visual model 'ViT-B-32__openai'. This  
immich_machine_learning  |                              may take a while.                                  
immich_machine_learning  | /opt/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1194: UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir`.
immich_machine_learning  | For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/download#download-files-to-local-folder.
immich_machine_learning  |   warnings.warn(
immich_server            | [Nest] 7  - 07/08/2024, 5:45:15 PM     LOG [Microservices:MediaService] Successfully generated JPEG image preview for asset 0fceb6e7-1281-451d-8553-6fccadddbca4
immich_server            | [Nest] 7  - 07/08/2024, 5:45:15 PM     LOG [Microservices:MediaService] Successfully generated WEBP image thumbnail for asset 0fceb6e7-1281-451d-8553-6fccadddbca4
Fetching 11 files: 100%|██████████| 11/11 [00:16<00:00,  1.46s/it]
immich_machine_learning  | [07/08/24 17:45:27] INFO     Attempt #2 to load visual model 'ViT-B-32__openai' 
immich_machine_learning  |                              to memory                                          
immich_machine_learning  | [07/08/24 17:45:27] INFO     Setting execution providers to                     
immich_machine_learning  |                              ['OpenVINOExecutionProvider',                      
immich_machine_learning  |                              'CPUExecutionProvider'], in descending order of    
immich_machine_learning  |                              preference                                         
immich_machine_learning  | 2024-07-08 17:45:27.463828333 [W:onnxruntime:Default, openvino_provider_factory.cc:111 CreateExecutionProviderFactory] [OpenVINO] Selected 'device_type' GPU_FP32 is deprecated. 
immich_machine_learning  | Update the 'device_type' to specified types 'CPU', 'GPU', 'GPU.0', 'GPU.1', 'NPU' or from HETERO/MULTI/AUTO options and set 'precision' separately. 
immich_machine_learning  | 
immich_machine_learning  | In the OpenVINO EP
immich_machine_learning  | Model is fully supported on OpenVINO
immich_machine_learning  | CreateNgraphFunc
immich_server            | [Nest] 17  - 07/08/2024, 5:46:47 PM     LOG [Api:EventRepository] Websocket Disconnect: 2MKFnsNyU-iL5_axAAAJ
immich_server            | [Nest] 17  - 07/08/2024, 5:46:49 PM     LOG [Api:EventRepository] Websocket Connect:    DbsJUz6vJ4IjyG_qAAAL
immich_machine_learning  | [07/08/24 17:47:06] INFO     Attempt #2 to load detection model 'buffalo_l' to  
immich_machine_learning  |                              memory                                             
immich_machine_learning  | [07/08/24 17:47:06] INFO     Setting execution providers to                     
immich_machine_learning  |                              ['OpenVINOExecutionProvider',                      
immich_machine_learning  |                              'CPUExecutionProvider'], in descending order of    
immich_machine_learning  |                              preference                                         
immich_machine_learning  | 2024-07-08 17:47:06.203039789 [W:onnxruntime:Default, openvino_provider_factory.cc:111 CreateExecutionProviderFactory] [OpenVINO] Selected 'device_type' GPU_FP32 is deprecated. 
immich_machine_learning  | Update the 'device_type' to specified types 'CPU', 'GPU', 'GPU.0', 'GPU.1', 'NPU' or from HETERO/MULTI/AUTO options and set 'precision' separately. 
immich_machine_learning  | 
immich_machine_learning  | In the OpenVINO EP
immich_machine_learning  | Model is fully supported on OpenVINO
immich_machine_learning  | CreateNgraphFunc
immich_machine_learning  | Inference successful
immich_machine_learning  | Inference successful
immich_machine_learning  | Inference successful
immich_machine_learning  | [07/08/24 17:47:27] INFO     Attempt #2 to load recognition model 'buffalo_l' to
immich_machine_learning  |                              memory                                             
immich_machine_learning  | [07/08/24 17:47:27] INFO     Setting execution providers to                     
immich_machine_learning  |                              ['OpenVINOExecutionProvider',                      
immich_machine_learning  |                              'CPUExecutionProvider'], in descending order of    
immich_machine_learning  |                              preference                                         
immich_machine_learning  | 2024-07-08 17:47:28.037891755 [W:onnxruntime:Default, openvino_provider_factory.cc:111 CreateExecutionProviderFactory] [OpenVINO] Selected 'device_type' GPU_FP32 is deprecated. 
immich_machine_learning  | Update the 'device_type' to specified types 'CPU', 'GPU', 'GPU.0', 'GPU.1', 'NPU' or from HETERO/MULTI/AUTO options and set 'precision' separately. 
immich_machine_learning  | 
immich_machine_learning  | In the OpenVINO EP
immich_machine_learning  | Model is fully supported on OpenVINO
immich_machine_learning  | CreateNgraphFunc
immich_machine_learning  | mimalloc: warning: mi_usable_size: pointer might not point to a valid heap region: 0x7cd51c020080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | mimalloc: warning: (yes, the previous pointer 0x7cd51c020080 was valid after all)
immich_machine_learning  | mimalloc: warning: mi_usable_size: pointer might not point to a valid heap region: 0x7cd51c020080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | mimalloc: warning: (yes, the previous pointer 0x7cd51c020080 was valid after all)
immich_machine_learning  | mimalloc: warning: mi_usable_size: pointer might not point to a valid heap region: 0x7cd51c030080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | mimalloc: warning: (yes, the previous pointer 0x7cd51c030080 was valid after all)
immich_machine_learning  | mimalloc: warning: mi_usable_size: pointer might not point to a valid heap region: 0x7cd51c030080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | mimalloc: warning: (yes, the previous pointer 0x7cd51c030080 was valid after all)
immich_machine_learning  | mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7cd51c020080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | mimalloc: warning: (yes, the previous pointer 0x7cd51c020080 was valid after all)
immich_machine_learning  | mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7cd51c030080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | mimalloc: warning: (yes, the previous pointer 0x7cd51c030080 was valid after all)
immich_machine_learning  | mimalloc: warning: mi_usable_size: pointer might not point to a valid heap region: 0x7cd520020080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | mimalloc: warning: (yes, the previous pointer 0x7cd520020080 was valid after all)
immich_machine_learning  | mimalloc: warning: mi_usable_size: pointer might not point to a valid heap region: 0x7cd520020080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | mimalloc: warning: (yes, the previous pointer 0x7cd520020080 was valid after all)
immich_machine_learning  | mimalloc: warning: mi_usable_size: pointer might not point to a valid heap region: 0x7cd520030080
immich_machine_learning  | (this may still be a valid very large allocation (over 64MiB))
immich_machine_learning  | Inference successful
immich_machine_learning  | Inference successful

Edit 2: I re-ran facial recognition (a 2nd? 3rd? time) and now the face clusters show up under Explore, but smart search (like searching for a "car") is still broken; I get nonsense results (I used to get pictures/videos of a car). Let me try clearing the model cache and see if that fixes anything...

Edit 3: nope, smart search still broken. Facial detection works. Any other ideas to try?

mertalev commented 4 months ago

Interesting that the env vars were causing that issue for you. Two ideas:

  1. Add ARG DEVICE under FROM prod-${DEVICE} as prod. I think mimalloc is still getting preloaded because the DEVICE env ends up being empty instead of openvino.
  2. Change the Dockerfile to not install mimalloc at all. (Try if 1 does not work).
Snuupy commented 4 months ago

now it says:

immich_server            | [Nest] 17  - 07/10/2024, 7:15:00 PM     LOG [Api:EventRepository] Websocket Connect:    XV43qzMRiDTAsVrCAAAF
immich_machine_learning  | Inference successful

but this is not true, the smart search results are nonsensical

it returns results, but they are bogus (and the same) results each time

mertalev commented 4 months ago

but this is not true, the smart search results are nonsensical

It's true, it's just that the compiled OpenVINO model probably has a mistake that makes it produce different outputs.

Was smart search working well in 105.1?

Does face detection behave any differently with mimalloc removed?
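If anyone wants to check this locally, below is a rough sketch that compares the OpenVINO EP's output with the CPU EP's output for the same (random) input. It assumes onnxruntime-openvino is installed inside the ML container; the model path is an assumption about the cache layout, and random data only tests numerical agreement, not search quality. Selecting a specific OpenVINO device (GPU vs CPU) may additionally require provider options matching your onnxruntime-openvino version.

```python
import numpy as np
import onnxruntime as ort

# Placeholder path: point this at the CLIP visual model from the model cache.
MODEL = "/cache/clip/ViT-B-32__openai/visual/model.onnx"

def run_once(providers):
    sess = ort.InferenceSession(MODEL, providers=providers)
    inp = sess.get_inputs()[0]
    # Replace symbolic dimensions (e.g. a dynamic batch axis) with 1.
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    x = np.random.default_rng(0).random(shape, dtype=np.float32)
    return sess.run(None, {inp.name: x})[0]

cpu = run_once(["CPUExecutionProvider"])
ov = run_once(["OpenVINOExecutionProvider", "CPUExecutionProvider"])
print("max abs diff:", float(np.abs(cpu - ov).max()))
print("outputs agree:", np.allclose(cpu, ov, atol=1e-3))
```

If the two outputs disagree by a large margin, the compiled OpenVINO graph is producing different numbers than the reference CPU run, which would explain the bogus search results.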

Snuupy commented 4 months ago

It's true, it's just that the compiled OpenVINO model probably has a mistake that makes it produce different outputs.

Ah. Is there a way for us to fix it?

Was smart search working well in 105.1?

Yes. When I switch back to cpu I get correct results. What is the fix for this?

Does face detection behave any differently with mimalloc removed?

I only did 1) (setting ARG DEVICE), haven't really tested it actually

mertalev commented 4 months ago

Ah. Is there a way for us to fix it?

There are a few things we can try, but it's ultimately up to OpenVINO. I have some modified models with shape inference and static dimensions lying around that could make it easier for OpenVINO to know how the model should work. Let me see if I can link one for you to try.
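As a rough illustration of what "static dimensions" means here, the sketch below pins an ONNX model's symbolic input dimensions to concrete values and re-runs shape inference. The dim_param names and the 1×3×224×224 size are assumptions for a ViT-B-32-style visual model, not Immich's actual export settings; onnxruntime also ships a `make_dynamic_shape_fixed` helper that does roughly the same thing.

```python
import onnx
from onnx import shape_inference

model = onnx.load("visual_model.onnx")  # placeholder input path

# Hypothetical symbolic dimension names -> concrete values to pin them to.
static_dims = {"batch_size": 1, "height": 224, "width": 224}

for graph_input in model.graph.input:
    for dim in graph_input.type.tensor_type.shape.dim:
        if dim.dim_param and dim.dim_param in static_dims:
            dim.dim_value = static_dims[dim.dim_param]  # replaces the symbolic dim

# With the inputs pinned, shape inference can propagate concrete shapes downstream.
model = shape_inference.infer_shapes(model)
onnx.save(model, "visual_model_static.onnx")  # placeholder output path
```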

kaivol commented 4 months ago

Wanted to chime in to confirm that the changes by @Snuupy fixed OpenVINO for me, thank you very much! (I built the image myself because the latest published image doesn't seem to be up to date (?))

mertalev commented 4 months ago

Oh, that's really interesting! Could you share:

* Do both smart search and face detection work?
* Are the results good / the same as CPU?
* What processor do you have?
* Was only face detection failing for you before, or was smart search also failing?
* What's the version of your kernel?

Snuupy commented 4 months ago

Wanted to chime in to confirm that the changes by @Snuupy fixed OpenVINO for me, thank you very much! (I built the image myself because the latest published image doesn't seem to be up to date (?))

sorry, was testing a bunch of changes so the code and docker image are out of sync (I went as far as trying to revert back to a specific commit and applying all my changes on top again, but I didn't realize I also had to build the server image and ran out of time on my end).

can you please test smart search? that was still broken for me. try to search for a car or something.

kaivol commented 4 months ago
* Do both smart search and face detection work?
* Are the results good / the same as CPU?

Update: Without OpenVINO, significantly more faces are found. Also Smart Search seems to work.

Smart search doesn't really seem to work: queries like 'beach' or 'grass' return plausible results, but more specific queries (e.g. 'desk') do not. But I'm also not sure what level of quality I can expect here.

~Face detection results look reasonably good, though some obvious images are missing.~

~I will try without OpenVINO and report whether the results are different.~

I'm using the XLM-Roberta-Large-Vit-B-16Plus and buffalo_l models, FYI.

* What processor do you have?

Intel Pentium G4600

* Was only face detection failing for you before, or was smart search also failing?

I'm not sure about that, I tried different versions of Immich with and without OpenVINO acceleration every now and then.

* What's the version of your kernel?

5.15.0-112-generic. I'm running Immich via podman, in case that makes a difference.

Also, I noticed that pressing the MISSING button in the Smart Search category always reruns recognition on all (or at least most of my) images; is this expected behavior or an error in my setup?

And a last remark: I also got errors as reported here, not sure if this is relevant.

Snuupy commented 4 months ago

Smart search doesn't really seem to work: Queries like 'beach' or 'grass' return plausible result, but more specific queries (e.g. 'desk') do not. But I'm also not sure what level of quality I can expect here.

yeah smart search is broken, it's quite obvious when it's working

searching for car gets you a car, receipt gets you receipts, a sign gets you a sign, etc.

kaivol commented 4 months ago

Just FYI, face detection works for me with onnxruntime-openvino = "^1.18.0" and openvino/ubuntu22_runtime:2024.1 (see here).

mertalev commented 4 months ago

I upgraded main to use the latest onnxruntime-openvino and made some changes to how the OpenVINO image is built. Testing with a 155H processor, I can confirm face detection returns the right results and performs well (roughly 3x faster than CPU in my case).

Search produces very wrong model outputs as mentioned by others. After trying a few things, changing the models to have static dimensions gives the expected results. I'll upload revised models to address this for now and let y'all know when they're up.

djjudas21 commented 4 months ago

As someone who is an experienced kubernetes engineer but who knows absolutely nothing about AI/ML models, I just wanted to say thanks all for your continued knowledge and effort on this issue. I appreciate it ☺️👍

mertalev commented 4 months ago

New search models are up! You can delete your model cache volume to make it download the updated models. This only fixes smart search being inaccurate, though - face detection still needs an image with the above changes (or main) until the next release. Let me know if you still have issues!

w00tlarr commented 4 months ago

New search models are up! You can delete your model cache volume to make it download the updated models. This only fixes smart search being inaccurate, though - face detection still needs an image with the above changes (or main) until the next release. Let me know if you still have issues!

Awesome news!! Much appreciated and a long time coming. Is there a PR image that I can use to test this out?

mertalev commented 4 months ago

This image is pinned so it won't update with main: ghcr.io/immich-app/immich-machine-learning:main-openvino@sha256:fb55668b598823f3101174ae3f7e6a1911a894523cc93f69fbafe86a175ebea4

XiaoranQingxue commented 4 months ago

After I updated to the latest "CLIP" model (I used "XLM-Roberta-Large-Vit-B-16Plus"), recognition was twice as slow using a "UHD730". Is this an inevitable problem?

krishne35 commented 4 months ago

After I updated to the latest "CLIP" model (I used "XLM-Roberta-Large-Vit-B-16Plus"), recognition was twice as slow using a "UHD730". Is this an inevitable problem?

+1, but I'm using an i7 6700T.

mertalev commented 4 months ago

After I updated to the latest "CLIP" model (I used "XLM-Roberta-Large-Vit-B-16Plus"), recognition was twice as slow using a "UHD730". Is this an inevitable problem?

The performance will depend heavily on how good the processor's iGPU is relative to the CPU. Can you confirm that the results are correct / same as CPU?

mertalev commented 4 months ago

Can you clarify if you mean searching is slower than on CPU, or if processing jobs is slower? I have an idea if it's the latter.

XiaoranQingxue commented 4 months ago

I'm sorry. Maybe I wasn't clear enough.

Using "postman", using the same image, test "visual" and "textual" many times, averaging the response time

|               | old model - visual (commit 3ec6422) | latest model - visual (commit 02aad7a) | old model - textual | latest model - textual |
|---------------|-------------------------------------|----------------------------------------|---------------------|------------------------|
| CPU (12300T)  | 325ms                               | 330ms                                  | 240ms               | 240ms                  |
| iGPU (UHD730) | 270ms                               | 607ms                                  | 190ms               | 190ms                  |

Here are the test results in my current environment, and 607ms is "obtrusive"
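For anyone reproducing these numbers without Postman, here is a small timing sketch along the same lines as the earlier correctness check: it measures raw session latency per execution provider with a dummy input (so it skips image preprocessing); the model path is again an assumption about the cache layout.

```python
import time
import numpy as np
import onnxruntime as ort

MODEL = "/cache/clip/XLM-Roberta-Large-Vit-B-16Plus/visual/model.onnx"  # placeholder

def bench(providers, runs=20):
    sess = ort.InferenceSession(MODEL, providers=providers)
    inp = sess.get_inputs()[0]
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    x = np.random.default_rng(0).random(shape, dtype=np.float32)
    sess.run(None, {inp.name: x})  # warm-up: first run includes OpenVINO compilation
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {inp.name: x})
    return (time.perf_counter() - start) / runs * 1000  # average ms per inference

print(f"CPU     : {bench(['CPUExecutionProvider']):.0f} ms")
print(f"OpenVINO: {bench(['OpenVINOExecutionProvider', 'CPUExecutionProvider']):.0f} ms")
```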

XiaoranQingxue commented 4 months ago

After I updated to the latest "CLIP" model (I used "XLM-Roberta-Large-Vit-B-16Plus"), recognition was twice as slow using a "UHD730". Is this an inevitable problem?

The performance will depend heavily on how good the processor's iGPU is relative to the CPU. Can you confirm that the results are correct / same as CPU?

I only tested "car" and "车", and both models returned the same results using CPU and iGPU (at least from the retrieved photos).

mertalev commented 4 months ago

Thanks, that chart is very helpful. It does seem like the XLM visual model is worse than before. I'm curious how many models this affects; the default is fine, at least. It might be specific to M-CLIP models like this one, or it could possibly affect larger visual models in general.

ViT-B-32__openai

|             | old model+image - visual | new model+image - visual | old model+image - textual | new model+image - textual |
|-------------|--------------------------|--------------------------|---------------------------|---------------------------|
| CPU (155H)  | 70ms                     | 70ms                     | 40ms                      | 40ms                      |
| iGPU (Arc)  | 85ms                     | 35ms                     | 55ms                      | 15ms                      |

XLM-Roberta-Large-Vit-B-16Plus

|             | old model+image - visual | new model+image - visual | old model+image - textual | new model+image - textual |
|-------------|--------------------------|--------------------------|---------------------------|---------------------------|
| CPU (155H)  | 270ms                    | 280ms                    | 230ms                     | 230ms                     |
| iGPU (Arc)  | 165ms                    | 375ms                    | 135ms                     | 55ms                      |