immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

Getting error in ASGI: failed to allocate memory #12930

Open · rayzorben opened this issue 3 days ago

rayzorben commented 3 days ago

The bug

I was looking through my logs because of an issue I am having with facial recognition. These errors are unrelated to that, since they happened overnight, but I wanted to draw some attention to them in case they indicate a real problem.

I see the memory error; however, this machine is currently using only 3 GB of the 8 GB allocated to it.
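For triage, it may be worth checking what the machine-learning container itself sees for memory, since under LXC /proc/meminfo is usually synthesized by lxcfs and the host-side 3 GB / 8 GB figure may not match what the allocator actually hit. A minimal sketch, assuming python3 is on the PATH inside the immich_machine_learning image (run it via docker exec):

# Compare what the container believes is available against the ~295 MB
# buffer the allocator failed to obtain (308674560 bytes in the log below).
with open("/proc/meminfo") as f:
    meminfo = dict(line.split(":", 1) for line in f)

print("MemTotal:    ", meminfo["MemTotal"].strip())
print("MemAvailable:", meminfo["MemAvailable"].strip())

Note that if the failing allocation is on the GPU (this compose file uses the -cuda machine-learning image), neither number is the relevant one; nvidia-smi on the host would show the VRAM pressure instead.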

The OS that Immich Server is running on

proxmox

Version of Immich Server

v1.115.0

Version of Immich Mobile App

NA

Platform with the issue

Your docker-compose.yml content

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends:
      file: hwaccel.transcoding.yml
      service: quicksync # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - /storage/dropbox:/dropbox
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: unless-stopped

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: cuda # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: unless-stopped

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:2d1463258f2764328496376f5d965f20c6a67f66ea2b06dc42af351f75248792
    restart: unless-stopped

  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    command: ["postgres", "-c", "shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: unless-stopped

volumes:
  model-cache:

Your .env content

# The location where your uploaded files are stored
UPLOAD_LOCATION=/storage/dropbox/Pictures
# The location where your database files are stored
DB_DATA_LOCATION=/storage/appdata/immich

# To set a timezone, uncomment the next line and change Etc/UTC to a TZ identifier from this list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List
TZ=America/Los_Angeles

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

Reproduction steps

None, just an error call stack.

Relevant log output

[09/25/24 13:05:35] ERROR    Exception in ASGI application

Traceback (most recent call last):
  /usr/src/app/main.py:152 in predict
  /usr/src/app/main.py:177 in run_inference
  /usr/src/app/main.py:170 in _run_inference
  /usr/src/app/main.py:188 in run
  /usr/local/lib/python3.11/concurrent/futures/thread.py:58 in run
  /usr/src/app/models/base.py:60 in predict
  /usr/src/app/models/facial_recognition/recognition.py:45 in _predict
  /usr/src/app/models/facial_recognition/recognition.py:49 in _predict_batch
  /opt/venv/lib/python3.11/site-packages/insightface/model_zoo/arcface_onnx.py:84 in get_feat
  /usr/src/app/sessions/ort.py:49 in run
  /opt/venv/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:220 in run

RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_85' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 308674560

2024-09-25 13:05:38.937186818 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_60' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 168820736
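For context on the message: BFCArena is ONNX Runtime's best-fit-with-coalescing arena allocator, and since the machine-learning container here runs the -cuda image, the failed ~295 MB and ~161 MB Conv buffers are most likely CUDA arena growth (i.e. VRAM) rather than the container's 8 GB of system RAM. The requested size also likely scales with the batch of cropped faces handed to _predict_batch, which would explain spikes during overnight bulk facial recognition. As a hedged illustration only (not immich's actual configuration; immich builds its session in /usr/src/app/sessions/ort.py, and the model path below is made up), ONNX Runtime's CUDA execution provider exposes options that bound and shape this arena:

import onnxruntime as ort

# Illustrative sketch: cap and shape the CUDA execution provider's BFC arena.
providers = [
    (
        "CUDAExecutionProvider",
        {
            "device_id": 0,
            "gpu_mem_limit": 2 * 1024**3,  # cap the arena at 2 GiB (bytes)
            # Grow by the requested size instead of the default power-of-two
            # doubling, which reduces overshoot on large Conv workspaces.
            "arena_extend_strategy": "kSameAsRequested",
        },
    ),
    "CPUExecutionProvider",  # fallback if the CUDA EP cannot be created
]

# "w600k_r50.onnx" is the insightface ArcFace recognition model; the path is illustrative.
session = ort.InferenceSession("w600k_r50.onnx", providers=providers)

Whether immich exposes these knobs is a separate question, but it narrows the investigation: the allocation failed inside a Conv kernel at arena-extend time, so GPU memory headroom at the moment of the nightly job is the number to watch.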

Additional information

No response

mmomjian commented 3 days ago

Is this running in an LXC? Docker in LXC is not supported, so we would need to replicate this in a VM to investigate further.

rayzorben commented 3 days ago

Yes, this is Docker in LXC. I understand it's "not supported", but I run 27 Docker containers in LXC with no issues, including Frigate, which also uses ML and deals with video and images.

mmomjian commented 3 days ago

There’s probably a way to make it work, but LXC introduces a whole other level of bugs and config issues that we just can’t reasonably handle. I’ll leave the issue open for now in case someone else sees a way to fix it or disagrees.

rayzorben commented 3 days ago

Is there a supported Immich LXC that I can use instead?

mmomjian commented 3 days ago

No, we don’t officially support LXC.