immich-app / immich

High-performance, self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0
45.19k stars · 2.19k forks

Internal server error - 500 - Internal Server Error undefined When Searching #4707

Closed jdrewsteiner closed 10 months ago

jdrewsteiner commented 10 months ago

The bug

[Nest] 7  - 10/30/2023, 12:55:44 AM   ERROR [ExceptionsHandler] Request for clip failed with status 500: Internal Server Error
Error: Request for clip failed with status 500: Internal Server Error
    at MachineLearningRepository.post (/usr/src/app/dist/infra/repositories/machine-learning.repository.js:29:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async SearchService.search (/usr/src/app/dist/domain/search/search.service.js:110:35)
    at async /usr/src/app/node_modules/@nestjs/core/router/router-execution-context.js:46:28
    at async /usr/src/app/node_modules/@nestjs/core/router/router-proxy.js:9:17

The OS that Immich Server is running on

DSM 7.2-64570 Update 3 on Synology DS920+

Version of Immich Server

v1.83.0

Version of Immich Mobile App

1.83.0

Platform with the issue

Your docker-compose.yml content

version: "3.8"

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: ["start.sh", "immich"]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - stack.env
    depends_on:
      - redis
      - database
      - typesense
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.yml
    #   service: hwaccel
    command: ["start.sh", "microservices"]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - stack.env
    depends_on:
      - redis
      - database
      - typesense
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - stack.env
    restart: always

  immich-web:
    container_name: immich_web
    image: ghcr.io/immich-app/immich-web:${IMMICH_VERSION:-release}
    env_file:
      - stack.env
    restart: always

  typesense:
    container_name: immich_typesense
    image: typesense/typesense:0.24.1@sha256:9bcff2b829f12074426ca044b56160ca9d777a0c488303469143dd9f8259d4dd
    environment:
      - TYPESENSE_API_KEY=${TYPESENSE_API_KEY}
      - TYPESENSE_DATA_DIR=/data
      # remove this to get debug messages
      - GLOG_minloglevel=1
    volumes:
      - tsdata:/data
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:70a7a5b641117670beae0d80658430853896b5ef269ccf00d1827427e3263fa3
    restart: always

  database:
    container_name: immich_postgres
    image: postgres:14-alpine@sha256:28407a9961e76f2d285dc6991e8e48893503cc3836a4755bbc2d40bcc272a441
    env_file:
      - stack.env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

  immich-proxy:
    container_name: immich_proxy
    image: ghcr.io/immich-app/immich-proxy:${IMMICH_VERSION:-release}
    ports:
      - 2283:8080
    depends_on:
      - immich-server
      - immich-web
    restart: always

volumes:
  pgdata:
  model-cache:
  tsdata:
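A side note on the image tags above: they use Compose's `${VAR:-default}` parameter expansion, so if `IMMICH_VERSION` is missing from `stack.env`, the `release` tag is pulled. A minimal shell sketch of that fallback behavior (same expansion rules as POSIX shell):

```shell
# ${IMMICH_VERSION:-release} substitutes "release" when the variable
# is unset or empty; otherwise the variable's value is used.
unset IMMICH_VERSION
echo "ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}"

IMMICH_VERSION=v1.83.0
echo "ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}"
```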

Your .env content

UPLOAD_LOCATION=xxx
IMMICH_VERSION=release
TYPESENSE_API_KEY=xxx
DB_PASSWORD=postgres
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich
REDIS_HOSTNAME=immich_redis

Reproduction steps

any time i use the search bar to search for photos

Additional information

No response

alextran1502 commented 10 months ago

Hello, please follow this guide in the meantime to fix the issue: https://github.com/immich-app/immich/issues/4117#issuecomment-1772790612

jdrewsteiner commented 10 months ago

Unfortunately, that does not seem to resolve the issue for me.

alextran1502 commented 10 months ago

Do you know if your CPU supports the AVX or AVX2 instruction set? What do the logs from the Typesense and machine learning containers say?

jdrewsteiner commented 10 months ago

Not sure. Here's the Typesense log:

W20231029 22:42:54.406327 166 controller.cpp:1454] SIGINT was installed with 1
W20231029 22:42:54.406361 166 raft_server.cpp:570] Single-node with no leader. Resetting peers.
W20231029 22:42:54.406368 166 node.cpp:894] node default_group:172.31.0.2:8107:8108 set_peer from 172.30.0.3:8107:8108 to 172.31.0.2:8107:8108
W20231029 22:43:05.121524 166 raft_server.cpp:570] Single-node with no leader. Resetting peers.

Machine Learning log:


    raw_response = await run_endpoint_function(
  File "/opt/venv/lib/python3.11/site-packages/fastapi/routing.py", line 163, in run_endpoint_function
    return await dependant.call(**values)
  File "/usr/src/app/main.py", line 75, in predict
    model = await load(await app.state.model_cache.get(model_name, model_type, **kwargs))
  File "/usr/src/app/main.py", line 101, in load
    await loop.run_in_executor(app.state.thread_pool, _load)
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/app/main.py", line 94, in _load
    model.load()
  File "/usr/src/app/models/base.py", line 63, in load
    self.download()
  File "/usr/src/app/models/base.py", line 58, in download
    self._download()
  File "/usr/src/app/models/clip.py", line 43, in _download
    self._download_model(models[0])
  File "/usr/src/app/models/clip.py", line 101, in _download_model
    download_model(
  File "/opt/venv/lib/python3.11/site-packages/clip_server/model/pretrained_models.py", line 239, in download_model
    raise RuntimeError(
RuntimeError: Failed to download https://clip-as-service.s3.us-east-2.amazonaws.com/models-436c69702d61732d53657276696365/onnx/ViT-B-32/textual.onnx within retry limit 3

[10/30/23 01:53:02] INFO     Downloading clip model 'ViT-B-32::openai'.This may take a while.
Failed to download https://clip-as-service.s3.us-east-2.amazonaws.com/models-436c69702d61732d53657276696365/onnx/ViT-B-32/textual.onnx with <HTTPError 403: 'Forbidden'> at the 0th attempt
Failed to download https://clip-as-service.s3.us-east-2.amazonaws.com/models-436c69702d61732d53657276696365/onnx/ViT-B-32/textual.onnx with <HTTPError 403: 'Forbidden'> at the 1th attempt
Failed to download https://clip-as-service.s3.us-east-2.amazonaws.com/models-436c69702d61732d53657276696365/onnx/ViT-B-32/textual.onnx with <HTTPError 403: 'Forbidden'> at the 2th attempt
textual.onnx 0.0% • 0/100 bytes • ? • -:--:--

Exception in ASGI application
Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 435, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/opt/venv/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/opt/venv/lib/python3.11/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/opt/venv/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/opt/venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/opt/venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/opt/venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/opt/venv/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/opt/venv/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/opt/venv/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/opt/venv/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/opt/venv/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/opt/venv/lib/python3.11/site-packages/fastapi/routing.py", line 237, in app
    raw_response = await run_endpoint_function(
  File "/opt/venv/lib/python3.11/site-packages/fastapi/routing.py", line 163, in run_endpoint_function
    return await dependant.call(**values)
  File "/usr/src/app/main.py", line 75, in predict
    model = await load(await app.state.model_cache.get(model_name, model_type, **kwargs))
  File "/usr/src/app/main.py", line 101, in load
    await loop.run_in_executor(app.state.thread_pool, _load)
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/app/main.py", line 94, in _load
    model.load()
  File "/usr/src/app/models/base.py", line 63, in load
    self.download()
  File "/usr/src/app/models/base.py", line 58, in download
    self._download()
  File "/usr/src/app/models/clip.py", line 43, in _download
    self._download_model(models[0])
  File "/usr/src/app/models/clip.py", line 101, in _download_model
    download_model(
  File "/opt/venv/lib/python3.11/site-packages/clip_server/model/pretrained_models.py", line 239, in download_model
    raise RuntimeError(
RuntimeError: Failed to download https://clip-as-service.s3.us-east-2.amazonaws.com/models-436c69702d61732d53657276696365/onnx/ViT-B-32/textual.onnx within retry limit 3
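For context, the log shows the failure mode clearly: every download attempt gets an HTTP 403, and after three attempts a RuntimeError is raised. A minimal sketch of that retry-then-fail pattern (not clip_server's actual code; `fetch` is a hypothetical callable standing in for the HTTP download):

```python
import urllib.error

def download_with_retries(fetch, url, retries=3):
    """Retry a download up to `retries` times, mirroring the behaviour
    in the log: each failed attempt is reported, and a RuntimeError is
    raised once the retry limit is exhausted."""
    for attempt in range(retries):
        try:
            return fetch(url)
        except urllib.error.HTTPError as err:
            # e.g. "HTTP Error 403: Forbidden" on every attempt
            print(f"Failed to download {url} with {err} at the {attempt}th attempt")
    raise RuntimeError(f"Failed to download {url} within retry limit {retries}")
```

Since the S3 bucket answers 403 on every attempt, the loop exhausts its retries and the RuntimeError propagates up through the FastAPI endpoint, which is what the server then reports as the 500.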

alextran1502 commented 10 months ago

The error you are seeing is the same as in the post I linked above. Please make sure the files are downloaded to the correct location, then restart the stack.

jdrewsteiner commented 10 months ago

I placed them here: /volume1/docker/immich/cache

Maybe this is not the correct location.

jdrewsteiner commented 10 months ago

Ok, it's no longer giving me an error, but it's very slow and doesn't seem to find anything I search for. I tried searching for the name of a person I tagged and nothing came up. Maybe my NAS doesn't have enough power, or maybe it takes time to scan my library. I seem to recall seeing settings pertaining to face and object recognition, but I don't see them anymore.

alextran1502 commented 10 months ago

You will need to put it in /var/lib/docker/volumes/<volume-name>/_data

jdrewsteiner commented 10 months ago

There is no docker folder within /var/lib/ on my NAS.
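(Synology DSM keeps Docker's data root somewhere other than `/var/lib/docker` — commonly under `/volume1/@docker`, though that may vary. A way to find where a named volume actually lives on the host, assuming a hypothetical volume name `immich_model-cache` since Compose prefixes volumes with the project name:)

```shell
# Hypothetical volume name; Compose typically names the "model-cache"
# volume "<project>_model-cache".
VOLUME_NAME=immich_model-cache

# Default Docker layout on most Linux hosts (Synology DSM differs):
echo "/var/lib/docker/volumes/${VOLUME_NAME}/_data"

# The authoritative way to find the real host path is:
#   docker volume inspect --format '{{ .Mountpoint }}' "${VOLUME_NAME}"
```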

jdrewsteiner commented 10 months ago

> You will need to put it in /var/lib/docker/volumes/<volume-name>/_data

It appears to be working with them located where I have them, but it's not finding anything when I search. I've tried many basic searches.

Here's the log from the machine learning container:

[10/30/23 02:46:35] INFO Starting gunicorn 21.2.0
[10/30/23 02:46:35] INFO Listening at: http://0.0.0.0:3003 (9)
[10/30/23 02:46:35] INFO Using worker: uvicorn.workers.UvicornWorker
[10/30/23 02:46:35] INFO Booting worker with pid: 10
[10/30/23 02:47:22] INFO Created in-memory cache with unloading disabled.
[10/30/23 02:47:22] INFO Initialized request thread pool with 4 threads.
[10/30/23 02:47:22] INFO Downloading clip model 'ViT-B-32::openai'.This may take a while.
textual.onnx 100.0% • 254.1/254.1 MB • 14.9 MB/s • 0:00:00

visual.onnx 100.0% • 351.5/351.5 MB • 14.4 MB/s • 0:00:00

[10/30/23 02:48:07] INFO Loading clip model 'ViT-B-32::openai'

jdrewsteiner commented 10 months ago

I found the "Jobs" page under admin and ran the machine learning tasks again. This seemed to make search work. However, now I have the opposite problem: search is finding too much. For instance, a search for "tree" matches almost all of my photos, even receipts and faces.