immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

[BUG] Unable to upgrade past v1.71.0 #3587

Closed vulcanjedi closed 1 year ago

vulcanjedi commented 1 year ago

The bug

Unable to upgrade past v1.71.0. I have taken the previous releases without issue:

1.71.0 #1.70.0 #1.69.0 #1.68.0 #1.67.2 #1 #1.66.1 #1.65.0 #v1.63.2

The OS that Immich Server is running on

ubuntu docker

Version of Immich Server

v1.72.1

Version of Immich Mobile App

v1.72.1

Platform with the issue

Your docker-compose.yml content

version: "3.8"

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:v1.72.1
    #1.71.0 #1.70.0 #1.69.0 #1.68.0 #1.67.2 #1 #1.66.1 #1.65.0 #v1.63.2 #release
    #entrypoint: ["/bin/sh", "./start-server.sh"]
    command: [ "start.sh", "immich" ]
    volumes:
      - xxxxxxx/upload:/usr/src/app/upload
    env_file:
      - stack.env
    environment:
      - NODE_ENV=production
    depends_on:
      - immich_redis
      - immich_database
      - typesense
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:v1.72.1
    #1.71.0 #1.70.0 #1.69.0 #1.68.0 #1.67.2 #1 #1.66.1 #1.65.0 #v1.63.2 #release
    #entrypoint: ["/bin/sh", "./start-microservices.sh"]
    command: [ "start.sh", "microservices" ]
    volumes:
      - /xxxxxx/upload:/usr/src/app/upload
    env_file:
      - stack.env
    depends_on:
      - immich_redis
      - immich_database
      - typesense
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:v1.72.1
    #1.71.0 #1.70.0 #1.69.0 #1.68.0 #1.67.2 #1 #1.66.1 #1.65.0 #v1.63.2 #release
    volumes:
      - /xxxxx/cache:/cache
    env_file:
      - stack.env
    restart: always

  immich-web:
    container_name: immich_web

    image: ghcr.io/immich-app/immich-web:v1.72.1
    #1.71.0 #1.70.0 #1.69.0 #1.68.0 #1.67.2 #1 #1.66.1 #1.65.0 #v1.63.2 #release
    #entrypoint: ["/bin/sh", "./entrypoint.sh"]
    env_file:
      - stack.env
    restart: always

  typesense:
    hostname: typesense
    container_name: typesense
    image: typesense/typesense:0.24.0
   # environment:
   #   - TYPESENSE_API_KEY=${TYPESENSE_API_KEY}
   #   - TYPESENSE_DATA_DIR=/data
    env_file:
      - stack.env
    logging:
      driver: none
    volumes:
      - xxxxxxx/data:/data
    restart: always

  immich_redis:
    hostname: immich_redis
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:70a7a5b641117670beae0d80658430853896b5ef269ccf00d1827427e3263fa3 #redis:6.2
    restart: always
    env_file:
      - stack.env

  immich_database:
    container_name: immich_postgres
    hostname: immich_database
    image: postgres:14
    env_file:
      - stack.env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      PG_DATA: /var/lib/postgresql/data
    volumes:
      - /xxxxxxxx/data:/var/lib/postgresql/data
    restart: always

  immich-proxy:
    container_name: immich_proxy
    image: ghcr.io/immich-app/immich-proxy:v1.72.1
    #1.71.0 #1.70.0 #1.69.0 #1.68.0 #1.67.2 #1 #1.66.1 #1.65.0 #v1.63.2 #release
    environment:
      # Make sure these values get passed through from the env file
      - IMMICH_SERVER_URL
      - IMMICH_WEB_URL
    env_file:
      - stack.env
    ports:
      - xxxx:8080
    logging:
      driver: none
    depends_on:
      - immich-server
      - immich-web
    restart: always

Your .env content

DB_HOSTNAME=immich_postgres
DB_USERNAME=xxxxx
DB_PASSWORD=xxxxxxxxxxxx
DB_DATABASE_NAME=xxxxxxx
REDIS_HOSTNAME=immich_redis
UPLOAD_LOCATION=/xxxxxxx/upload
TYPESENSE_API_KEY=xxxxxxxxxxxxxxxxxx
PUBLIC_LOGIN_PAGE_MESSAGE=
IMMICH_WEB_URL=http://immich-web:3000
IMMICH_SERVER_URL=http://immich-server:3001
IMMICH_MACHINE_LEARNING_URL=http://immich-machine-learning:3003
NODE_ENV=production
TYPESENSE_DATA_DIR=/data
IMMICH_VERSION=v1.72.1

Reproduction steps

Update Portainer compose yaml and env values.
Redeploy stack.
The server container won't get an IP and won't load properly; the same happens with the microservices container.
This occurs with both v1.72.0 and v1.72.1.

Additional information

No response

alextran1502 commented 1 year ago

Please try bringing all the containers down and up again. Make sure to re-pull the images as well.
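For a deployment managed directly with Docker Compose rather than through Portainer, the suggestion above could look roughly like the following. This is a sketch, not a command sequence given by the maintainers: the `run` helper, the `recreate_stack` function name, and the `DRY_RUN` switch are illustrative, and it assumes the Compose v2 `docker compose` CLI is run from the directory containing the compose file.

```shell
#!/bin/sh
# Sketch only: recreate the stack with freshly pulled images.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

recreate_stack() {
  run docker compose down    # stop and remove the stack's containers
  run docker compose pull    # re-pull every image referenced by the compose file
  run docker compose up -d   # start the stack again in the background
}
```

With the default `DRY_RUN=1`, calling `recreate_stack` only prints the three commands; `DRY_RUN=0 recreate_stack` executes them.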

vulcanjedi commented 1 year ago

node[8]: ../src/node_platform.cc:68:std::unique_ptr<long unsigned int> node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Start(): Assertion `(0) == (uv_thread_create(t.get(), start_thread, this))' failed.
 1: 0xb83f50 node::Abort() [node]
 2: 0xb83fce  [node]
 3: 0xbf14ee  [node]
 4: 0xbf15d1 node::NodePlatform::NodePlatform(int, v8::TracingController*, v8::PageAllocator*) [node]
 5: 0xb41ed3 node::InitializeOncePerProcess(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, node::ProcessFlags::Flags) [node]
 6: 0xb4256b node::Start(int, char**) [node]
 7: 0x7f334806e1ca  [/lib/x86_64-linux-gnu/libc.so.6]
 8: 0x7f334806e285 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
 9: 0xac1fee _start [node]

[The same assertion failure and stack trace repeat seven more times in the pasted log, for node[7] and node[8]; only the libc addresses in frames 7 and 8 differ between repeats.]

alextran1502 commented 1 year ago

Which log is it coming from and which machine are you running your ubuntu server on?

vulcanjedi commented 1 year ago

That was the log via Portainer for the immich_server container. I'm not sure what you mean, but my Ubuntu instance runs at home on a Supermicro X9SCL/X9SCM.

jrasm91 commented 1 year ago

It looks like the local images are corrupt. You should delete them and then pull them again.

vulcanjedi commented 1 year ago

I ran docker rmi on the images, force-removed all the containers, re-pulled, and redeployed with Portainer, but I see the same behavior.

Chris-Marlow commented 1 year ago

Confirming the scenario reported by @vulcanjedi above. I'm seeing the same symptoms and performed the same troubleshooting, including removing all images and containers and redeploying with Portainer.

If it matters, it appears (to me) the typesense container is producing zero log output, and the immich_server container is failing when trying to connect to the typesense container.

2023-08-08T01:40:45.842557913Z Request #1691458785779: Request to Node 0 failed due to "undefined Request failed with HTTP code 503 | Server said: Not Ready or Lagging"
2023-08-08T01:40:45.842591826Z Request #1691458785779: Sleeping for 4s and then retrying request...
2023-08-08T01:40:49.845685238Z /usr/src/app/node_modules/typesense/lib/Typesense/Errors/TypesenseError.js:23
2023-08-08T01:40:49.845732680Z         var _this = _super.call(this, message) || this;
2023-08-08T01:40:49.845744905Z                            ^
2023-08-08T01:40:49.845785217Z
2023-08-08T01:40:49.845798897Z ServerError: Request failed with HTTP code 503 | Server said: Not Ready or Lagging
2023-08-08T01:40:49.845805948Z     at ServerError.TypesenseError [as constructor] (/usr/src/app/node_modules/typesense/lib/Typesense/Errors/TypesenseError.js:23:28)
2023-08-08T01:40:49.845831166Z     at new ServerError (/usr/src/app/node_modules/typesense/lib/Typesense/Errors/ServerError.js:25:42)
2023-08-08T01:40:49.845838946Z     at ApiCall.customErrorForResponse (/usr/src/app/node_modules/typesense/lib/Typesense/ApiCall.js:347:21)
2023-08-08T01:40:49.845845921Z     at /usr/src/app/node_modules/typesense/lib/Typesense/ApiCall.js:204:58
2023-08-08T01:40:49.845852603Z     at step (/usr/src/app/node_modules/typesense/lib/Typesense/ApiCall.js:33:23)
2023-08-08T01:40:49.845859246Z     at Object.next (/usr/src/app/node_modules/typesense/lib/Typesense/ApiCall.js:14:53)
2023-08-08T01:40:49.845866060Z     at step (/usr/src/app/node_modules/typesense/lib/Typesense/ApiCall.js:18:139)
2023-08-08T01:40:49.845872917Z     at Object.next (/usr/src/app/node_modules/typesense/lib/Typesense/ApiCall.js:14:53)
2023-08-08T01:40:49.845879723Z     at fulfilled (/usr/src/app/node_modules/typesense/lib/Typesense/ApiCall.js:5:58)
2023-08-08T01:40:49.845886435Z     at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
2023-08-08T01:40:49.845893196Z   httpStatus: 503
2023-08-08T01:40:49.845899873Z }
2023-08-08T01:40:49.845906458Z
2023-08-08T01:40:49.845913070Z Node.js v18.17.0

alextran1502 commented 1 year ago

@Chris-Marlow does it start eventually?

vulcanjedi commented 1 year ago

Agreed. I've had issues with typesense in the past when first trying out Immich, but once I got past them everything has been great, especially with the latest updates and features. I don't seem to be able to move to typesense 0.24.1 reliably either. But this seems a little different: in the past, all the containers (I think) stayed 'up' with IPs, etc. With this behavior, the server and microservices containers try to start, eventually crash, and restart over and over.

I just saw release 1.72.2 pushed and tried that (I realize the changelog doesn't mention any presumably related changes, but just for reference). No change, same behavior. Reverting to 1.71.0 works instantly.

Chris-Marlow commented 1 year ago

@alextran1502 it appears the answer is a qualified "maybe". After yesterday's post I left the containers running, and the main Immich page now loads instead of returning a 503. However, if I'm reading the immich_server logs correctly, it appears it may have given up on waiting and then moved on. The containers started at 01:34 UTC, and around 02:06 UTC the error below appeared (truncated). 25 seconds later the Nest application tried again and appears to have come up reasonably clean.

Full logs from the immich_server container: immich_server_logs_20230808.txt

To determine whether it was a "first-time start" issue or an "every-time start" issue, I stopped the entire stack, then restarted it. The full stack started within 7 seconds with no errors and seems to run.

2023-08-08T02:06:35.350217710Z     at Timeout.<anonymous> (/usr/src/app/node_modules/follow-redirects/index.js:169:12)
2023-08-08T02:06:35.350225856Z     at listOnTimeout (node:internal/timers:569:17)
2023-08-08T02:06:35.350231263Z     at process.processTimers (node:internal/timers:512:7) {
2023-08-08T02:06:35.350236629Z   config: {
2023-08-08T02:06:35.350241891Z     transitional: {
2023-08-08T02:06:35.350247323Z       silentJSONParsing: true,
2023-08-08T02:06:35.350252780Z       forcedJSONParsing: true,
2023-08-08T02:06:35.350258167Z       clarifyTimeoutError: false
2023-08-08T02:06:35.350263449Z     },
2023-08-08T02:06:35.350268861Z     adapter: [Function: httpAdapter],
2023-08-08T02:06:35.350274358Z     transformRequest: [ [Function: transformRequest] ],
2023-08-08T02:06:35.350279963Z     transformResponse: [ [Function (anonymous)] ],
2023-08-08T02:06:35.350285496Z     timeout: 10000,
2023-08-08T02:06:35.350290867Z     xsrfCookieName: 'XSRF-TOKEN',
2023-08-08T02:06:35.350296428Z     xsrfHeaderName: 'X-XSRF-TOKEN',
2023-08-08T02:06:35.350301985Z     maxContentLength: Infinity,
2023-08-08T02:06:35.350307537Z     maxBodyLength: Infinity,
2023-08-08T02:06:35.350313085Z     validateStatus: [Function: validateStatus],
2023-08-08T02:06:35.350318723Z     headers: {
2023-08-08T02:06:35.350324261Z       Accept: 'application/json, text/plain, */*',
2023-08-08T02:06:35.350329940Z       'Content-Type': 'application/json',
2023-08-08T02:06:35.350335559Z       'X-TYPESENSE-API-KEY': 'some-random-text',
2023-08-08T02:06:35.350341094Z       'User-Agent': 'axios/0.26.1',
2023-08-08T02:06:35.350346709Z       'Content-Length': 1651
2023-08-08T02:06:35.350352223Z     },
2023-08-08T02:06:35.350357705Z     method: 'post',
2023-08-08T02:06:35.350363207Z     url: 'http://typesense:8108/collections',
2023-08-08T02:06:35.350391239Z     data: '{"...........

Chris-Marlow commented 1 year ago

Well that's frustrating. I tore down the stack, removed the images, and re-deployed the stack in portainer because I wanted to check some timings (same troubleshooting steps from yesterday), and the app started cleanly. No errors at all..... Same version of Immich (1.72.2)

@vulcanjedi are you able to confirm the same this morning?

vulcanjedi commented 1 year ago

I'm afraid I was not fortunate enough to encounter the same phenomenon you did, @Chris-Marlow. I ran docker image prune -a but still noticed an image for the CLI in the repos that I couldn't delete, and was hoping that might be the smoking gun. The container wasn't running or showing up in other commands; I was able to force-kill the orphaned container and delete all images from the immich repos. I killed the stack, then force-redeployed 1.72.2 and pulled the images. Same outcome for me:
The constant errors continue in the server and microservices container logs.
Portainer doesn't show them attaching IPs, etc. The typesense and proxy containers fail to show logs in Portainer. I'm unable to open a /bin/bash console for the server, microservices, or proxy containers, but I am able to console into typesense.

Chris-Marlow commented 1 year ago

@vulcanjedi - reviewing your config, it appears you're pulling typesense v0.24.0. I believe this should be updated to v0.24.1 according to https://documentation.immich.app/docs/install/docker-compose. Perhaps try changing it and see if that resolves your issue?

vulcanjedi commented 1 year ago

Hi @Chris-Marlow, I've tried, and I was getting sporadic issues with 0.24.1 as well, before this most recent issue. I think I may have briefly had 0.24.1 up during this troubleshooting, but I had to revert to get my stack back up. Like you mentioned at the beginning, I suspect some relationship with typesense: all the containers are up and the server container is good to go, but microservices logs:

Request #1691513755054: Request to Node 0 failed due to "ENOTFOUND getaddrinfo ENOTFOUND typesense"
Request #1691513755054: Sleeping for 4s and then retrying request...
Request #1691513755054: Request to Node 0 failed due to "ENOTFOUND getaddrinfo ENOTFOUND typesense"
Request #1691513755054: Sleeping for 4s and then retrying request...
Request #1691513755054: Request to Node 0 failed due to "ENOTFOUND getaddrinfo ENOTFOUND typesense"
Request #1691513755054: Sleeping for 4s and then retrying request...
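`ENOTFOUND getaddrinfo ENOTFOUND typesense` means the microservices container could not resolve the hostname `typesense`. One hedged way to check this from the host is sketched below. The container names are taken from the compose file posted earlier in this thread; the `run` helper and `check_typesense_dns` are illustrative names, and `getent` is assumed to be available inside the image.

```shell
#!/bin/sh
# Sketch only: inspect name resolution and network attachment for a container.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

check_typesense_dns() {
  container="${1:-immich_microservices}"   # container doing the lookup
  host="${2:-typesense}"                   # hostname it fails to resolve
  # Resolve the hostname from inside the container's network namespace.
  run docker exec "$container" getent hosts "$host"
  # List the networks the container is attached to.
  run docker inspect --format '{{json .NetworkSettings.Networks}}' "$container"
}
```

If the lookup fails while the typesense container is running, comparing the networks of both containers (by passing `typesense` as the first argument) may show whether they share a network.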

Chris-Marlow commented 1 year ago

Was worth a shot. I'm a bit over my skis here, so hopefully someone else has a good suggestion for you.

canu1337 commented 1 year ago

So I had the same issue: typesense was responding with 503 errors to the server. When I analyzed the typesense logs, the queue was overloaded (more than 500 items), even when starting only typesense without the rest of the stack. So I figured that the queue was stored in the volume attached to typesense. I deleted the volume and it works fine now; search (which is powered by typesense) is OK too.

Here are the steps to delete the typesense volume when using docker compose:

  • Stop the running containers: docker-compose down
  • Delete the containers (volumes can't be deleted if attached to a stopped container): docker container prune
  • Delete the volume: docker volume rm immich_tsdata
  • Restart everything: docker-compose up -d

vulcanjedi commented 1 year ago

@canu1337 Could you share the log locations? I think my issue may be related to my older legacy Docker / Ubuntu versions. I tried your resolution but wasn't successful, so I think it has to do with the Docker/OS version?
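The volume-reset steps listed by canu1337 can be collected into one sequence, sketched below. The volume name `immich_tsdata` comes from that comment and may differ per deployment; the `run` helper, `reset_typesense_volume`, and `DRY_RUN` are illustrative. In a setup that bind-mounts a host directory to `/data` instead of using a named volume, as the compose file at the top of this issue appears to do, there is no named volume to remove and the host directory would need to be cleared instead.

```shell
#!/bin/sh
# Sketch only: reset the typesense data volume, following the steps above.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reset_typesense_volume() {
  volume="${1:-immich_tsdata}"     # volume name from the comment; adjust as needed
  run docker-compose down          # stop the running containers
  # Caution: this removes ALL stopped containers on the host, not only Immich's,
  # and asks for confirmation when actually executed.
  run docker container prune
  run docker volume rm "$volume"   # delete the typesense data volume
  run docker-compose up -d         # restart everything
}
```

Running `reset_typesense_volume` as-is only prints the four commands; `DRY_RUN=0 reset_typesense_volume` executes them and permanently deletes the volume's contents.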
canu1337 commented 1 year ago

I'm running Docker 23.0.1 on Debian 11.6 (5.10.0-21-amd64). You can check the log using docker logs immich_typesense (replace immich_typesense with whatever the name of your container is).
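To compare a failing environment with the one described above, the same details can be collected as sketched here. The default container name `typesense` is taken from the compose file posted earlier in the thread; `run` and `collect_env_info` are illustrative names. Note that the posted compose file sets `logging: driver: none` on the typesense and immich-proxy services, which would leave `docker logs` with nothing to show for them until that setting is removed.

```shell
#!/bin/sh
# Sketch only: gather Docker version, kernel release, and recent container logs.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

collect_env_info() {
  container="${1:-typesense}"
  run docker version --format '{{.Server.Version}}'   # Docker Engine version
  run uname -r                                        # kernel release
  run docker logs --tail 100 "$container"             # last 100 log lines
}
```

For example, `DRY_RUN=0 collect_env_info immich_typesense` would query the container name used in the comment above.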