Open pauleseifert opened 1 month ago
We have fixed some of the issues in the latest update. Please try logging out and logging back in; that should clear the local database and resync with the server.
Ref: #7752
I tried that a couple of times. #7752 should be included in the latest 1.98.2, but the problem persists, at least in my case.
@pauleseifert when you log out and log back in do you access over local IP or CF? Can you try over local IP if you haven't?
@alextran1502 I tried both before with the same issues. Tried it yet again on the newest version of 1.99.0 without any success. Any ideas on what to try?
The symptom looks like the mobile app cannot get all the asset information from the server, which leads to missing info for other tasks.
What is your phone version and how many assets do you have on the server? How much RAM do you have on the server?
App version is also 1.99.0, same as the server version. iPhone 13 on iOS 17.4. I have about 52k assets and the server has 16 GB of RAM.
@alextran1502 I did a couple of reinstalls and I think the app itself is fine. Database synchronisation works, and pictures from the server are fetched slowly (I set some Docker CPU limits because QTS sometimes seems unstable under full load) until at some point I run into socket timeouts and the JS out-of-memory error. The latest logs are below. Host memory is not depleted when this occurs (at least 8 GB is still free).
Do you have any idea why I get this out-of-memory problem? If not, I wouldn't fully rule out that QTS is causing it. I'm migrating to proper server hardware in the next couple of weeks, so maybe that will solve the problem.
[Nest] 7 - 03/28/2024, 9:20:55 AM ERROR [ExceptionsHandler] Connection terminated due to connection timeout
Error: Connection terminated due to connection timeout
at Connection.<anonymous> (/usr/src/app/node_modules/pg/lib/client.js:132:73)
at Object.onceWrapper (node:events:632:28)
at Connection.emit (node:events:518:28)
at Socket.<anonymous> (/usr/src/app/node_modules/pg/lib/connection.js:63:12)
at Socket.emit (node:events:518:28)
at TCP.<anonymous> (node:net:337:12)
<--- Last few GCs --->
[7:0x40f8e320000] 445614 ms: Mark-Compact (reduce) 2046.9 (2083.6) -> 2046.6 (2084.4) MB, 6896.27 / 0.01 ms (+ 83.5 ms in 13 steps since start of marking, biggest step 61.5 ms, walltime since start of marking 7018 ms) (average mu = 0.203, current mu =
[7:0x40f8e320000] 456268 ms: Mark-Compact (reduce) 2047.7 (2084.4) -> 2046.9 (2084.6) MB, 7758.25 / 0.00 ms (+ 63.7 ms in 15 steps since start of marking, biggest step 20.0 ms, walltime since start of marking 7979 ms) (average mu = 0.235, current mu =
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----
1: 0xca5580 node::Abort() [immich_server]
2: 0xb781f9 [immich_server]
3: 0xeca4d0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [immich_server]
4: 0xeca7b7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [immich_server]
5: 0x10dc505 [immich_server]
6: 0x10f4388 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [immich_server]
7: 0x10ca4a1 v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [immich_server]
8: 0x10cb635 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [immich_server]
9: 0x10a8c86 v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [immich_server]
10: 0x1503a16 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [immich_server]
11: 0x7f8f98e59ef6
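To read the "Last few GCs" lines above more easily, here is a minimal Python sketch that pulls the heap figures out of each GC trace line and reports how much each pass freed. The regular expression is an assumption based only on the line format visible in this thread, and `parse_gc_events` is a hypothetical helper name, not part of Node or Immich.

```python
import re

# Matches the heap figures in a V8 GC trace line as it appears in this thread, e.g.
# "Mark-Compact (reduce) 2046.9 (2083.6) -> 2046.6 (2084.4) MB"
# (pattern is an assumption derived from these logs, not an official format spec).
GC_LINE = re.compile(
    r"(?P<kind>Mark-Compact|Scavenge)(?: \(reduce\))? "
    r"(?P<used_before>[\d.]+) \((?P<total_before>[\d.]+)\) -> "
    r"(?P<used_after>[\d.]+) \((?P<total_after>[\d.]+)\) MB"
)


def parse_gc_events(log_text):
    """Return one dict per GC event found in log_text (hypothetical helper)."""
    events = []
    for m in GC_LINE.finditer(log_text):
        before = float(m["used_before"])
        after = float(m["used_after"])
        events.append(
            {
                "kind": m["kind"],
                "used_before_mb": before,
                "used_after_mb": after,
                "freed_mb": round(before - after, 1),
            }
        )
    return events


# The two Mark-Compact lines from the log above, shortened after the timings.
sample = (
    "[7:0x40f8e320000] 445614 ms: Mark-Compact (reduce) "
    "2046.9 (2083.6) -> 2046.6 (2084.4) MB, 6896.27 / 0.01 ms\n"
    "[7:0x40f8e320000] 456268 ms: Mark-Compact (reduce) "
    "2047.7 (2084.4) -> 2046.9 (2084.6) MB, 7758.25 / 0.00 ms\n"
)

for event in parse_gc_events(sample):
    print(event)
```

On these two lines, each collection frees less than 1 MB while about 2,047 MB stays in use, which matches the "Reached heap limit" message that follows. The process appears to hit the JavaScript heap limit rather than exhausting host memory.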
I am also experiencing this on Android. I posted this on Discord:
I'm getting this error, which crashes immich_server when building the timeline.
04/23/2024 11:06:53 PM <--- Last few GCs --->
04/23/2024 11:06:53 PM [6:0x5d52e330000] 917559 ms: Scavenge (reduce) 2044.8 (2081.5) -> 2044.1 (2081.8) MB, 7.28 / 0.00 ms (average mu = 0.338, current mu = 0.325) allocation failure;
04/23/2024 11:06:53 PM [6:0x5d52e330000] 917618 ms: Scavenge (reduce) 2044.9 (2081.8) -> 2044.2 (2081.8) MB, 5.81 / 0.00 ms (average mu = 0.338, current mu = 0.325) allocation failure;
04/23/2024 11:06:53 PM [6:0x5d52e330000] 917681 ms: Scavenge (reduce) 2045.0 (2081.8) -> 2044.4 (2082.0) MB, 4.54 / 0.00 ms (average mu = 0.338, current mu = 0.325) allocation failure;
04/23/2024 11:06:53 PM <--- JS stacktrace --->
04/23/2024 11:06:53 PM FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
04/23/2024 11:06:53 PM ----- Native stack trace -----
04/23/2024 11:06:53 PM 1: 0xb84bd6 node::OOMErrorHandler(char const*, v8::OOMDetails const&) [immich_server]
04/23/2024 11:06:53 PM 2: 0xefeb90 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [immich_server]
04/23/2024 11:06:53 PM 3: 0xefee77 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [immich_server]
04/23/2024 11:06:53 PM 4: 0x1110885 [immich_server]
04/23/2024 11:06:53 PM 5: 0x1110e14 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [immich_server]
04/23/2024 11:06:53 PM 6: 0x1127d04 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*) [immich_server]
04/23/2024 11:06:53 PM 7: 0x112851c v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [immich_server]
04/23/2024 11:06:53 PM 8: 0x112a67a v8::internal::Heap::HandleGCRequest() [immich_server]
04/23/2024 11:06:53 PM 9: 0x1095ce7 v8::internal::StackGuard::HandleInterrupts() [immich_server]
04/23/2024 11:06:53 PM 10: 0x1537542 v8::internal::Runtime_StackGuardWithGap(int, unsigned long*, v8::internal::Isolate*) [immich_server]
04/23/2024 11:06:53 PM 11: 0x7fc429e99ef6
The server crashes before reaching 2.3 GB of RAM. After I added NODE_OPTIONS="--max-old-space-size=8192", it now reaches 6 GB (I guess my max) on my 8 GB system.
With the env variable set, I got this error:
[Nest] 7 - 04/22/2024, 5:51:27 PM ERROR [ExceptionsHandler] Invalid string length
04/22/2024 07:51:27 PM RangeError: Invalid string length
04/22/2024 07:51:27 PM at JSON.stringify (<anonymous>)
04/22/2024 07:51:27 PM at stringify (/usr/src/app/node_modules/express/lib/response.js:1159:12)
04/22/2024 07:51:27 PM at ServerResponse.json (/usr/src/app/node_modules/express/lib/response.js:272:14)
04/22/2024 07:51:27 PM at ExpressAdapter.reply (/usr/src/app/node_modules/@nestjs/platform-express/adapters/express-adapter.js:62:62)
04/22/2024 07:51:27 PM at RouterResponseController.apply (/usr/src/app/node_modules/@nestjs/core/router/router-response-controller.js:15:36)
04/22/2024 07:51:27 PM at /usr/src/app/node_modules/@nestjs/core/router/router-execution-context.js:176:48
04/22/2024 07:51:27 PM at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
04/22/2024 07:51:27 PM at async /usr/src/app/node_modules/@nestjs/core/router/router-execution-context.js:47:13
04/22/2024 07:51:27 PM at async /usr/src/app/node_modules/@nestjs/core/router/router-proxy.js:9:17
This is with only 18k photos for this user, 30k in total. Is 6 GB allocated not enough for 18k images?
This happens only on Android; the timeline builds fine on web.
I'm running v1.102.0 in Docker, accessed remotely through Cloudflare.
Help much appreciated.
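A note on the `RangeError: Invalid string length` trace above: it is raised inside `JSON.stringify`, which happens when the serialized response would exceed the JavaScript engine's maximum string length. As far as I know, that limit does not grow with `--max-old-space-size`, so a larger heap alone would not remove it. The Python sketch below only illustrates the general idea of bounding payload size by serializing in pages. The record shape, the page size, and `MAX_PAYLOAD_CHARS` are made-up values for the demo, and this is not how Immich's API is implemented.

```python
import json

# Hypothetical per-payload cap for this demo only; the real engine limit is far larger.
MAX_PAYLOAD_CHARS = 2_000


def make_assets(count):
    """Build placeholder asset records (hypothetical shape, not Immich's schema)."""
    return [
        {"id": f"asset-{i:06d}", "type": "IMAGE", "size": 1000 + i}
        for i in range(count)
    ]


def paginate(items, page_size):
    """Yield consecutive slices of at most page_size items."""
    for start in range(0, len(items), page_size):
        yield items[start:start + page_size]


assets = make_assets(500)

# One response for everything: the string grows with the number of assets.
whole = json.dumps(assets)

# Paged responses: each string is bounded by the page size, not the library size.
pages = [json.dumps(page) for page in paginate(assets, 25)]

print("single payload over cap:", len(whole) > MAX_PAYLOAD_CHARS)
print("largest page under cap:", max(len(p) for p in pages) <= MAX_PAYLOAD_CHARS)
```

The point of the sketch is that the size of a single serialized response scales with the asset count, while paged responses stay bounded regardless of how many assets exist.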
Interesting. I tested it without Cloudflare and the problem persists. Do you use QTS as the host system as well? I haven't found a solution yet, but I am working on migrating to Debian-based TrueNAS SCALE with a lot more RAM and hope that the problem magically disappears. However, I had plenty of space left on my machine before (16 GB).
I had hoped for #8755 to fix the issue, but this didn't happen.
The problem persists even when running locally. I'm running it on Ubuntu Server with the latest Docker version and the latest Android app, no NAS.
Hopefully more RAM can help, but with my low asset count and high RAM usage (6 GB), something seems broken. If the devs need more logs or any help, I'm open to it.
The bug
I have experienced synchronisation issues between the mobile app on iOS and the server for quite a while. Symptoms are:
The file upload works as expected, and so does the web app. Reinstalling the mobile app doesn't help, nor does restarting the stack. Same on another (i)Phone. The domain is protected by Cloudflare, but local deployment sees the same behaviour.
Happy to provide more logs, or to take suggestions for a fix if my setup is the problem.
The OS that Immich Server is running on
Docker on QTS 5.1.5
Version of Immich Server
1.98.2
Version of Immich Mobile App
1.98.2 build 144
Platform with the issue
Your docker-compose.yml content
Your .env content
Reproduction steps
Additional information
App log:
Immich_log_2024-03-17T14:45:27.092139.csv
Error message from the server container: