pulsejet / memories

Fast, modern and advanced photo management suite. Runs as a Nextcloud app.
https://memories.gallery
GNU Affero General Public License v3.0

Generating timeline thumbnails results in 502s and 504s #962

Open benjaoming opened 7 months ago

benjaoming commented 7 months ago

Describe the bug

First of all, THANKS for an amazing application that fits my needs very accurately :dart: :handshake: :love_letter:

When rendering a larger amount of new photos in the Timeline, the browser's requests for individual thumbnails quickly start to return HTTP 502 and 504 errors.

I'm opening this issue following an investigation by Cloud68 engineers, which I quote below. As I understand it, they are fairly experienced with Nextcloud and have followed the general troubleshooting steps for these cases.

I would also like to pass on encouragement from Cloud68, who seem very eager to see this app succeed!

From my POV, it seems that the background generation of thumbnails doesn't succeed, and the UI doesn't indicate what's going on. I would find it easier if there were a simple job that pre-generated all thumbnails and that I could monitor.

To Reproduce

Steps to reproduce the behavior:

Screenshots

Something like this...

image

Platform:

Additional context

Investigation by Cloud68 team

Indeed, the cron job seems to be unrelated to improving the Memories experience; it is meant to help with previews in the Files app.

We tried to replicate your experience using 1000 sample photos, and while checking the monitoring systems and other logs we noticed that:

1) At the time of your screenshot, the instance seems to have been under heavy load. For a one-core instance (https://docs.cloud68.co/books/server-specs/page/starter-packages-server-specs), a load of 80:1 is far too high for a pleasant experience.

image

2) Memory utilization got quite high, which causes services to crash.

image

3) A couple of 'remaining connection slots are reserved for non-replication superuser connections' messages appeared in the PostgreSQL logs, although at a later time. We increased max_connections from 100 to 200.

image

4) PHP-FPM was using a considerable share of resources (pm.max_children: 120). We left this value as is for now.

image
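The figures in items 1, 3 and 4 can be sanity-checked with some simple arithmetic. The sketch below is only an illustration and is not part of the Cloud68 investigation: the function names are hypothetical, and it assumes that each busy PHP-FPM worker may hold one database connection and that PostgreSQL keeps its default of 3 slots reserved for superusers (`superuser_reserved_connections`).

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Load average divided by core count; values well above 1.0 mean work is queuing."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


def usable_db_slots(max_connections: int, superuser_reserved: int = 3) -> int:
    """Connection slots left for ordinary roles (assumes the PostgreSQL default of 3 reserved)."""
    return max_connections - superuser_reserved


def workers_exceed_db_slots(max_children: int, max_connections: int,
                            superuser_reserved: int = 3) -> bool:
    """True if PHP-FPM could open more connections than PostgreSQL will accept.

    Assumption: every busy worker holds one database connection.
    """
    return max_children > usable_db_slots(max_connections, superuser_reserved)


if __name__ == "__main__":
    # Figures quoted in the investigation above.
    print("load per core:", load_per_core(80.0, 1))
    print("120 workers vs max_connections=100:", workers_exceed_db_slots(120, 100))
    print("120 workers vs max_connections=200:", workers_exceed_db_slots(120, 200))

    # Optional: the same ratio for the machine running this script (Unix only).
    try:
        one_minute, _, _ = os.getloadavg()
        print("local load per core:", load_per_core(one_minute, os.cpu_count() or 1))
    except (AttributeError, OSError):
        pass
```

Under these assumptions, 120 workers could exceed the 97 usable slots that max_connections = 100 leaves, which would be consistent with the 'remaining connection slots are reserved' message; raising the limit to 200 removes that particular bottleneck but not the CPU and memory pressure.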

Items 1 and 2 also happened to us when we tried to access the photos with the Files app before the job for pre-generating thumbnails had run. Most of the thumbnails wouldn't load for a long time, and the limits on how long a request can stay open kicked in, returning 502 (because the service crashed under high hardware usage) and 504 (timeouts). After the cron command runs, which doesn't take long or use a lot of resources, the experience is faster and no longer causes this load and memory usage.

Indeed, the thumbnails for Memories seem to be generated on the spot. If you scroll while waiting for thumbnails, more thumbnails get requested, which adds even more load until all resources are in use. This is the same behavior as trying to generate all thumbnails on the spot with the Files app. After the previews have been generated, the experience is extremely snappy, but there appears to be no option to pre-generate them. You can also impersonate our account to see how the experience should be when the thumbnails have already been generated.

After some testing, it seems that the Memories app can index files in the background, but not pre-generate thumbnails. We also didn't find another issue on their GitHub mentioning pre-generation of thumbnails. However, with 2 cores and 4 GB of RAM allocated to your instance, generating the thumbnails did not push the load high enough to take the instance down, even though it was temporarily laggy.

To make sure you can generate the thumbnails, we can temporarily increase the resources of your instance so that you can generate all of them (by patiently scrolling through all of the photos in Memories). After all images have loaded once, subsequent reloads should be fast, even with the original resource allocation.

From other messages

(...) We've investigated the hypothesis that this is a matter of misconfiguration, cross-referencing it with the Memories docs, and even checking the admin UI panel.

and

We tried to re-add the images and run files:scan-app-data, but that did not pre-generate the thumbnails either.

pulsejet commented 7 months ago

  1. To pre-generate the thumbnails you need the preview generator app with the cron job set up. https://memories.gallery/config/#recommended-apps
  2. There might be a bug in pre-generation though (this needs to be investigated) regarding the sizes generated. I'm not 100% sure the app pre-generates all the required sizes, but regardless it should be way faster with this (it does correctly generate the large thumbnail, which in turn is used to generate smaller ones on demand if required; this is much faster than generating the initial large preview)
  3. I believe there's a PHP extension that you can install that limits the concurrent preview generation, which might help here. Can't remember which one, but this is a fairly recent feature in the Nextcloud core.

benjaoming commented 7 months ago

@pulsejet thanks for the response! We did ensure that the previewgenerator app was installed and active. I'll see if I can fish out some extra confirmation that its cron job is working without issues.

pulsejet commented 7 months ago

Could very well be a bug then. If you're digging deeper into this, ImageController is where the preview fetching is handled. It first attempts to fetch the preview with multipreview; if no max preview is found then it'll fall back to an individual request.
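The lookup order described here can be modelled roughly as follows. This is a simplified sketch, not the actual ImageController code: the cache is a plain dictionary, and every function name is hypothetical. It only shows the decision between reusing an existing max preview and falling back to an individual request.

```python
from typing import Dict, Iterable

# Hypothetical stand-in for the preview cache: file id -> width of the cached max preview.
CachedMaxPreviews = Dict[int, int]


def scale_from_max(file_id: int, max_width: int, width: int) -> str:
    """Cheap path: downscale a large preview that already exists."""
    return f"file {file_id}: {min(width, max_width)}px scaled from cached {max_width}px preview"


def generate_individually(file_id: int, width: int) -> str:
    """Expensive path: no max preview exists, so start from the original file."""
    return f"file {file_id}: {width}px generated from original"


def fetch_previews(file_ids: Iterable[int], width: int,
                   cache: CachedMaxPreviews) -> Dict[int, str]:
    """Serve each file from its cached max preview, else fall back to an individual request."""
    results: Dict[int, str] = {}
    for file_id in file_ids:
        max_width = cache.get(file_id)
        if max_width is not None:
            results[file_id] = scale_from_max(file_id, max_width, width)
        else:
            results[file_id] = generate_individually(file_id, width)
    return results


if __name__ == "__main__":
    # File 1 was pre-generated, file 2 was not.
    for line in fetch_previews([1, 2], 256, {1: 2048}).values():
        print(line)
```

If pre-generation worked, almost every file should take the cheap path; a timeline that mostly hits the fallback would suggest the max previews are missing.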

> 3. I believe there's a PHP extension that you can install that limits the concurrent preview generation, which might help here. Can't remember which one, but this is a fairly recent feature in the Nextcloud core.

sysvsem -- make sure this extension is installed.