hoarder-app / hoarder

A self-hostable bookmark-everything app (links, notes and images) with AI-based automatic tagging and full text search
https://hoarder.app
GNU Affero General Public License v3.0
6.59k stars 236 forks

[Bug] Degraded performance when rendering large images in the bookmarkgrid #517

Open Antebios opened 1 month ago

Antebios commented 1 month ago

I have over 15k bookmarks, which took a long while to import. After the import finally finished, UI performance was excruciatingly slow. I waited a few days before trying again, in case a background job was still running, but it was still slow. I even restarted the service, but it was still slow. I have given up using it.

MohamedBassem commented 1 month ago

hey, thanks for creating the issue. However, it would be great if you can give us something to help you with here. Some suggestions:

  1. A screen recording or something showing us the slowness you're facing?
  2. A description of the hardware you're running hoarder on?
  3. You mentioned background jobs, maybe check the admin panel and see if there are more background jobs that are still running?
  4. Screenshots of the cpu utilization of the server you're running hoarder on?

Without any of this info, I'm not sure we can help you unfortunately.

kamtschatka commented 1 month ago

Yeah, this will need more information. I actually added 15k bookmarks to my hoarder dev instance yesterday and I don't see any noticeable slowdown. Since pagination is used, I also don't see why it would be affected that much by the 15k bookmarks. And yes, there are some queries that go through all the data on the server side, but 15k rows should not be much of an issue for the database. So please provide the information above. Are you maybe adding a lot of PDF files to hoarder, and the preview of the files takes too long? Or is it only on the "tags" page? There is an open issue that it takes very long to load, and we will have to improve the performance there.

kissgyorgy commented 1 month ago

Hello! Great app, looks very promising!

I had the same feeling that the UI is very sluggish, but with only 10 bookmarks imported. I just ran a Lighthouse test, and it seems the biggest problem is that huge pictures (screenshots of websites) are loaded on every page. I have CRAWLER_FULL_PAGE_SCREENSHOT=true and CRAWLER_FULL_PAGE_ARCHIVE=true and intend to use Hoarder this way.

Here is a Lighthouse HTML report I just did: lighthouse-report-huge-images.zip There is a 14 MB image. I see it's cached, but it still makes the UI very sluggish even after the first load. When I deleted those articles with huge screenshots, the UI became snappy. I think it's fine if opening individual articles with huge images is slow, I can wait for that, but it shouldn't affect navigation and other pages.

I think the problem is easy to reproduce by just saving some big pages with full-page screenshots.

kamtschatka commented 1 month ago

hi, thanks, that makes a lot of sense. We should probably generate preview images for those cards and only deliver the full image when you actually open a card.

kamtschatka commented 1 month ago

@kissgyorgy would you be able to give me a list of your bookmarks that caused this slowdown? I had a look just now, and it turns out the Next.js image tag already takes care of downloading the correct size of the image for your screen; it was just misconfigured and always assumed you would show the image at full width and height. I have a fix for it now, but would like to confirm and have a look at the other issues the Lighthouse report finds as well.
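
To illustrate why that setting matters, here is a rough sketch of how a browser might choose between differently sized versions of an image once it knows how wide the image slot is. The candidate widths are the ones that appear in the card markup quoted in this thread. The function name `pickCandidate`, the 1440px viewport, and the "smallest candidate that covers the slot" rule are assumptions for illustration; real browsers are free to choose differently.

```typescript
// Widths offered in the srcset of the card image quoted in this thread.
const candidateWidths = [640, 750, 828, 1080, 1200, 1920, 2048, 3840];

/**
 * Pick the smallest candidate that covers the slot.
 * This is a simplification of what browsers do with `srcset` and `sizes`.
 */
function pickCandidate(
  widths: number[],
  viewportPx: number,
  slotVw: number, // the `sizes` value in vw, e.g. 100 for "100vw"
  devicePixelRatio = 1,
): number {
  const neededPx = viewportPx * (slotVw / 100) * devicePixelRatio;
  const sorted = [...widths].sort((a, b) => a - b);
  const match = sorted.find((w) => w >= neededPx);
  // If nothing is large enough, fall back to the largest candidate.
  return match !== undefined ? match : Math.max(...widths);
}

// Slot declared as the full viewport width.
console.log(pickCandidate(candidateWidths, 1440, 100)); // 1920
// Slot declared as roughly a third of the viewport width.
console.log(pickCandidate(candidateWidths, 1440, 33)); // 640
```

Under these assumptions, declaring the slot as a third of the viewport drops the requested width from 1920 to 640 on the same screen.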

kissgyorgy commented 1 month ago

bookmarks.zip If you import this, there will be multiple page screenshots which will be over 10MB

kissgyorgy commented 1 month ago

I think, this is the problematic part:

<a target="_blank" rel="noreferrer" class="h-56 min-h-56 w-full object-cover rounded-t-lg" href="https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html"><div class="relative size-full flex-1"><img alt="card banner" loading="lazy" decoding="async" data-nimg="fill" class="h-56 min-h-56 w-full object-cover rounded-t-lg" sizes="100vw" srcset="/_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=640&amp;q=75 640w, /_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=750&amp;q=75 750w, /_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=828&amp;q=75 828w, /_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=1080&amp;q=75 1080w, /_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=1200&amp;q=75 1200w, /_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=1920&amp;q=75 1920w, /_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=2048&amp;q=75 2048w, /_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=3840&amp;q=75 3840w" src="/_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&amp;w=3840&amp;q=75" style="position: absolute; height: 100%; width: 100%; inset: 0px; color: transparent;"></div></a>

When I click on any of the links in the srcset, it says "Unable to optimize image and unable to fallback to upstream image" and only the one with w=1920 can be opened.
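
For anyone else debugging this: every entry in that srcset wraps the same upstream asset URL and differs only in `w`. A minimal sketch that decodes one of the entries so the upstream route can be opened directly; `parseOptimizerUrl` and the `http://localhost` base are placeholders for illustration, not part of Hoarder or Next.js.

```typescript
// Relative URL copied from the srcset above, with "&amp;" written as "&".
const optimizerUrl =
  "/_next/image?url=%2Fapi%2Fassets%2F83b61e75-05f8-4bb7-bba9-aa94edb22115&w=640&q=75";

interface OptimizerRequest {
  source: string; // the upstream image the optimizer is asked to resize
  width: number;
  quality: number;
}

function parseOptimizerUrl(relativeUrl: string): OptimizerRequest {
  // The base is a placeholder; only the query string matters here.
  const parsed = new URL(relativeUrl, "http://localhost");
  const params = parsed.searchParams; // values are percent-decoded on read
  const source = params.get("url");
  return {
    source: source !== null ? source : "",
    width: Number(params.get("w")),
    quality: Number(params.get("q")),
  };
}

console.log(parseOptimizerUrl(optimizerUrl));
// source is "/api/assets/83b61e75-05f8-4bb7-bba9-aa94edb22115", width 640, quality 75
```

Opening the decoded `source` path on the same host shows whether the original asset is served at all, independent of the resizing step.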

kamtschatka commented 1 month ago

yeah, it is a bit strange. For some images it works and downloads scaled-down versions (e.g. only 640 width instead of 1440 width), which already saves quite a bit of size, but for e.g. the "What Every Computer Scientist Should Know About Floating-Point Arithmetic" URL it does not scale the image down, and it is served as PNG. I think the biggest issue, though, is that there seems to be no way to reduce the height of the delivered image. The image is 47k pixels long, but obviously we are showing about 3 bookmarks below each other, so in any case we would not even show the first 1k pixels and should simply cut off the rest. I have already opened a PR that fixes the issue that we were always requesting the full-size image instead of 1/3rd of the image width, but I'll have to look a bit more into cutting off the bottom of the image.
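
A back-of-the-envelope sketch of how little of such a screenshot a card can show. The 1440 width and 47k height are the figures mentioned above; the 480x224 card (a third of a 1440px viewport, and the `h-56` class read as 224px) and the helper name `visibleSourceRows` are assumptions for illustration only.

```typescript
interface Size {
  width: number;
  height: number;
}

/** Rows of the source image that fit in the card when its width fills the card. */
function visibleSourceRows(source: Size, card: Size): number {
  const scale = source.width / card.width; // source pixels per card pixel
  return Math.min(source.height, Math.round(card.height * scale));
}

const screenshot: Size = { width: 1440, height: 47000 }; // figures from the comment
const card: Size = { width: 480, height: 224 }; // assumed card dimensions

const rows = visibleSourceRows(screenshot, card);
console.log(rows); // 672
console.log(((rows / screenshot.height) * 100).toFixed(1) + "%"); // 1.4%
```

With these numbers, fewer than 700 of the 47,000 source rows can ever be visible in the grid, which is why cropping the height would save far more than scaling the width alone.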

kissgyorgy commented 1 month ago

Another solution might be to not even show the screenshots at all. I would be fine with that, as I just keep them in case other methods couldn't render the page properly.

Jayddo commented 1 month ago

would it make sense to just separate screenshots and thumbnails, and show thumbnails on the grid view?

MohamedBassem commented 1 month ago

@Jayddo we only show the screenshot as a fallback when the link doesn't have any image to show as a thumbnail.

kissgyorgy commented 1 month ago

Can I help with anything on this?

RayBB commented 3 weeks ago

I only have 4 items in my library so far but I always see "Card Banner" and when I try to open any banner image from the source I also get the error about unable to optimize

image

MohamedBassem commented 3 weeks ago

@RayBB mind sharing one of the links?

RayBB commented 3 weeks ago

@MohamedBassem I'd rather not share a link to my instance but I uploaded a redacted har (as txt) here.

https://hoarder.example.com/_next/image?url=%2Fapi%2Fassets%2Fadbe220d-e5de-4bbd-a6e3-052a54054805&w=3840&q=75

hoarder.example.com_Archive [24-11-03 14-36-34].txt

RayBB commented 3 weeks ago

@MohamedBassem and here's the server side error when I make the request:

2024-11-03T19:41:11.548583536Z  ⨯ Error: ENOENT: no such file or directory, open ' /data/assets/acd3c0xfm4a0w8fofx94irt4/adbe220d-e5de-4bbd-a6e3-052a54054805/asset.bin'
2024-11-03T19:41:11.548638856Z     at async open (node:internal/fs/promises:638:25)
2024-11-03T19:41:11.548644256Z     at async Object.readFile (node:internal/fs/promises:1238:14)
2024-11-03T19:41:11.548648536Z     at async Promise.all (index 0)
2024-11-03T19:41:11.548653056Z     at async g (/app/apps/web/.next/server/chunks/815.js:1:1411)
2024-11-03T19:41:11.548656656Z     at async q (/app/apps/web/.next/server/app/api/assets/[assetId]/route.js:1:2200)
2024-11-03T19:41:11.548660336Z     at async /app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:36938
2024-11-03T19:41:11.548664096Z     at async eC.execute (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:27552)
2024-11-03T19:41:11.548667816Z     at async eC.handle (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:38272)
2024-11-03T19:41:11.548671576Z     at async doRender (/app/node_modules/next/dist/server/base-server.js:1345:42)
2024-11-03T19:41:11.548675176Z     at async cacheEntry.responseCache.get.routeKind (/app/node_modules/next/dist/server/base-server.js:1567:28) {
2024-11-03T19:41:11.548679016Z   errno: -2,
2024-11-03T19:41:11.548682256Z   code: 'ENOENT',
2024-11-03T19:41:11.548685576Z   syscall: 'open',
2024-11-03T19:41:11.548688856Z   path: ' /data/assets/acd3c0xfm4a0w8fofx94irt4/adbe220d-e5de-4bbd-a6e3-052a54054805/asset.bin'
2024-11-03T19:41:11.548692496Z }

PS: I'm running this in a docker compose setup.
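
One detail in the log above: the quoted path starts with a space before `/data`. Whether that is the cause is not established here, but a stray space in a configured directory would be enough to produce `ENOENT`. A minimal sketch of a check, assuming the directory comes from a configuration value; `checkDataDir` is a hypothetical helper, not part of Hoarder.

```typescript
/** Report whether a configured directory has leading or trailing whitespace. */
function checkDataDir(raw: string): { ok: boolean; cleaned: string } {
  const cleaned = raw.trim();
  return { ok: cleaned === raw, cleaned };
}

// Path prefix as it appears in the log, including the leading space.
const fromLog = " /data/assets";
const result = checkDataDir(fromLog);

if (!result.ok) {
  console.log(`configured path has stray whitespace; did you mean "${result.cleaned}"?`);
}
```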

RayBB commented 3 weeks ago

I see now when I add a new article the server logs also have a similar issue. Huh...

2024-11-03T19:43:04.368204628Z 2024-11-03T19:43:04.367Z info: [Crawler][30] Will crawl "https://spectrum.ieee.org/touchscreens" for link with id "qdon9r414z2vizqjypq78w8q"
2024-11-03T19:43:04.368513710Z 2024-11-03T19:43:04.368Z info: [Crawler][30] Attempting to determine the content-type for the url https://spectrum.ieee.org/touchscreens
2024-11-03T19:43:04.791903263Z 2024-11-03T19:43:04.791Z info: [search][31] Attempting to index bookmark with id qdon9r414z2vizqjypq78w8q ...
2024-11-03T19:43:04.867329742Z 2024-11-03T19:43:04.867Z info: [search][31] Completed successfully
2024-11-03T19:43:05.051728443Z  ⨯ Error: ENOENT: no such file or directory, open ' /data/assets/acd3c0xfm4a0w8fofx94irt4/bc3a5e44-3c1a-4b45-926c-37913bf310d0/asset.bin'
2024-11-03T19:43:05.051760483Z     at async open (node:internal/fs/promises:638:25)
2024-11-03T19:43:05.051764283Z     at async Object.readFile (node:internal/fs/promises:1238:14)
2024-11-03T19:43:05.051767123Z     at async Promise.all (index 0)
2024-11-03T19:43:05.051769683Z     at async g (/app/apps/web/.next/server/chunks/815.js:1:1411)
2024-11-03T19:43:05.051772403Z     at async q (/app/apps/web/.next/server/app/api/assets/[assetId]/route.js:1:2200)
2024-11-03T19:43:05.051775003Z     at async /app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:36938
2024-11-03T19:43:05.051777723Z     at async eC.execute (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:27552)
2024-11-03T19:43:05.051780403Z     at async eC.handle (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:38272)
2024-11-03T19:43:05.051797003Z     at async doRender (/app/node_modules/next/dist/server/base-server.js:1345:42)
2024-11-03T19:43:05.051800043Z     at async cacheEntry.responseCache.get.routeKind (/app/node_modules/next/dist/server/base-server.js:1567:28) {
2024-11-03T19:43:05.051802803Z   errno: -2,
2024-11-03T19:43:05.051805163Z   code: 'ENOENT',
2024-11-03T19:43:05.051807603Z   syscall: 'open',
2024-11-03T19:43:05.051809963Z   path: ' /data/assets/acd3c0xfm4a0w8fofx94irt4/bc3a5e44-3c1a-4b45-926c-37913bf310d0/asset.bin'
2024-11-03T19:43:05.051812643Z }
2024-11-03T19:43:05.056499463Z  ⨯ Error: ENOENT: no such file or directory, open ' /data/assets/acd3c0xfm4a0w8fofx94irt4/a30c591d-ed73-49be-9eb1-1e17ff90fa10/asset.bin'
2024-11-03T19:43:05.056545783Z     at async open (node:internal/fs/promises:638:25)
2024-11-03T19:43:05.056552703Z     at async Object.readFile (node:internal/fs/promises:1238:14)
2024-11-03T19:43:05.056558023Z     at async Promise.all (index 0)
2024-11-03T19:43:05.056562743Z     at async g (/app/apps/web/.next/server/chunks/815.js:1:1411)
2024-11-03T19:43:05.056567463Z     at async q (/app/apps/web/.next/server/app/api/assets/[assetId]/route.js:1:2200)
2024-11-03T19:43:05.056572424Z     at async /app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:36938
2024-11-03T19:43:05.056578624Z     at async eC.execute (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:27552)
2024-11-03T19:43:05.056583944Z     at async eC.handle (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:38272)
2024-11-03T19:43:05.056588984Z     at async doRender (/app/node_modules/next/dist/server/base-server.js:1345:42)
2024-11-03T19:43:05.056593824Z     at async cacheEntry.responseCache.get.routeKind (/app/node_modules/next/dist/server/base-server.js:1567:28) {
2024-11-03T19:43:05.056598864Z   errno: -2,
2024-11-03T19:43:05.056603264Z   code: 'ENOENT',
2024-11-03T19:43:05.056607704Z   syscall: 'open',
2024-11-03T19:43:05.056612144Z   path: ' /data/assets/acd3c0xfm4a0w8fofx94irt4/a30c591d-ed73-49be-9eb1-1e17ff90fa10/asset.bin'
2024-11-03T19:43:05.056617104Z }
2024-11-03T19:43:05.059977918Z  ⨯ Error: ENOENT: no such file or directory, open ' /data/assets/acd3c0xfm4a0w8fofx94irt4/adbe220d-e5de-4bbd-a6e3-052a54054805/asset.bin'
2024-11-03T19:43:05.060015318Z     at async open (node:internal/fs/promises:638:25)
2024-11-03T19:43:05.060020998Z     at async Object.readFile (node:internal/fs/promises:1238:14)
2024-11-03T19:43:05.060024638Z     at async Promise.all (index 0)
2024-11-03T19:43:05.060028078Z     at async g (/app/apps/web/.next/server/chunks/815.js:1:1411)
2024-11-03T19:43:05.060031238Z     at async q (/app/apps/web/.next/server/app/api/assets/[assetId]/route.js:1:2200)
2024-11-03T19:43:05.060034638Z     at async /app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:36938
2024-11-03T19:43:05.060052518Z     at async eC.execute (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:27552)
2024-11-03T19:43:05.060056838Z     at async eC.handle (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:38272)
2024-11-03T19:43:05.060060358Z     at async doRender (/app/node_modules/next/dist/server/base-server.js:1345:42)
2024-11-03T19:43:05.060063638Z     at async cacheEntry.responseCache.get.routeKind (/app/node_modules/next/dist/server/base-server.js:1567:28) {
2024-11-03T19:43:05.060067158Z   errno: -2,
2024-11-03T19:43:05.060070238Z   code: 'ENOENT',
2024-11-03T19:43:05.060073278Z   syscall: 'open',
2024-11-03T19:43:05.060076398Z   path: ' /data/assets/acd3c0xfm4a0w8fofx94irt4/adbe220d-e5de-4bbd-a6e3-052a54054805/asset.bin'
2024-11-03T19:43:05.060079918Z }
2024-11-03T19:43:05.061126563Z  ⨯ Error: ENOENT: no such file or directory, open ' /data/assets/acd3c0xfm4a0w8fofx94irt4/5bc24a35-2ea2-4a5f-bd5c-29384a551605/asset.bin'
2024-11-03T19:43:05.061145083Z     at async open (node:internal/fs/promises:638:25)
2024-11-03T19:43:05.061150043Z     at async Object.readFile (node:internal/fs/promises:1238:14)
2024-11-03T19:43:05.061153603Z     at async Promise.all (index 0)
2024-11-03T19:43:05.061156723Z     at async g (/app/apps/web/.next/server/chunks/815.js:1:1411)
2024-11-03T19:43:05.061160003Z     at async q (/app/apps/web/.next/server/app/api/assets/[assetId]/route.js:1:2200)
2024-11-03T19:43:05.061163403Z     at async /app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:36938
2024-11-03T19:43:05.061176643Z     at async eC.execute (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:27552)
2024-11-03T19:43:05.061180443Z     at async eC.handle (/app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:38272)
2024-11-03T19:43:05.061184163Z     at async doRender (/app/node_modules/next/dist/server/base-server.js:1345:42)
2024-11-03T19:43:05.061187843Z     at async cacheEntry.responseCache.get.routeKind (/app/node_modules/next/dist/server/base-server.js:1567:28) {
2024-11-03T19:43:05.061191683Z   errno: -2,
2024-11-03T19:43:05.061194723Z   code: 'ENOENT',
2024-11-03T19:43:05.061197963Z   syscall: 'open',
2024-11-03T19:43:05.061201243Z   path: ' /data/assets/acd3c0xfm4a0w8fofx94irt4/5bc24a35-2ea2-4a5f-bd5c-29384a551605/asset.bin'
2024-11-03T19:43:05.061205643Z }
2024-11-03T19:43:06.210525270Z 2024-11-03T19:43:06.210Z info: [Crawler][30] Content-type for the url https://spectrum.ieee.org/touchscreens is "text/html; charset=utf-8"
2024-11-03T19:43:07.556254929Z 2024-11-03T19:43:07.555Z info: [Crawler][30] Successfully navigated to "https://spectrum.ieee.org/touchscreens". Waiting for the page to load ...
2024-11-03T19:43:09.226766523Z 2024-11-03T19:43:09.226Z info: [Crawler][30] Finished waiting for the page to load.
2024-11-03T19:43:09.699540965Z 2024-11-03T19:43:09.697Z info: [Crawler][30] Finished capturing page content and a screenshot. FullPageScreenshot: false
2024-11-03T19:43:09.758585455Z 2024-11-03T19:43:09.752Z info: [Crawler][30] Will attempt to extract metadata from page ...
2024-11-03T19:43:11.858814069Z 2024-11-03T19:43:11.858Z info: [Crawler][30] Will attempt to extract readable content ...
2024-11-03T19:43:13.515570564Z 2024-11-03T19:43:13.515Z info: [Crawler][30] Done extracting readable content.
2024-11-03T19:43:13.537213296Z 2024-11-03T19:43:13.537Z info: [Crawler][30] Stored the screenshot as assetId: ed648e29-1d99-4af1-a9c9-b90753eb5a44
2024-11-03T19:43:13.583610732Z 2024-11-03T19:43:13.583Z info: [Crawler][30] Done extracting metadata from the page.
2024-11-03T19:43:13.583888133Z 2024-11-03T19:43:13.583Z info: [Crawler][30] Downloading image from "https://spectrum.ieee.org/media-library/close-up-angle-of-a-car-s-analog-dashboard-featuring-buttons-knobs-and-a-cd-slot.jpg?id=54089363&width=1200&height=600&coordinates=0%2C270%2C0%2C270"
2024-11-03T19:43:14.314261626Z 2024-11-03T19:43:14.313Z info: [Crawler][30] Downloaded image as assetId: df157e2f-fb90-48ca-a8ca-befd9bf30f04
2024-11-03T19:43:14.334407751Z 2024-11-03T19:43:14.334Z info: [Crawler][30] Completed successfully
2024-11-03T19:43:14.535553403Z 2024-11-03T19:43:14.533Z info: [search][33] Attempting to index bookmark with id qdon9r414z2vizqjypq78w8q ...
2024-11-03T19:43:14.546686370Z 2024-11-03T19:43:14.546Z debug: [inference][32] No inference client configured, nothing to do now
2024-11-03T19:43:14.546806531Z 2024-11-03T19:43:14.546Z info: [inference][32] Completed successfully
2024-11-03T19:43:14.625343863Z 2024-11-03T19:43:14.625Z info: [search][33] Completed successfully

MohamedBassem commented 3 weeks ago

@RayBB Sorry I wasn't clear. I meant I wanted one of the links that you added to Hoarder that doesn't get correctly rendered.

RayBB commented 3 weeks ago

https://spectrum.ieee.org/touchscreens

Antebios commented 3 weeks ago

Here are some screenshots and clips:

image

image

image

image

Recording2024-10-12120538-ezgif com-video-to-gif-converter

ScreenRecording2024-11-03211136-ezgif com-video-to-gif-converter