directus / directus

The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.
https://directus.io

Limit file metadata extraction based on byte size #11292

Closed chaosgrid closed 2 years ago

chaosgrid commented 2 years ago

Describe the Bug

When uploading a large .tif file (~200 MB, dimensions: 5179 x 7200), Directus crashes because it runs out of memory (it needs 5 GB+).

Even with all transformations disabled, I don't understand why Directus still tries to process the image at all. I also added these environment variables, but it does not help; the server still crashes with very high memory usage:

ASSETS_TRANSFORM_MAX_CONCURRENT=1
ASSETS_TRANSFORM_MAX_OPERATIONS=0
ASSETS_TRANSFORM_IMAGE_MAX_DIMENSION=1

Is it normal for a 200 MB .tif file with these dimensions to require 5 GB+ of RAM? If it is, would it make sense to limit the transformation process by image file size rather than image dimensions?

You can download the offending file here: https://hivemind.htw-berlin.de/data/s/orzdG6LfytaLBMq

To Reproduce

Upload large .tif image with big dimensions and have less than ~5 GB of free RAM. Try this demo file: https://hivemind.htw-berlin.de/data/s/orzdG6LfytaLBMq

Errors Shown

directus-hc | [43:0x7fb94ca8b340] 65064930 ms: Scavenge (reduce) 4093.7 (4141.6) -> 4093.7 (4142.6) MB, 119.1 / 0.0 ms (average mu = 0.611, current mu = 0.325) allocation failure
directus-hc | [43:0x7fb94ca8b340] 65071048 ms: Mark-sweep (reduce) 4094.7 (4142.6) -> 4094.6 (4143.4) MB, 6056.7 / 0.0 ms (+ 43.9 ms in 16 steps since start of marking, biggest step 5.5 ms, walltime since start of marking 6118 ms) (average mu = 0.428, current mu = 0.0
directus-hc |
directus-hc | <--- JS stacktrace --->
directus-hc |
directus-hc | FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

What version of Directus are you using?

v9.5.0

What version of Node.js are you using?

Whatever is packaged in the Docker image directus/directus:latest

What database are you using?

PostgreSQL 13.5

What browser are you using?

Chrome

What operating system are you using?

Windows

How are you deploying Directus?

Docker (compose)

rijkvanzanten commented 2 years ago

If it is, would it make sense to limit the transformation process by image file size rather than image dimensions?

I think this is actually not caused by the transformation process, as that's only run when you request a thumbnail for the first time, which should then be successfully limited by the dimension check. I think this might be caused by the metadata extraction, as that also has to read the file.

Realistically, the only way is to (like you said) add a new flag to control and prevent metadata extraction of files over X bytes. Reading the metadata (e.g. exif/iptc/etc.) from a 200 MB file will result in a huge amount of memory being used, which is not something we can optimize for (as it's handled by sharp -> libvips).
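
For illustration, a minimal sketch of what such a byte-size guard could look like; FILE_METADATA_MAX_SIZE and extractMetadata are hypothetical names for this sketch, not actual Directus configuration or internals:

```ts
// Hypothetical sketch of such a flag -- FILE_METADATA_MAX_SIZE and
// extractMetadata() are illustrative names, not actual Directus config/internals.
import { stat } from 'node:fs/promises';
import sharp from 'sharp';

// Skip exif/iptc parsing for anything larger than this many bytes (default ~10 MB).
const FILE_METADATA_MAX_SIZE = Number(process.env.FILE_METADATA_MAX_SIZE ?? 10_000_000);

async function extractMetadata(path: string) {
  const { size } = await stat(path);

  if (size > FILE_METADATA_MAX_SIZE) {
    // Too large: record the byte size only and never hand the file to sharp.
    return { filesize: size };
  }

  const { width, height, format } = await sharp(path).metadata();
  return { filesize: size, width, height, format };
}
```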


Nice job on that poster btw, I'm quite fond of the art direction 🙂

chaosgrid commented 2 years ago

Realistically, the only way is to (like you said) add a new flag to control and prevent metadata extraction of files over X bytes. Reading the metadata (e.g. exif/iptc/etc.) from a 200 MB file will result in a huge amount of memory being used, which is not something we can optimize for (as it's handled by sharp -> libvips).

Ah, I see. But 5 GB+ to read metadata from a 200 MB file seems excessive; is this a bug in sharp -> libvips? Directus has settings for the transformation process precisely because of memory considerations, but if a metadata lookup on a 200 MB file causes 5 GB+ of RAM usage, that seems far more problematic, since there is currently no way to control the metadata lookup at all.
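
One way to narrow this down would be a standalone script that performs only the sharp metadata read and reports the heap delta. This is my own sketch, not something shipped with Directus, and the file path handling is purely illustrative:

```ts
// Standalone check (not from this thread): measure how much heap a bare
// sharp metadata read needs for the offending file.
// Save as check-metadata.ts and run with a TS runner such as tsx
// (top-level await requires ESM).
import sharp from 'sharp';

const file = process.argv[2] ?? 'large.tif';

const before = process.memoryUsage().heapUsed;
const metadata = await sharp(file).metadata();
const after = process.memoryUsage().heapUsed;

console.log(`dimensions: ${metadata.width}x${metadata.height} (${metadata.format})`);
console.log(`heap delta: ${((after - before) / 1024 / 1024).toFixed(1)} MB`);
```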

Nice job on that poster btw, I'm quite fond of the art direction 🙂

Thanks, but I didn't create the poster; it's from one of our student teams 🙂

wrynegade commented 2 years ago

I was looking into this, and it looks like the memory footprint of sharp's metadata read for this file is actually quite low: sharp processes the file without delay.

For the file you provided, chaosgrid, there is an enormous array (length 93,000,000+ 😮) inside the metadata, which seems to crush the deepCopy inside UpdateOne. I've tried processing this array in a few different ways, but it's just really, really big.

[Screenshot: Screenshot_select-area_20220328190837]

The app should definitely not crash, but I'm wondering if this metadata is always so large for this kind of file. I'll go hunting for some additional files to try to replicate this phenomenon and see whether this is an outlier case or relatively common.
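
If oversized embedded arrays do turn out to be common, one defensive option (a sketch under my own assumptions, not an actual Directus change) would be to prune any metadata value beyond a size cap before it reaches the item update:

```ts
// Defensive sketch (assumption, not the actual fix): drop any metadata value
// that is an unreasonably large array or buffer before the record update,
// so a single embedded field can't blow up the deep copy of the payload.
const MAX_METADATA_ENTRIES = 10_000; // arbitrary cap, for illustration only

function pruneOversizedMetadata(metadata: Record<string, unknown>): Record<string, unknown> {
  const pruned: Record<string, unknown> = {};

  for (const [key, value] of Object.entries(metadata)) {
    if (Array.isArray(value) && value.length > MAX_METADATA_ENTRIES) continue;
    if (Buffer.isBuffer(value) && value.length > MAX_METADATA_ENTRIES) continue;
    pruned[key] = value;
  }

  return pruned;
}
```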