django-cms / django-filer

File and Image Management Application for django
https://django-filer.readthedocs.io/
Other
1.73k stars 574 forks source link

Restrict maximum image dimensions/resolution #1330

Closed petrklus closed 9 months ago

petrklus commented 1 year ago

With many server instances memory limited, I believe it would be useful to allow for maximum image dimensions to be checked and files rejected if larger.

As the memory usage is not internally checked, it is possible to exhaust the memory available and crash/slow down significantly any other request processing with just a single upload. We have found 6000x6000 pixels to be a safe limit for our case, though this would vary depending on the server resources available.

We have implemented the following monkey-patch to achieve this functionality:

settings.py

FILER_IMAGE_MAX_WIDTH = 6000
FILER_IMAGE_MAX_HEIGHT = 6000

Then we placed a monkey patch into one of our project files:

# Monkey-patch filer 2.2.4
import filer
from filer.models import Image
if filer.__version__ != "2.2.4":
    print("ERROR: Expecting filer version 2.2.4")
else: 
    print("Monkey-patching filer Image clean")
    prev_clean = Image.clean

    def clean(self):
        ret = prev_clean(self) # original            
        try:
            imgfile = self.file.file
        except ValueError:
            imgfile = self.file_ptr.file
        imgfile.seek(0)
        if self.mime_type == 'image/svg+xml':
            pass # noop, SVG image
        else:
            from filer.utils.compatibility import PILImage
            from sentry_sdk import capture_exception
            from django.core.exceptions import ValidationError     

            try:
                width, height = PILImage.open(imgfile).size
                print(width, height)
                if width > settings.FILER_IMAGE_MAX_WIDTH:
                    print("Check failed")
                    raise ValidationError({"file": "Image is too wide. Please refresh the tab to continue."})
                if height > settings.FILER_IMAGE_MAX_HEIGHT:
                    print("Check failed")
                    raise ValidationError({"file": "Image is too tall. Please refresh the tab to continue."})
            except ValidationError as ve:
                raise ve
            except Exception as e:
                print(e)            
                capture_exception(e)
                #raise Exception("Unable to process image: {}".format(e))
                raise ValidationError({
                    "file": "Unspecified error {}".format(e)
                })
        imgfile.seek(0)

        return ret

    Image.clean = clean # re-assign

The above seems to be working fine in the "full" admin view, however, no error is shown when large file is dragged into the file browser popup, nor there is any indication when dragged into the form field directly (the field just re-sets).

To start with though, I believe it would make sense to implement the size check as an option for Filer. Thoughts?

jrief commented 1 year ago

I would rather use a setting which already exists in Django, namely FILE_UPLOAD_MAX_MEMORY_SIZE.

petrklus commented 1 year ago

I would rather use a setting which already exists in Django, namely FILE_UPLOAD_MAX_MEMORY_SIZE.

What about image files that are small in size but high in resolution? Yes, there is generally a strong correlation between file size and resolution, however, you can have a pretty small file with huge dimensions. And when thumbnails are generated, it's the number of pixels that consumers memory. @jrief do you know what I mean? Let me know please if I am missing anything

petrklus commented 1 year ago

Perhaps rather than restricting width/height as I am doing now, it would make more sense to just have a single pixel count limit and check every image upload against it, I just find it easier to explain to editors that they have a certain size square canvas into which all they graphics need to fit in.

jrief commented 1 year ago

The problem with that approach is, that before you can start counting pixels / measuring the size, the complete image has to be uploaded. This can consume considerable memory inside Pillow. It usually is safer to restrict the payload, rather than the dimensions.

Moreover, images with low entropy but many pixels require less memory than images with high entropy but less pixels. For a user usually is easier to determine the size of a file rather than its dimensions.

petrklus commented 1 year ago

Hm, that is not my experience so far - I've checked memory usage with pillow to get size only and it's not loading the images in. That's why my fix is working, I get basically instant file rejection if it has larger dimensions right after the upload completes. Only once the resize operation starts the image gets loaded into memory.

This test illustrates when exactly PIL starts consuming memory:

def print_mem():
    import os, psutil
    process = psutil.Process(os.getpid())
    mb = process.memory_info().rss/1024.0/1024.0
    print(mb)  # in Mbytes 
    return mb

from PIL import Image as PILImage

print_mem()
x = PILImage.open("test_image_6000x6000.jpg")
print(x.size)
print_mem()
x.rotate(90, resample=0, expand=0, center=None, translate=None, fillcolor=None)
print_mem()

For the above, I get the following output:

50.765625 
(6000, 6000)
51.90625
327.71875

Seems that only after the rotate action starts, the image gets actually loaded into memory. Querying the image size seems "cheap".

petrklus commented 1 year ago

Moreover, images with low entropy but many pixels require less memory than images with high entropy but less pixels. For a user usually is easier to determine the size of a file rather than its dimensions.

Also, on this point - do you mean upload file size or memory allocated during thumbnailing? I am not sure the image entropy has much of an impact (if any) on the memory footprint of a resize operation. I have not tested it, just a hunch.

petrklus commented 1 year ago

@jrief have you seen, in your experience, the image size query to take up memory? If so, could you please share which platform / pillow version etc.?

petrklus commented 1 year ago

@jrief just a bump that I will proceed with creating a PR for the maximum dimensions feature, it's a feature pressingly need and would like to move away from monkey-patching filer

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

fsbraun commented 12 months ago

@petrklus I'd appreciate a PR on this. With filer 3.0 file size can be limited. Why not have both limitations (memory size and canvas size)?