matthewwithanm / django-imagekit

Automated image processing for Django. Currently v4.0
http://django-imagekit.rtfd.org/
BSD 3-Clause "New" or "Revised" License

Slow page rendering when using S3 #568

Closed silentjay closed 6 months ago

silentjay commented 6 months ago

My page loads take between 2.5s and 7s, depending on how many thumbnails are on the page. The thumbnail file cache is stored on Backblaze (S3) via storages.

Here's the scenario:

Profiling via Sentry, I can see that the latency comes from a large number of S3 calls, and the number of calls grows with the number of thumbnails I'm loading.


I've gone through all the past issues and changed my settings what feels like a hundred times, yet I can't solve this issue. Here are my current settings:

CACHES = {
    "default": {
        "BACKEND": "diskcache.DjangoCache",
        "LOCATION": BASE_DIR / "django_file_cache",
        "TIMEOUT": None,
        # ^-- Django setting for default timeout of each key.
        "SHARDS": 8,
        "DATABASE_TIMEOUT": 0.010,  # 10 milliseconds
        # ^-- Timeout for each DjangoCache database transaction.
        "OPTIONS": {"size_limit": 2**30},  # 1 gigabyte
    },
}
IMAGEKIT_DEFAULT_CACHEFILE_STRATEGY = "imagekit.cachefiles.strategies.Optimistic"
IMAGEKIT_CACHE_BACKEND = "default"

I have also tried swapping the cache to locmem to test if it was an issue with the cache:

CACHES = {
    "default": {
              ...
    },
    "imagekit": {"BACKEND": "django.core.cache.backends.locmem.LocMemCache", "TIMEOUT": None},
}
IMAGEKIT_DEFAULT_CACHEFILE_STRATEGY = "imagekit.cachefiles.strategies.Optimistic"
IMAGEKIT_CACHE_BACKEND = "imagekit"

But the result was exactly the same. I've also tried running generateimages during deployment, with no change. I'm completely out of ideas at this point.

silentjay commented 6 months ago

I need to do a little more testing but I think I've nailed it.

I noticed that one page with imagekit-generated thumbnails seemed to be immune to slow page loading, regardless of changes to my settings. I decided to analyse what the difference was and noticed that on the slow pages I was calling width and height on the thumbnails. This looks like what was causing the extra S3 network calls and hence the slow page rendering times; removing these calls fixed the issue.
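The effect described above can be illustrated with a deliberately simplified model. This is only a sketch, not imagekit's or Django's actual code: `CountingStorage`, `Thumbnail` and `read_dimensions` are hypothetical names, and the storage is a plain in-memory stand-in. The assumption it encodes is that building a URL needs no remote access, while reading dimensions means opening the remote file. Real libraries may cache dimensions per file object, which this toy model does not do.

```python
class CountingStorage:
    """Hypothetical stand-in for a remote storage backend; counts simulated reads."""

    def __init__(self, files):
        self.files = files          # maps file name -> (width, height)
        self.remote_reads = 0

    def url(self, name):
        # Building a URL is pure string work: no network call is simulated.
        return f"https://example.invalid/{name}"

    def read_dimensions(self, name):
        # Simulates opening the remote file to inspect its header.
        self.remote_reads += 1
        return self.files[name]


class Thumbnail:
    """Hypothetical thumbnail wrapper; not imagekit's cache file class."""

    def __init__(self, storage, name):
        self.storage = storage
        self.name = name

    @property
    def url(self):
        return self.storage.url(self.name)

    @property
    def width(self):
        return self.storage.read_dimensions(self.name)[0]

    @property
    def height(self):
        return self.storage.read_dimensions(self.name)[1]


storage = CountingStorage({f"thumb{i}.jpg": (200, 150) for i in range(20)})
thumbs = [Thumbnail(storage, name) for name in storage.files]

# Rendering only URLs triggers no simulated remote reads.
urls = [t.url for t in thumbs]
reads_after_urls = storage.remote_reads

# Rendering width and height triggers one simulated read per attribute access.
sizes = [(t.width, t.height) for t in thumbs]
reads_after_sizes = storage.remote_reads
```

In this model a page with 20 thumbnails costs nothing when only URLs are rendered, but every width or height access adds a round trip, which matches the pattern of per-thumbnail S3 calls seen in the profile.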

Before I close this, do you think it's safe to continue using LocMemCache in production to store the image cache state? I know there are warnings in the Django docs about using it in production, but it seems like it could be a nice, fast way of storing this data, and even for a few thousand images it wouldn't take up much memory as long as I use one process.

vstoykov commented 6 months ago

Hey @silentjay, thanks for your investigation. There were some old S3-related issues, #325 and #256. I still can't help much with that because I haven't worked on Python projects for some time; I'm mainly triaging issues, merging PRs and releasing new versions.

I know that there is still room for improvement on that front, and if someone is willing to do the work, it would benefit django-imagekit users.

As for your question about LocMemCache, the right answer is "it depends". The general recommendation is not to use it, because if you have multiple instances the cache is not shared between them, and because the cache is cleared whenever you restart your service. If neither is a concern for your use case, you can continue to use it (it simplifies deployment a lot and is actually pretty fast, because there is no inter-process communication). When the need to scale arises, or the slowness after a restart or new deployment is no longer acceptable, you will need to use another cache implementation.
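The two caveats above (no sharing between instances, and loss of state on restart) can be shown with a toy model. This is a sketch under stated assumptions, not Django's LocMemCache: `WorkerLocalCache` and `thumbnail_exists` are hypothetical names, and appending to a list stands in for a slow remote existence check.

```python
class WorkerLocalCache:
    """Toy model of a per-process in-memory cache (not Django's implementation)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


def thumbnail_exists(cache, name, storage_checks):
    """Return the cached state, falling back to a counted storage check."""
    state = cache.get(name)
    if state is None:
        storage_checks.append(name)   # stands in for a slow remote lookup
        state = True
        cache.set(name, state)
    return state


checks = []
worker_a = WorkerLocalCache()
worker_b = WorkerLocalCache()

thumbnail_exists(worker_a, "thumb.jpg", checks)  # miss: falls back to storage
thumbnail_exists(worker_a, "thumb.jpg", checks)  # hit: served from memory
thumbnail_exists(worker_b, "thumb.jpg", checks)  # miss: worker_b has its own cache

worker_a = WorkerLocalCache()                    # simulates a restart or redeploy
thumbnail_exists(worker_a, "thumb.jpg", checks)  # miss again: state was lost

total_checks = len(checks)
```

With a single long-running process only the first request per thumbnail pays the lookup cost; each extra worker and each restart repeats it, which is the trade-off described above.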

silentjay commented 6 months ago

Sadly, I wouldn't know where to even start on a fix for this; it's above my pay grade. For now I'm happy enough to avoid the issue by not calling an image's width and height attributes.

Thanks for the advice regarding the cache. If it starts becoming an issue on new deploys in the future, I'll look at moving it to persistent storage outside the Docker container. I'll close this issue now.