matthewwithanm / django-imagekit

Automated image processing for Django. Currently v4.0
http://django-imagekit.rtfd.org/
BSD 3-Clause "New" or "Revised" License
2.24k stars 276 forks source link

Accessing ImageSpecField url is extremely slow #256

Open Proper-Job opened 10 years ago

Proper-Job commented 10 years ago

Hi I use ImageSpecFields to create renditions for ModelAdmin image fields. When I access the url property of the ImageSpecField I notice an enormous delay. On a production machine accessing 11 url properties will take a total of 1.7 seconds, slowing down the response time by a factor of 20. I have really narrowed it down, so I can confidently say that whatever happens behind the scenes when I access the ImageSpecField's url property is eating up that time.

Do you have any idea how I can speed up things. I already use db caching which gave me an improvement of about .5 seconds from 2.2 seconds down to 1.7 seconds.

Any help would be greatly appreciated.

I use the following settings:

DEBUG = False
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        'LOCATION': 'my_cache_table',
        'TIMEOUT': 157680000, # 5 years
    }
}
IMAGEKIT_DEFAULT_CACHEFILE_STRATEGY = 'imagekit.cachefiles.strategies.Optimistic'
IMAGEKIT_DEFAULT_CACHEFILE_BACKEND = 'imagekit.cachefiles.backends.Simple'
IMAGEKIT_CACHE_BACKEND = 'default'
matthewwithanm commented 10 years ago

There must be some network operation going on, though there shouldn't be if you're using the optimistic strategy. Can you do any logging to see what that might be? Also, are you sure it's the "url" property and not the dimensions? Those aren't currently cached so they can get very expensive.

Proper-Job commented 10 years ago

Hi Matt Thank you very much for you prompt reply.

What makes you think that there must be a Network operation going on?
I use the default file storage so the images reside locally. I return the url property (definitely url, I don't use the dimension anywhere) as part of a JSON response. Once I remove the url from the JSON response the reponse time drops to a few milliseconds. I'm not sure where and what I should be logging.

Thank you and greetings Moritz

matthewwithanm commented 10 years ago

Sorry, I was just assuming that it must have been network latency with that kind of slowness. Do you think you could use something like django-debug-toolbar to see where the time is being spent?

Proper-Job commented 10 years ago

I have already spent a fair amount poking around with the debug toolbar. The application is deployed on a Windows 2008 Server with Apache 2.2. I'm having some problems with the debug toolbar and this setup. But have a look at the attached profiling output. It's not solving the issue for me atm. Thank you and greetings Moritz debug

Proper-Job commented 10 years ago

Hi Matt After studying the profiling I attached yesterday a bit more closely I noticed that a lot of CPU time is being spent in the init method of ImageCacheFile. The implementation is quite trivial however. Do you have a hunch why the CPU time is being spent in that method?

Proper-Job commented 10 years ago

I'm starting to seriously suspect the Windows machine as being the culprit. Same setup on my development box is giving me much better performance.

matthewwithanm commented 10 years ago

It must be the hasher.

In order to identify images, IK uses filenames which, in the case of specs, are generated by creating a hash of the source filename, the processors, and a few other properties using Pickle. There are a few ways you can override this behavior:

For the sake of troubleshooting, we should probably try the first option to get as close to the suspected cause (the hasher) as possible. It should probably still use the source's filename though so we can eliminate that as a contributing factor. (It shouldn't be if the sources are regular Django files.)

Do you think you can try that and report back? I'm sorry the project is giving you a rough time but I appreciate your help in figuring this out!

Proper-Job commented 10 years ago

Hi Matt Thanks a lot. I will try it out and report back. Greetings Moritz

Proper-Job commented 10 years ago

Hi Matt I'm a bit stumped right now how I could optimize the get_hash implementation. My initial thought was to just md5 the source name and the generator id or any 2 values unique to that ImageSpec. I'm not sure how to get to those from within the get_hash implementation.

I see that you're pickling a number of properties and then hashing that pickle. Is that why it's taking so long (on Windows)?

Thank you and greetings Moritz

matthewwithanm commented 10 years ago

@Alp-Phone Sorry I wasn't very clear!

Is that why it's taking so long (on Windows)?

That's what I mean to find out. I don't want you to have to permanently change the get_hash implementation (at least not yet)—I'm just asking you to change it temporarily to something that doesn't use the Pickle hasher. That way we can either pinpoint it as the cause or rule it out. For the sake of this test, md5ing the source and a property or two should be fine. You can access the source using self.source.

Proper-Job commented 10 years ago

Hi Matt

So I finally got around to testing the performance with a custom get_hash implementation. The results are staggering. Using a custom ImageSpec get_hash implementation that does a simple string concatenation (no pickling, hashing) the response is 6x faster than with the standard implementation (see the ab results below). Specifically the time per request drops from 1114.194 [ms] to 181.031 [ms].

def get_hash(self):
    return self.source.name + ' - ' + 'slideshow_phone_normal'

First run (unmodified get_hash)

Cachfile strategy: 'imagekit.cachefiles.strategies.Optimistic' Unmodified get_hash implementation

→ ab -n 30 "http://my-domain.com/api/1.0/resorts/detail?resort_id=4&device_type=iphone&language_identifier=de" This is ApacheBench, Version 2.3 <$Revision: 655654 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking my-domain.com (be patient).....done

Server Software: Apache/2.2.24 Server Hostname: my-domain.com Server Port: 80

Document Path: /api/1.0/resorts/detail?resort_id=4&device_type=iphone&language_identifier=de Document Length: 11422 bytes

Concurrency Level: 1 Time taken for tests: 33.426 seconds Complete requests: 30 Failed requests: 0 Write errors: 0 Total transferred: 355680 bytes HTML transferred: 342660 bytes Requests per second: 0.90 #/sec Time per request: 1114.194 ms Time per request: 1114.194 [ms](mean, across all concurrent requests) Transfer rate: 10.39 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 982 1113 204.3 1077 2168 Processing: 0 1 2.6 0 13 Waiting: 0 0 0.0 0 0 Total: 982 1114 204.1 1077 2168

Percentage of the requests served within a certain time (ms) 50% 1077 66% 1091 75% 1101 80% 1105 90% 1182 95% 1224 98% 2168 99% 2168 100% 2168 (longest request)

Second run (modified get_hash)

Cachfile strategy: 'imagekit.cachefiles.strategies.Optimistic' Modified get_hash() to:

def get_hash(self):
    return self.source.name + ' - ' + 'slideshow_phone_normal'

→ ab -n 30 "http://my-domain.com/api/1.0/resorts/detail?resort_id=4&device_type=iphone&language_identifier=de" This is ApacheBench, Version 2.3 <$Revision: 655654 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking my-domain.com (be patient).....done

Server Software: Apache/2.2.24 Server Hostname: my-domain.com Server Port: 80

Document Path: /api/1.0/resorts/detail?resort_id=4&device_type=iphone&language_identifier=de Document Length: 11604 bytes

Concurrency Level: 1 Time taken for tests: 5.431 seconds Complete requests: 30 Failed requests: 0 Write errors: 0 Total transferred: 361140 bytes HTML transferred: 348120 bytes Requests per second: 5.52 #/sec Time per request: 181.031 ms Time per request: 181.031 [ms](mean, across all concurrent requests) Transfer rate: 64.94 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 26 27 1.1 27 32 Processing: 148 154 4.5 153 170 Waiting: 89 92 3.3 91 107 Total: 176 181 4.5 181 196

Percentage of the requests served within a certain time (ms) 50% 181 66% 182 75% 182 80% 183 90% 187 95% 189 98% 196 99% 196 100% 196 (longest request)

matthewwithanm commented 10 years ago

This is great data; thank you. Can to do me a favor and add one more thing to your benchmarks? A version of get_hash that's the same as the original but doesn't include the processors:

    def get_hash(self):
        return hashers.pickle([
            self.source.name,
            self.format,
            self.options,
            self.autoconvert,
        ])

Also, can you remove all but one image from the page so we can be sure caching isn't coming into play? Finally, which processors are you using?

Thanks again for digging!

Proper-Job commented 10 years ago

Hi Matt

I just edited this comment because I am revising the bench mark data. Now I get results more along the lines of what we would expect. Perhaps all other things were not equal in my previous runs. See the bench mark results below:

I set up a new feed that only returns the url for a single image.

{
 "url": "https://my-domain.com/media/CACHE/images/images/HAP103264I/0d1ca7094aacd5ffb626294fbd787491.jpg"
}

The cache file strategy is still 'imagekit.cachefiles.strategies.Optimistic'.

The first run bench marks the unmodified get_hash implementation and the second run bench marks the customized get_hash implementation (see custom get_hash and processors below).

from imagekit import hashers
class Slideshow(ImageSpec):
    """
    All tablet renditions are 4:3
    All phone renditions are 3:2
    """

    format = 'JPEG'
    options = {'quality': DEFAULT_JPEG_QUALITY}

    @property
    def processors(self):
        model, field_name = get_field_info(self.source)
        image_field = getattr(model, field_name)
        width = self.width
        height = self.height
        if image_field.height > image_field.width:
            width = self.height
            height = self.width
        return [ResizeToFit(width=width, height=height, anchor=Anchor.CENTER, upscale=True)]

# Phone

class SlideshowPhoneNormal(Slideshow):
    width = 480
    height = 320

    def get_hash(self):
        return hashers.pickle([
            self.source.name,
            self.format,
            self.options,
            self.autoconvert,
        ])

register.generator('backend:slideshow_phone_normal', SlideshowPhoneNormal)

First run (unmodified get_hash)

→ ab -n 100 "http://my-domain.com/api/1.0/imagekit_test?resort_id=4&device_type=iphone&language_identifier=de" This is ApacheBench, Version 2.3 <$Revision: 655654 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking my-domain.com (be patient).....done

Server Software: Apache/2.2.24 Server Hostname: my-domain.com Server Port: 80

Document Path: /api/1.0/imagekit_test?resort_id=4&device_type=iphone&language_identifier=de Document Length: 119 bytes

Concurrency Level: 1 Time taken for tests: 51.565 seconds Complete requests: 100 Failed requests: 0 Write errors: 0 Total transferred: 53700 bytes HTML transferred: 11900 bytes Requests per second: 1.94 #/sec Time per request: 515.653 ms Time per request: 515.653 [ms](mean, across all concurrent requests) Transfer rate: 1.02 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 206 395 69.1 398 684 Processing: 0 121 77.2 72 249 Waiting: 0 121 77.2 72 249 Total: 290 516 108.8 530 684

Percentage of the requests served within a certain time (ms) 50% 530 66% 614 75% 619 80% 629 90% 636 95% 643 98% 680 99% 684 100% 684 (longest request)

Second run (modified get_hash)

→ ab -n 100 "http://my-domain.com/api/1.0/imagekit_test?resort_id=4&device_type=iphone&language_identifier=de" This is ApacheBench, Version 2.3 <$Revision: 655654 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking my-domain.com (be patient).....done

Server Software: Apache/2.2.24 Server Hostname: my-domain.com Server Port: 80

Document Path: /api/1.0/imagekit_test?resort_id=4&device_type=iphone&language_identifier=de Document Length: 119 bytes

Concurrency Level: 1 Time taken for tests: 8.253 seconds Complete requests: 100 Failed requests: 0 Write errors: 0 Total transferred: 53700 bytes HTML transferred: 11900 bytes Requests per second: 12.12 #/sec Time per request: 82.531 ms Time per request: 82.531 [ms](mean, across all concurrent requests) Transfer rate: 6.35 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 26 28 3.0 27 50 Processing: 50 54 5.0 53 79 Waiting: 49 54 5.0 53 78 Total: 77 82 6.4 81 118

Percentage of the requests served within a certain time (ms) 50% 81 66% 82 75% 83 80% 84 90% 90 95% 98 98% 107 99% 118 100% 118 (longest request)

Proper-Job commented 10 years ago

Is there anything else I can do?

matthewwithanm commented 10 years ago

Sorry, @Alp-Phone! I've been extremely busy. (Plus GitHub doesn't notify when you update comments.) I'll try to look at it soon. What might be helpful is to run some benchmarks that don't actually go through the whole request-response cycle, instead just targeting the URL generation. Also, could you see how long it takes, on average, to run hashers.pickle([ResizeToFit(width=width, height=height, anchor=Anchor.CENTER, upscale=True)])? Sorry, again and thanks for all the help!

Proper-Job commented 10 years ago

Sure thing. Will let you know once I got some results.

Proper-Job commented 10 years ago

Hi Matt

So I finally got around to profiling the url generation outside of the request / response cycle.

A word about the results: The modified get_hash implementation is 36% faster. These tests were run on my MacBook Pro development machine. Originally I addressed this problem because we were deploying on Windows Server 2008 R2 and were experiencing extreme slowness. My gut feeling is that the difference on Unix like OSes is not so drastic.

I used a django managment command and profilehooks to profile execution time. I made 999 iterations of the url generation to create some bigger numbers for us to look at. I did 2 runs, the first with an unmodified get_hash implementation on the ImageSpec. The second run has the custom get_hash implementation you suggested:

Management command

# -*- coding: utf-8 -*-
from django.core.management.base import BaseCommand
from backend.models import Resort
from profilehooks import profile

@profile(dirs=True, entries=100)
def get_url(slideshow_image):
    for x in xrange(1,1000):
        slideshow_image.slideshow_phone_normal.url

class Command(BaseCommand):
    help = 'Benchmarks ImageKit ImageSpec url generation outside of the request response cycle'

    def handle(self, *args, **options):
        resort = Resort.objects.get(pk=3)
        slideshow_image = resort.slideshow_images.all()[0]      
        get_url(slideshow_image)

Custom get_hash

from imagekit import hashers
class Slideshow(ImageSpec):
    """
    All tablet renditions are 4:3
    All phone renditions are 3:2
    """

    format = 'JPEG'
    options = {'quality': DEFAULT_JPEG_QUALITY}

    @property
    def processors(self):
        model, field_name = get_field_info(self.source)
        image_field = getattr(model, field_name)
        width = self.width
        height = self.height
        if image_field.height > image_field.width:
            width = self.height
            height = self.width
        return [ResizeToFit(width=width, height=height, anchor=Anchor.CENTER, upscale=True)]

# Phone

class SlideshowPhoneNormal(Slideshow):
    width = 480
    height = 320

    def get_hash(self):
        return hashers.pickle([ResizeToFit(width=self.width, height=self.height, anchor=Anchor.CENTER, upscale=True)])

First run (unmodified get_hash)

*** PROFILER RESULTS ***

get_url (/frontend/management/commands/imagekittest.py:6)
function called 1 times

         567504 function calls (540333 primitive calls) in 0.729 seconds

   Ordered by: cumulative time, internal time, call count
   List reduced from 419 to 100 due to restriction <100>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.003    0.003    0.729    0.729 /frontend/management/commands/imagekittest.py:6(get_url)
      999    0.006    0.000    0.643    0.001 /imagekit/models/fields/utils.py:10(__get__)
      999    0.011    0.000    0.579    0.001 /imagekit/cachefiles/__init__.py:20(__init__)
      999    0.006    0.000    0.552    0.001 /imagekit/specs/__init__.py:90(cachefile_name)
      999    0.015    0.000    0.530    0.001 /imagekit/cachefiles/namers.py:13(source_name_as_path)
      999    0.005    0.000    0.468    0.000 /imagekit/specs/__init__.py:132(get_hash)
      999    0.007    0.000    0.373    0.000 /imagekit/hashers.py:28(pickle)
      999    0.001    0.000    0.358    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:220(dump)
24975/999    0.077    0.000    0.356    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:269(save)
 1998/999    0.006    0.000    0.351    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:591(save_list)
 1998/999    0.007    0.000    0.339    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:608(_batch_appends)
      999    0.005    0.000    0.209    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:345(save_reduce)
     1998    0.008    0.000    0.152    0.000 /imagekit/hashers.py:18(save_dict)
     1998    0.010    0.000    0.123    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:655(_batch_setitems)
    17982    0.053    0.000    0.096    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:227(memoize)
      999    0.007    0.000    0.086    0.000 /backend/imagegenerators.py:148(processors)
      999    0.001    0.000    0.083    0.000 /imagekit/cachefiles/__init__.py:82(url)
      999    0.004    0.000    0.082    0.000 /imagekit/cachefiles/__init__.py:72(_storage_attr)
     7068    0.029    0.000    0.073    0.000 {__import__}
     7992    0.015    0.000    0.068    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:480(save_string)
      999    0.001    0.000    0.066    0.000 /django/core/files/images.py:18(_get_height)
     1998    0.009    0.000    0.065    0.000 /django/core/files/images.py:22(_get_image_dimensions)
        1    0.000    0.000    0.056    0.056 /django/core/files/images.py:29(get_image_dimensions)
        5    0.000    0.000    0.055    0.011 /PIL/ImageFile.py:316(feed)
      999    0.005    0.000    0.055    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:532(save_tuple)
        5    0.000    0.000    0.055    0.011 /PIL/Image.py:1951(open)
        4    0.000    0.000    0.053    0.013 /PIL/Image.py:348(init)
      999    0.002    0.000    0.053    0.000 /imagekit/specs/__init__.py:232(get_spec)
      999    0.003    0.000    0.050    0.000 /imagekit/registry.py:32(get)
    57942    0.048    0.000    0.048    0.000 {method 'write' of 'cStringIO.StringO' objects}
      999    0.002    0.000    0.046    0.000 /django/core/files/storage.py:265(url)
     2997    0.014    0.000    0.045    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:730(save_global)
      999    0.004    0.000    0.045    0.000 /imagekit/specs/__init__.py:86(__init__)
      999    0.003    0.000    0.040    0.000 /imagekit/specs/__init__.py:37(__init__)
        8    0.000    0.000    0.039    0.005 /re.py:188(compile)
        8    0.000    0.000    0.039    0.005 /re.py:226(_compile)
        8    0.000    0.000    0.039    0.005 /sre_compile.py:496(compile)
     3996    0.011    0.000    0.037    0.000 /imagekit/utils.py:27(get_by_qname)
        1    0.000    0.000    0.035    0.035 /PIL/XbmImagePlugin.py:22(<module>)
        8    0.000    0.000    0.035    0.004 /sre_parse.py:676(parse)
     24/8    0.000    0.000    0.035    0.004 /sre_parse.py:302(_parse_sub)
     24/8    0.033    0.001    0.035    0.004 /sre_parse.py:380(_parse)
     2997    0.004    0.000    0.033    0.000 /imagekit/utils.py:49(get_singleton)
35014/34015    0.020    0.000    0.032    0.000 {getattr}
      999    0.006    0.000    0.030    0.000 /django/dispatch/dispatcher.py:147(send)
      999    0.011    0.000    0.030    0.000 /Python.framework/Versions/2.7/lib/python2.7/urlparse.py:250(urljoin)
      999    0.010    0.000    0.022    0.000 /imagekit/cachefiles/backends.py:13(get_default_cachefile_backend)
     4029    0.006    0.000    0.021    0.000 /django/utils/importlib.py:18(import_module)
    17982    0.016    0.000    0.020    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:250(put)
      999    0.010    0.000    0.019    0.000 {method '__reduce_ex__' of 'object' objects}
      999    0.003    0.000    0.015    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:492(save_unicode)
     3996    0.006    0.000    0.015    0.000 /imagekit/specs/__init__.py:97(source)
      999    0.003    0.000    0.015    0.000 /imagekit/cachefiles/strategies.py:37(__init__)
      999    0.003    0.000    0.015    0.000 /pilkit/utils.py:120(suggest_extension)
     1998    0.006    0.000    0.014    0.000 /Python.framework/Versions/2.7/lib/python2.7/urlparse.py:136(urlparse)
      999    0.002    0.000    0.013    0.000 /django/utils/encoding.py:218(filepath_to_uri)
    55988    0.013    0.000    0.013    0.000 {method 'get' of 'dict' objects}
      999    0.001    0.000    0.013    0.000 /imagekit/registry.py:52(existence_required_receiver)
     1004    0.007    0.000    0.012    0.000 /python2.7/posixpath.py:321(normpath)
      999    0.003    0.000    0.012    0.000 /imagekit/registry.py:55(_receive)
     1998    0.007    0.000    0.011    0.000 /django/db/models/fields/files.py:151(__get__)
      999    0.008    0.000    0.010    0.000 /python2.7/copy_reg.py:59(_reduce_ex)
     5005    0.005    0.000    0.009    0.000 /django/core/files/base.py:32(__nonzero__)
    63936    0.009    0.000    0.009    0.000 {id}
     1999    0.006    0.000    0.009    0.000 /python2.7/posixpath.py:68(join)
     1998    0.009    0.000    0.009    0.000 {sorted}
     1998    0.002    0.000    0.009    0.000 /python2.7/posixpath.py:104(splitext)
      999    0.004    0.000    0.009    0.000 /django/dispatch/dispatcher.py:212(_live_receivers)
    28971    0.009    0.000    0.009    0.000 {repr}
      999    0.001    0.000    0.008    0.000 /imagekit/utils.py:135(call_strategy_method)
      999    0.001    0.000    0.007    0.000 /pilkit/utils.py:104(format_to_extension)
     1998    0.005    0.000    0.007    0.000 /python2.7/genericpath.py:85(_splitext)
     2997    0.004    0.000    0.007    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:437(save_int)
    15385    0.007    0.000    0.007    0.000 {isinstance}
      999    0.002    0.000    0.007    0.000 /django/utils/encoding.py:138(force_bytes)
     1999    0.005    0.000    0.006    0.000 {method 'encode' of 'unicode' objects}
     4002    0.004    0.000    0.006    0.000 /django/conf/__init__.py:51(__getattr__)
     1998    0.002    0.000    0.005    0.000 /django/dispatch/dispatcher.py:9(_make_id)
9007/9006    0.005    0.000    0.005    0.000 {hasattr}
    24975    0.005    0.000    0.005    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:333(persistent_id)
     1998    0.003    0.000    0.005    0.000 /Python.framework/Versions/2.7/lib/python2.7/urlparse.py:167(urlsplit)
     5005    0.004    0.000    0.004    0.000 /django/core/files/base.py:29(__bool__)
23357/23314    0.004    0.000    0.004    0.000 {len}
      999    0.002    0.000    0.004    0.000 /Python.framework/Versions/2.7/lib/python2.7/urllib.py:1245(quote)
        1    0.000    0.000    0.004    0.004 /pilkit/utils.py:28(_preinit_pil)
        6    0.002    0.000    0.004    0.001 /PIL/Image.py:313(preinit)
        8    0.000    0.000    0.004    0.000 /sre_compile.py:481(_code)
     3997    0.003    0.000    0.003    0.000 {method 'rindex' of 'str' objects}
     1998    0.002    0.000    0.003    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:430(save_bool)
     4006    0.003    0.000    0.003    0.000 {method 'startswith' of 'unicode' objects}
     1998    0.002    0.000    0.003    0.000 <string>:8(__new__)
     1998    0.001    0.000    0.003    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:426(save_none)
     65/8    0.001    0.000    0.003    0.000 /sre_compile.py:33(_compile)
     8183    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
      999    0.003    0.000    0.003    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:173(__init__)
     1000    0.001    0.000    0.003    0.000 /django/utils/functional.py:202(inner)
      999    0.002    0.000    0.002    0.000 {_hashlib.openssl_md5}
     3996    0.002    0.000    0.002    0.000 {method 'rfind' of 'unicode' objects}
     2005    0.002    0.000    0.002    0.000 {method 'split' of 'str' objects}
      999    0.000    0.000    0.002    0.000 /imagekit/utils.py:58(autodiscover)

Second run (modified get_hash)

*** PROFILER RESULTS ***

get_url (/frontend/management/commands/imagekittest.py:6)
function called 1 times

         430249 function calls (412261 primitive calls) in 0.536 seconds

   Ordered by: cumulative time, internal time, call count
   List reduced from 188 to 100 due to restriction <100>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.003    0.003    0.536    0.536 /frontend/management/commands/imagekittest.py:6(get_url)
      999    0.005    0.000    0.451    0.000 /imagekit/models/fields/utils.py:10(__get__)
      999    0.011    0.000    0.387    0.000 /imagekit/cachefiles/__init__.py:20(__init__)
      999    0.005    0.000    0.361    0.000 /imagekit/specs/__init__.py:90(cachefile_name)
      999    0.014    0.000    0.340    0.000 /imagekit/cachefiles/namers.py:13(source_name_as_path)
      999    0.005    0.000    0.279    0.000 /backend/imagegenerators.py:165(get_hash)
      999    0.006    0.000    0.273    0.000 /imagekit/hashers.py:28(pickle)
      999    0.001    0.000    0.259    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:220(dump)
17982/999    0.056    0.000    0.257    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:269(save)
      999    0.003    0.000    0.253    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:591(save_list)
      999    0.002    0.000    0.241    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:608(_batch_appends)
      999    0.005    0.000    0.209    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:345(save_reduce)
      999    0.004    0.000    0.119    0.000 /imagekit/hashers.py:18(save_dict)
      999    0.007    0.000    0.104    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:655(_batch_setitems)
      999    0.001    0.000    0.083    0.000 /imagekit/cachefiles/__init__.py:82(url)
      999    0.004    0.000    0.082    0.000 /imagekit/cachefiles/__init__.py:72(_storage_attr)
    12987    0.039    0.000    0.068    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:227(memoize)
      999    0.002    0.000    0.052    0.000 /imagekit/specs/__init__.py:232(get_spec)
      999    0.005    0.000    0.052    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:532(save_tuple)
     5994    0.011    0.000    0.051    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:480(save_string)
      999    0.003    0.000    0.050    0.000 /imagekit/registry.py:32(get)
      999    0.002    0.000    0.045    0.000 /django/core/files/storage.py:265(url)
      999    0.003    0.000    0.044    0.000 /imagekit/specs/__init__.py:86(__init__)
     2997    0.013    0.000    0.042    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:730(save_global)
      999    0.003    0.000    0.040    0.000 /imagekit/specs/__init__.py:37(__init__)
     3996    0.011    0.000    0.037    0.000 /imagekit/utils.py:27(get_by_qname)
    39960    0.034    0.000    0.034    0.000 {method 'write' of 'cStringIO.StringO' objects}
     2997    0.004    0.000    0.033    0.000 /imagekit/utils.py:49(get_singleton)
      999    0.005    0.000    0.031    0.000 /django/dispatch/dispatcher.py:147(send)
      999    0.011    0.000    0.029    0.000 /Python.framework/Versions/2.7/lib/python2.7/urlparse.py:250(urljoin)
28974/27975    0.018    0.000    0.025    0.000 {getattr}
      999    0.009    0.000    0.022    0.000 /imagekit/cachefiles/backends.py:13(get_default_cachefile_backend)
     4028    0.007    0.000    0.021    0.000 /django/utils/importlib.py:18(import_module)
      999    0.009    0.000    0.019    0.000 {method '__reduce_ex__' of 'object' objects}
     7025    0.019    0.000    0.019    0.000 {__import__}
      999    0.003    0.000    0.015    0.000 /imagekit/cachefiles/strategies.py:37(__init__)
      999    0.003    0.000    0.014    0.000 /pilkit/utils.py:120(suggest_extension)
    12987    0.011    0.000    0.013    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:250(put)
     1998    0.006    0.000    0.013    0.000 /Python.framework/Versions/2.7/lib/python2.7/urlparse.py:136(urlparse)
      999    0.002    0.000    0.013    0.000 /django/utils/encoding.py:218(filepath_to_uri)
      999    0.001    0.000    0.012    0.000 /imagekit/registry.py:52(existence_required_receiver)
     1000    0.007    0.000    0.012    0.000 /python2.7/posixpath.py:321(normpath)
      999    0.003    0.000    0.012    0.000 /imagekit/registry.py:55(_receive)
      999    0.005    0.000    0.010    0.000 /django/dispatch/dispatcher.py:212(_live_receivers)
    41970    0.010    0.000    0.010    0.000 {method 'get' of 'dict' objects}
      999    0.008    0.000    0.010    0.000 /python2.7/copy_reg.py:59(_reduce_ex)
     1998    0.006    0.000    0.009    0.000 /python2.7/posixpath.py:68(join)
     1998    0.002    0.000    0.009    0.000 /python2.7/posixpath.py:104(splitext)
     1998    0.003    0.000    0.008    0.000 /imagekit/specs/__init__.py:97(source)
      999    0.001    0.000    0.007    0.000 /imagekit/utils.py:135(call_strategy_method)
      999    0.001    0.000    0.007    0.000 /pilkit/utils.py:104(format_to_extension)
    46953    0.007    0.000    0.007    0.000 {id}
     1998    0.005    0.000    0.007    0.000 /python2.7/genericpath.py:85(_splitext)
      999    0.002    0.000    0.006    0.000 /django/utils/encoding.py:138(force_bytes)
    20979    0.006    0.000    0.006    0.000 {repr}
     2997    0.003    0.000    0.006    0.000 /django/core/files/base.py:32(__nonzero__)
     3999    0.004    0.000    0.006    0.000 /django/conf/__init__.py:51(__getattr__)
      999    0.005    0.000    0.005    0.000 {sorted}
      999    0.003    0.000    0.005    0.000 /django/db/models/fields/files.py:151(__get__)
     1998    0.002    0.000    0.005    0.000 /django/dispatch/dispatcher.py:9(_make_id)
    11011    0.005    0.000    0.005    0.000 {isinstance}
     1998    0.003    0.000    0.005    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:437(save_int)
     1998    0.003    0.000    0.005    0.000 /Python.framework/Versions/2.7/lib/python2.7/urlparse.py:167(urlsplit)
     5995    0.004    0.000    0.004    0.000 {hasattr}
        1    0.000    0.000    0.004    0.004 /pilkit/utils.py:28(_preinit_pil)
        1    0.002    0.002    0.004    0.004 /PIL/Image.py:313(preinit)
      999    0.002    0.000    0.004    0.000 /Python.framework/Versions/2.7/lib/python2.7/urllib.py:1245(quote)
     3996    0.003    0.000    0.003    0.000 {method 'rindex' of 'str' objects}
    17982    0.003    0.000    0.003    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:333(persistent_id)
     3996    0.003    0.000    0.003    0.000 {method 'startswith' of 'unicode' objects}
17043/17041    0.003    0.000    0.003    0.000 {len}
     1000    0.003    0.000    0.003    0.000 {method 'encode' of 'unicode' objects}
     1998    0.002    0.000    0.003    0.000 <string>:8(__new__)
     1998    0.001    0.000    0.003    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:426(save_none)
      999    0.003    0.000    0.003    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:173(__init__)
     2997    0.003    0.000    0.003    0.000 /django/core/files/base.py:29(__bool__)
     7074    0.002    0.000    0.002    0.000 {method 'append' of 'list' objects}
      999    0.001    0.000    0.002    0.000 /django/utils/functional.py:202(inner)
      999    0.002    0.000    0.002    0.000 {_hashlib.openssl_md5}
      999    0.000    0.000    0.002    0.000 /imagekit/utils.py:58(autodiscover)
     2003    0.002    0.000    0.002    0.000 {method 'split' of 'str' objects}
     3996    0.002    0.000    0.002    0.000 {method 'rfind' of 'unicode' objects}
     1000    0.002    0.000    0.002    0.000 /pilkit/utils.py:61(_format_to_extension)
     4031    0.002    0.000    0.002    0.000 {method 'startswith' of 'str' objects}
      999    0.002    0.000    0.002    0.000 {method 'rstrip' of 'str' objects}
      999    0.001    0.000    0.002    0.000 /Python.framework/Versions/2.7/lib/python2.7/pickle.py:430(save_bool)
      999    0.001    0.000    0.002    0.000 /Python.framework/Versions/2.7/lib/python2.7/urlparse.py:222(urlunparse)
      999    0.002    0.000    0.002    0.000 /django/dispatch/saferef.py:171(__call__)
     1000    0.001    0.000    0.002    0.000 /python2.7/posixpath.py:59(isabs)
      999    0.002    0.000    0.002    0.000 {method 'split' of 'unicode' objects}
      999    0.001    0.000    0.001    0.000 {method 'hexdigest' of '_hashlib.HASH' objects}
     2000    0.001    0.000    0.001    0.000 {built-in method __new__ of type object at 0x104d49338}
      999    0.001    0.000    0.001    0.000 {method 'values' of 'dict' objects}
      999    0.001    0.000    0.001    0.000 /django/utils/functional.py:221(__setattr__)
      999    0.001    0.000    0.001    0.000 /pilkit/processors/resize.py:198(__init__)
      999    0.001    0.000    0.001    0.000 {method 'join' of 'unicode' objects}
      999    0.001    0.000    0.001    0.000 /imagekit/specs/__init__.py:107(source)
     1016    0.001    0.000    0.001    0.000 {method 'join' of 'str' objects}
     1003    0.001    0.000    0.001    0.000 {method 'replace' of 'str' objects}
      999    0.001    0.000    0.001    0.000 {issubclass}
Proper-Job commented 10 years ago

Hi Matt

This continues to be a problem, even though we migrated to linux servers. In a single json feed we generate hundreds of urls. This has a big impact on response time.

Thank you and greetings Moritz

matthewwithanm commented 10 years ago

Sorry, I don't quite follow your previous benchmarks. It looks like your custom get_hash is still pickling the processor when we were trying to compare it to something that just used a few properties?

In any case, it seems pretty certain from your previous benchmark that the pickling is the issue, right? In that case, the solution is to use a custom hasher. Currently, that means overriding get_hash on your specs (or, since the hash is currently only used to generate a filename, to provide a custom IMAGEKIT_SPEC_CACHEFILE_NAMER that doesn't call get_hash), but what we probably should do is create a setting for it, like IMAGEKIT_CACHEFILE_NAMER, which would allow you to change the hashing algorithm globally.

pcompassion commented 9 years ago

I got hit by this as well..

I just defined ImageSpecField and accessed it's url property.

I'm not sure my case is anyway related to get_hash..

matthewwithanm commented 9 years ago

@pccompassion It's likely that you're just using a cache strategy that generates the image (or verifies it's existence) on access. Check out the optimization section of the docs.

Proper-Job commented 7 years ago

I believe I have finally found the problem causing the performance issues described in this thread. I have created a pull request here: https://github.com/matthewwithanm/django-imagekit/pull/417 As it turns out, it wasn't the get_hash implementation at all. The problem is that imagekit.utils.autodiscover is being run every time a ImageSpec is instantiated.