python-pillow / Pillow

Python Imaging Library (Fork)
https://python-pillow.org

Automatic scaling of Image.MAXBLOCK #148

Closed acdha closed 5 years ago

acdha commented 11 years ago

There's a venerable old PIL bug which caused Image.save(optimize=True) to fail with JPEGs when the file size is over a certain threshold. The earliest reference I could find dates back to 1999:

http://mail.python.org/pipermail/image-sig/1999-December/000942.html

The workaround commonly used is to either hardcode something like this:

ImageFile.MAXBLOCK = <empirically determined constant (i.e. wild guess)>

or the more dynamic approach used in e.g. jdriscoll/django-imagekit#50:

ImageFile.MAXBLOCK = max(img.size) ** 2

This seems like exactly the kind of thing Pillow could just fix once and for all.

aclark4life commented 11 years ago

Do you have a preference? And/or do you want to send a pull request? @cgohlke @wiredfool any thoughts?

wiredfool commented 11 years ago

I'd really like to fix a 14-year-old bug. (Sounds like a fine whiskey, actually; older than some decent ones.)

I'd have to dig to see what MAXBLOCK affects. I'd be especially wary if MAXBLOCK ends up affecting buffer sizes that get filled and passed to the lib* decoders.

Good (or at least some) test coverage of this would be nice, and go a long way toward mitigating any bad effects.

So, it looks like there's a comment in RawEncode.c

/* FIXME: This encoder will fail if the buffer is not large enough to
 * hold one full line of data.  There's a workaround for this problem
 * in ImageFile.py, but it should be solved here instead.
 */

and in ImageFile.py

    # FIXME: make MAXBLOCK a configuration parameter
    bufsize = max(MAXBLOCK, im.size[0] * 4) # see RawEncode.c

My Wild Ass Guess is that when libjpeg is optimizing, it passes back the whole image in one big chunk. Otherwise, worst case it can do scanline by scanline, and the bufsize guarantees that it's big enough for one scan line.

If we bump up MAXBLOCK, or the bufsize here, we're going to allocate that much memory each time we do an encode. The brute force method would be size[0] * size[1] * 4, but that's going to be suboptimal for large images. max(size) ** 2 would be less than that, but it may not actually fix all the cases.

I'd be willing to dig more, assuming I can get a reproducible test case.
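
For a sense of scale (using the 4096x4096 RGB image from the reproduction below), the candidate buffer sizes work out as follows; the arithmetic is added here purely for illustration:

w, h = 4096, 4096
brute_force = w * h * 4       # 67,108,864 bytes (64 MiB) allocated per encode
workaround  = max(w, h) ** 2  # 16,777,216 bytes (16 MiB), the max(size) ** 2 approach
scanline    = w * 4           # 16,384 bytes (16 KiB), the current one-scanline floor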

acdha commented 11 years ago

Strong +1 on expanding tests. I strongly suspect your guess is correct and this could have ripple effects with a codebase this old and squirrelly.

If you want to reproduce it, here's a transcript:


Python 2.7.2 (default, Oct 11 2012, 20:14:37) 
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from PIL import Image, ImageFile
>>> i = Image.new("RGB", (4096, 4096), 0xff3333)
>>> i.save("/dev/null", format="JPEG")
>>> i.save("/dev/null", format="JPEG", optimize=True)
Suspension not allowed here
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/cadams/.virtualenvs/project/lib/python2.7/site-packages/PIL/Image.py", line 1437, in save
    save_handler(self, fp, filename)
  File "/Users/cadams/.virtualenvs/project/lib/python2.7/site-packages/PIL/JpegImagePlugin.py", line 471, in _save
    ImageFile._save(im, fp, [("jpeg", (0,0)+im.size, 0, rawmode)])
  File "/Users/cadams/.virtualenvs/project/lib/python2.7/site-packages/PIL/ImageFile.py", line 501, in _save
    raise IOError("encoder error %d when writing image file" % s)
IOError: encoder error -2 when writing image file
>>> ImageFile.MAXBLOCK = max(i.size) ** 2
>>> i.save("/dev/null", format="JPEG", optimize=True)
>>>

Ideally, this feels like something which should just work. Perhaps something as simple as having save() check the format and optimize flags and increase MAXBLOCK for that operation, along with documenting that this can increase RAM consumption? The part which feels dirtiest is changing this by setting a module-level variable; simply moving the change closer to the function call seems like a good idea.
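
Purely as an illustration of that suggestion (the helper name is hypothetical, and, as noted further down the thread, mutating the module-level global this way is not thread-safe):

from contextlib import contextmanager

from PIL import ImageFile

@contextmanager
def raised_maxblock(im):
    # Temporarily raise the module-level buffer size for one save() call,
    # then restore the previous value.
    old = ImageFile.MAXBLOCK
    ImageFile.MAXBLOCK = max(old, im.size[0] * im.size[1])
    try:
        yield
    finally:
        ImageFile.MAXBLOCK = old

# usage:
# with raised_maxblock(img):
#     img.save("out.jpg", format="JPEG", optimize=True)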

wiredfool commented 11 years ago

So, having poked around a bit, I can confirm a few things.

  1. It does appear that the jpeg code is writing out optimized images in one push to the buffer.
  2. The "Suspension not allowed here" message basically means that it tried to write, and there wasn't enough space.
  3. That really boring image takes 8k/megapixel to save; one from my camera has a different constant there. In each case, MAXBLOCK had to be larger than the final image file size.
  4. MAXBLOCK appears to be allocated for every image encode, so just randomly upping it is a bad idea.
  5. Passing in an open file instead of a filename doesn't change anything, because it's just doing the shuffling to and from the buffer in C instead of Python, not pushing the fp all the way down into the library. (Which I did with the tiff lib, in sort of an aggressive manner.)
  6. It's probably not threadsafe to just change MAXBLOCK for the one operation, since it's a module-level global variable.
  7. I'm still thinking on this, but I think pushing the bufsize into an optional parameter on _save, initialized to MAXBLOCK, and then figuring out some sort of guess for the final JPEG size would be an appropriate fix; a rough sketch of that idea follows the script below.

from PIL import Image, ImageFile

def search(i):
  # Double MAXBLOCK until the optimized save stops failing.
  ImageFile.MAXBLOCK = 65536
  while True:
    try:
      i.save("/tmp/jpg", format="JPEG", optimize=True)
      print "Save succeeded at MAXBLOCK=%d" % ImageFile.MAXBLOCK
      break
    except IOError:
      print "Save failed at MAXBLOCK=%d" % ImageFile.MAXBLOCK
      ImageFile.MAXBLOCK *= 2

def im(size=(1024,1024)):
  i = Image.new("RGB", size, 0xff3333)
  return i

for x in [128,512,1024,2048,4096,8192, 16384]:
  print "%d wide" %x
  search(im((x,2048)))

for y in [128,512,1024,2048,4096,8192, 16384]:
  print "%d tall" %y
  search(im((2048,y)))
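
A rough sketch of the idea in item 7 above (the names and bodies here are illustrative only; the actual change was later tracked as pull request #158, "JpegImagePlugin sets bufsize for optimized images", and its follow-ups):

from PIL import ImageFile

def _save(im, fp, tile, bufsize=0):
    # Sketch: accept an optional bufsize so an image plugin can ask for a
    # larger buffer, while keeping MAXBLOCK and the one-scanline minimum
    # as the floor (see RawEncode.c).
    bufsize = max(ImageFile.MAXBLOCK, bufsize, im.size[0] * 4)
    # ... chunked encode loop as before ...
    return bufsize

def _jpeg_bufsize(im, info):
    # Sketch of the JpegImagePlugin side: an optimized save is pushed out
    # in one shot, so guess the final size from the pixel count.
    if "optimize" in info:
        return im.size[0] * im.size[1]
    return 0
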
wiredfool commented 11 years ago

It looks like the two places that need special handling with MAXBLOCK are the JpegImagePlugin and RawEncode.c. Unfortunately, the raw encodes go straight into the save instead of through an image plugin, so there's no place to hang a hook to raise the buffer size if necessary. So there's still a bit of a special case in there, but at least the JpegImagePlugin can override as necessary.

d-schmidt commented 11 years ago

The only problem I've had with MAXBLOCK so far was an image with an exif header >64k. I will check the fix on Monday.

wiredfool commented 11 years ago

This probably won't fix that case, as it's pretty specifically targeted at the optimized case. We might be able to check the size of the chunks that we have and make sure MAXBLOCK is big enough to handle them.

wiredfool commented 11 years ago

@d-schmidt did you ever check the jpeg image plugin with a >64k exif size?

d-schmidt commented 11 years ago

What do you mean by "check"? We are using it; I'm increasing MAXBLOCK manually in the image converter. The converter is running 24/7.

wiredfool commented 11 years ago

It looks like it's actually pretty easy to put in a check for the exif size. I don't have an image to check it with, and looking for one led me to this: https://en.wikipedia.org/wiki/Exif#Viewing_and_editing_Exif_data

Exif metadata are restricted in size to 64 kB in JPEG images because according to the specification this information must be contained within a single JPEG APP1 segment. Although the FlashPix extensions allow information to span multiple JPEG APP2 segments, these extensions are not commonly used. This has prompted some camera manufacturers to develop non-standard techniques for storing the large preview images used by some digital cameras for LCD review. These non-standard extensions are commonly lost if a user re-saves the image using image editor software, possibly rendering the image incompatible with the original camera that created it.

Anyway, if you could test a4a856b we could get a fix merged. Or rather 7200c40, which actually passes Travis on py3.

d-schmidt commented 11 years ago

Next week. Nikon seems to ignore the specification. Or it is a block spanning multiple segments, I don't know. But the raw binary exif block was above 65k on a lot of images.

d-schmidt commented 11 years ago

Your fix does not work so far. The exif is smaller than the limit (by 4 bytes) but it still crashes. It works just fine if I add ImageFile.MAXBLOCK = 2**20 before saving.

Suspension not allowed here
Traceback (most recent call last):
  File "converter.py3", line 254, in shrinkImage
    im.save(outFile, "JPEG", quality=quali, exif=orgInfo['exif'])
  File "e:\programme\Python32\lib\site-packages\pillow-2.0.0-py3.2-win32.egg\PIL\Image.py", line 1467, in save
    save_handler(self, fp, filename)
  File "e:\programme\Python32\lib\site-packages\pillow-2.0.0-py3.2-win32.egg\PIL\JpegImagePlugin.py", line 568, in _save
    ImageFile._save(im, fp, [("jpeg", (0,0)+im.size, 0, rawmode)], bufsize)
  File "e:\programme\Python32\lib\site-packages\pillow-2.0.0-py3.2-win32.egg\PIL\ImageFile.py", line 476, in _save
    raise IOError("encoder error %d when writing image file" % s)
IOError: encoder error -2 when writing image file
encoder error -2 when writing image file
analyzation mode started
loaded file:  e:\temp\testbilder_nikon_exif\aimg9.jpg
used parameters:  {'analyze': '', 'in': 'testbilder_nikon_exif\\aimg9.jpg'}

size width x height:  4288x2848
internal format:  JPEG
color mode:  RGB
contains exif:  True
exif length:  65532 bytes
contains icc profile:  False
transparency flag:  None

jpeg with exif found, trying jhead:
File name    : e:\temp\testbilder_nikon_exif\aimg9.jpg
File size    : 942957 bytes
File date    : 2012:09:26 22:09:45
Camera make  : NIKON CORPORATION
Camera model : NIKON D90
Date/Time    : 2012:08:21 00:19:35
Resolution   : 4288 x 2848
Flash used   : Yes (auto, return light detected)
Focal length : 38.0mm  (35mm equivalent: 57mm)
Exposure time: 0.017 s  (1/60)
Aperture     : f/5.0
ISO equiv.   : 800
Whitebalance : Auto
Metering Mode: pattern
GPS Latitude : ? ?
GPS Longitude: ? ?
wiredfool commented 11 years ago

4 bytes. (I'm guessing that those bytes are APP1.) So, they're in spec, and we're not handling it. I wonder if the JPEG lib needs to write all the metadata in one shot, or if it's just one block. More digging required, I guess. I'll take a look and see what I can find in the libjpeg code.

MrKesn commented 11 years ago

I'm seeing the same problem. Even https://github.com/python-imaging/Pillow/issues/158 and https://github.com/python-imaging/Pillow/commit/7200c40 did not work for me; I had to remove the condition entirely:

if "optimize" in info:
    bufsize = im.size[0]*im.size[1]

transformed into

bufsize = im.size[0]*im.size[1]

I can provide any additional information

wiredfool commented 11 years ago

@MrKesn Do you have a test image that you can send me?

MrKesn commented 11 years ago

@wiredfool No problem! Here is my city: https://dl.dropboxusercontent.com/u/3471307/a1838ba836207eb8194535df1b21241c.jpg Here are my cats: https://dl.dropboxusercontent.com/u/3471307/fork.jpg They are from different sources but both of them fail as usual with

Exception Type: IOError
Exception Value:    
encoder error -2 when writing image file
Exception Location: local/lib/python2.7/site-packages/PIL/ImageFile.py in _save, line 466

wiredfool commented 11 years ago

@MrKesn I've opened and saved those images with Pillow, and it works for me. Can you narrow it down to a small test case where it fails with the default MAXBLOCK setting?

aclark4life commented 11 years ago

So are we close to understanding this? Would love to get a fix into 2.1.0.

wiredfool commented 11 years ago

I think we're close. I suspect that we just need a factor larger than 1x the Exif size, but I don't know what that factor is. If it turns out that we really need to do the whole image in one shot, then I don't think the approach in 7200c40 is going to work. (I'd say it's useful anyway, but I'm not so sure, since it's only going to be triggered if the exif is out of spec, and we haven't actually gotten a sample of an image like that.)

I can still dig, but I'd need a sample image from @d-schmidt that fails. Or, if he can bisect and figure out what the smallest MAXBLOCK is that allows the image to save.

d-schmidt commented 11 years ago

from PIL import Image
from PIL import ImageFile

ImageFile.MAXBLOCK = 65536
im = Image.open("test.jpg")
im.save("exfi_result.jpeg", "JPEG", quality=90, exif=b"1"*65532)

You can easily test it this way.

edit:

from PIL import Image
from PIL import ImageFile

i = 65536
b = True
while b:
    try :
        ImageFile.MAXBLOCK = i
        i += 1
        im = Image.open("test.jpg")
        im.save("test_out.jpg", "JPEG", quality=90, exif=b"1"*65532)
        b = False
    except IOError: pass
print(i)

66161 - 65536 = 625; it needs to be large enough to contain the complete header, including the Huffman and quantisation tables.

edit3: somewhere in JpegEncode.c there should be a stop, I think, like the one for the extra data, so it can loop and write out again.
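
Based on that measurement, one way to phrase an exif-aware bump (a sketch only; the 1 KiB of slack for the header, Huffman and quantisation tables is a guess inspired by the 625 bytes above, not a number taken from libjpeg):

from PIL import ImageFile

def maxblock_for_exif(exif_bytes, slack=1024):
    # The APP1 marker carrying the Exif block has to be written in one
    # chunk, and the same write also needs room for the JPEG header and
    # its tables.
    return max(ImageFile.MAXBLOCK, len(exif_bytes) + slack)

# e.g. before saving with a near-64k Exif block:
# ImageFile.MAXBLOCK = maxblock_for_exif(exif_data)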

wiredfool commented 11 years ago

Ok, I've gone through this and I think I have it all in my head.

ImageFile._save calls encode multiple times to write out the file. encode is actually _encode in encode.c. It allocates a buffer of bufsize each time through the loop, and calls the specific encoder, in this case, the one in jpegencode.c.

JpegEncode.c, in case 2, calls jpeg_start_compress, which according to the docs will emit the first few bytes of the JPEG datastream header. Then it calls jpeg_write_marker with the exif info. If there's not enough space, boom, we fail.

[edit] Right. So.

As an aside, I think I'd like to refactor ImageFile._save and _encode so that encode does the fp.write. Later though.
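
For readers following along, a much-simplified sketch of the control flow just described (the real loop lives in ImageFile._save and encode.c; the encoder here is a toy stand-in):

import io

def _save_sketch(fp, encode, bufsize):
    # Simplified view of ImageFile._save: keep asking the encoder for a
    # chunk of at most bufsize bytes and write it out.  A positive status
    # means the encoder is done; a negative status (e.g. -2) means it had
    # to emit something larger than the buffer in one shot and gave up.
    while True:
        chunk, status = encode(bufsize)
        fp.write(chunk)
        if status > 0:
            return
        if status < 0:
            raise IOError("encoder error %d when writing image file" % status)

# Toy stand-in encoder that succeeds in a single call:
_save_sketch(io.BytesIO(), lambda bufsize: (b"jpeg bytes", 1), 65536)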

wiredfool commented 11 years ago

Ok, new version up here: https://github.com/wiredfool/Pillow/commit/37359369ceaea5e888762f0e4e77fcf91fda4c8c

@d-schmidt: I've included the test from above; I'd like to see if it works in the real world for you.

d-schmidt commented 11 years ago

@wiredfool This is working for my real images with the full-size Nikon exif block.

Who changed the freefont stuff? :/ I had to update my freefont lib.

aclark4life commented 11 years ago

Yay! Re: 3735936. No idea about "freefont stuff" what is that?

d-schmidt commented 11 years ago

I don't know. I got a bunch of compiler errors and had to update the Freefont lib to be able to compile the current master.

aclark4life commented 11 years ago

OK I'll look through the commits at some point

d-schmidt commented 11 years ago

The commit must have been after 2013/04/09, if that's of any importance. I had version 2.3.5 and I'm now using 2.4.2.

And the errors:

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -IlibImaging -Ie:\downloads\tk\tcl8.5.10\generic -Ie:\downloads\jpeg-8d-vc9\jpeg-8d\lib -Ie:\downloads\tk\tk8.5.10\generic -Ie:\downloads\tk\tk8.5.10\xlib -Ie:\downloads\zlib-1.2.3-lib\lib -Ie:\downloads\freetype-2.3.5-1-lib\lib -Ie:\downloads\lcms-1.19\include -Ie:\programme\Python32\include -Ie:\programme\Python32\include -Ie:\programme\Python32\PC /Tc_imagingft.c /Fobuild\temp.win32-3.2\Release\_imagingft.obj
_imagingft.c
_imagingft.c(226) : error C2275: 'FT_BBox' : illegal use of this type as an expression
        e:\downloads\freetype-2.3.5-1-lib\lib\freetype/ftimage.h(107) : see declaration of 'FT_BBox'
_imagingft.c(226) : error C2146: syntax error : missing ';' before identifier 'bbox'
_imagingft.c(226) : error C2065: 'bbox' : undeclared identifier
_imagingft.c(227) : error C2275: 'FT_Glyph' : illegal use of this type as an expression
        e:\downloads\freetype-2.3.5-1-lib\lib\freetype/ftglyph.h(87) : see declaration of 'FT_Glyph'
_imagingft.c(227) : error C2146: syntax error : missing ';' before identifier 'glyph'
_imagingft.c(227) : error C2065: 'glyph' : undeclared identifier
_imagingft.c(228) : error C2065: 'glyph' : undeclared identifier
_imagingft.c(228) : warning C4047: 'function' : 'FT_Glyph *' differs in levels of indirection from 'int *'
_imagingft.c(228) : warning C4024: 'FT_Get_Glyph' : different types for formal and actual parameter 2
_imagingft.c(229) : error C2065: 'glyph' : undeclared identifier
_imagingft.c(229) : warning C4047: 'function' : 'FT_Glyph' differs in levels of indirection from 'int'
_imagingft.c(229) : warning C4024: 'FT_Glyph_Get_CBox' : different types for formal and actual parameter 1
_imagingft.c(229) : error C2065: 'bbox' : undeclared identifier
_imagingft.c(229) : warning C4133: 'function' : incompatible types - from 'int *' to 'FT_BBox *'
_imagingft.c(230) : error C2065: 'bbox' : undeclared identifier
_imagingft.c(230) : error C2224: left of '.yMax' must have struct/union type
_imagingft.c(231) : error C2065: 'bbox' : undeclared identifier
_imagingft.c(231) : error C2224: left of '.yMax' must have struct/union type
_imagingft.c(232) : error C2065: 'bbox' : undeclared identifier
_imagingft.c(232) : error C2224: left of '.yMin' must have struct/union type
_imagingft.c(233) : error C2065: 'bbox' : undeclared identifier
_imagingft.c(233) : error C2224: left of '.yMin' must have struct/union type
_imagingft.c(336) : error C2143: syntax error : missing ';' before 'type'
_imagingft.c(344) : error C2065: 'temp' : undeclared identifier
_imagingft.c(345) : error C2065: 'temp' : undeclared identifier
_imagingft.c(346) : error C2065: 'temp' : undeclared identifier
_imagingft.c(366) : error C2143: syntax error : missing ';' before 'type'
_imagingft.c(368) : error C2065: 'xx' : undeclared identifier
_imagingft.c(369) : error C2065: 'x0' : undeclared identifier
_imagingft.c(370) : error C2065: 'x1' : undeclared identifier
_imagingft.c(371) : error C2065: 'xx' : undeclared identifier
_imagingft.c(372) : error C2065: 'x0' : undeclared identifier
_imagingft.c(372) : error C2065: 'xx' : undeclared identifier
_imagingft.c(373) : error C2065: 'xx' : undeclared identifier
_imagingft.c(373) : error C2065: 'x1' : undeclared identifier
_imagingft.c(374) : error C2065: 'x1' : undeclared identifier
_imagingft.c(374) : error C2065: 'xx' : undeclared identifier
_imagingft.c(382) : error C2065: 'xx' : undeclared identifier
_imagingft.c(384) : error C2065: 'x1' : undeclared identifier
_imagingft.c(385) : error C2065: 'x0' : undeclared identifier
_imagingft.c(402) : error C2065: 'xx' : undeclared identifier
_imagingft.c(403) : error C2065: 'x0' : undeclared identifier
_imagingft.c(403) : error C2065: 'x1' : undeclared identifier
error: command '"C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\cl.exe"' failed with exit status 2

wiredfool commented 11 years ago

I bet that is the descender bug. #185

e98cuenc commented 11 years ago

This bug is not completely dead. The same patch that was applied in "Pull Request #158: JpegImagePlugin sets bufsize for optimized images" also needs to be applied for progressive images.

When the jpeg encoder sees the flags optimize or progressive (or progression) it will write the full image in one shot.

The bufsize needs to be big enough to hold the entire image. The current heuristic is that the entire compressed image will fit in width * height bytes, but this heuristic is only applied to save operations with the flag "optimize" and not to save operations with the flag "progressive".

aclark4life commented 11 years ago

Can you send a pull request?

matthewwithanm commented 11 years ago

@wiredfool Wow, great work here! Altogether, is this correct for the minimum value?

max(len(exif) + 5, im.size[0] * 4, im.size[0] * im.size[1])

EDIT: oops, had an extra "2*" in there.

dursk commented 10 years ago

I can confirm with @e98cuenc that the bug still exists for 'progressive' images.

This line:

if "optimize" in info:
    bufsize = im.size[0]*im.size[1] 

Needs to change to:

if "optimize" or "progressive" in info:
    bufsize = im.size[0]*im.size[1]

In: PIL/JpegImagePlugin.py

hugovk commented 10 years ago

@mmadurski Do you have a redistributable test image which reproduces the problem?

dursk commented 10 years ago

You can reproduce it with the same images @MrKesn posted a few posts up.

wiredfool commented 10 years ago

The current Pillow code is this:

    if "optimize" in info or "progressive" in info or "progression" in info:
        if quality >= 95:
            bufsize = 2 * im.size[0] * im.size[1]
        else:
            bufsize = im.size[0] * im.size[1]

What version are you looking at?

e98cuenc commented 10 years ago

This code can still produce a crash, but I have not been able to make it crash.

The bufsize = ... part is a heuristic. To make it bulletproof you only have to put it in a loop, increasing bufsize, and break as soon as the image saves successfully.

The best test I could come up with was to create an image where all the pixels were random and try to save it at maximum quality. AFAIR the compressed bytes fit in a 2 * width * height buffer.

Even without an image to test it, I think it would be good to add a loop that increases bufsize if the save fails.

Cheers,
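
A user-level version of that retry idea might look like the sketch below (it assumes IOError is the failure mode of interest and grows the module-level MAXBLOCK, so the thread-safety caveat from earlier in the thread still applies):

from PIL import ImageFile

def save_with_growing_buffer(im, fp, limit=64 * 1024 * 1024, **options):
    # Double MAXBLOCK on "encoder error -2" until the save succeeds or the
    # sanity limit is reached.
    while True:
        try:
            im.save(fp, **options)
            return
        except IOError:
            if ImageFile.MAXBLOCK >= limit:
                raise
            ImageFile.MAXBLOCK *= 2
            if hasattr(fp, "seek"):
                # Rewind and discard any partial output before retrying.
                fp.seek(0)
                fp.truncate()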

dursk commented 10 years ago

@wiredfool Yes, you are correct. My apologies: I was looking at the specific fix that went in for this issue, and it turns out the version my team is using contains that fix, but not the one you just showed me.

agschwender commented 9 years ago

One of the users of Pilbox reported that they were receiving an "encoder error -2 when writing image file" error. I did some digging into it and it turned out the save was hitting "Suspension not allowed here". Looking at the code, it seemed as if it might be caused by a combination of quality=keep and optimize=True. Specifically, it looked as if it would not reach the condition that increased the buffer size.

if "optimize" in info or "progressive" in info or "progression" in info:
    if quality >= 95:
        bufsize = 2 * im.size[0] * im.size[1]
    else:
        bufsize = im.size[0] * im.size[1]

Here's a reference to the issue: https://github.com/agschwender/pilbox/issues/29

Apreche commented 9 years ago

I know this bug is marked as closed, but I have come across a jpg file uploaded by a user that triggered this error. I am using Python 2.7.6 and Pillow 2.8.1. The error also exists in 2.8.0 and 2.7.0.

Here are steps to reproduce.

  1. Open the JPEG file
  2. Extract the icc_profile data
  3. Thumbnail the jpeg, but don't make the thumbnail much smaller than the original.
  4. Save it as a jpeg with optimize=True, the original icc_profile, and quality=95

Expected Result: a thumbnailed image is saved.
Actual Result: prints the message "Suspension not allowed here" and raises an "IOError: encoder error -2".

Here is some code to reproduce the error.

from PIL import Image
from StringIO import StringIO

i = Image.open('bad.jpg')
icc_profile = i.info.get('icc_profile')
i.thumbnail((127, 275), Image.ANTIALIAS)
output = StringIO()
i.save(output, format='JPEG', quality=95, optimize=True, icc_profile=icc_profile)

Here is the offending jpeg.

[attached image: the offending jpeg]

I think the problem is that this jpeg has a color depth of 32 bits. I did not think that was possible for a jpeg. When I re-encoded the image to be 24-bits using an image editing application, this error disappeared. The file itself clearly has a lot of problems.

Also, if the thumbnail I am making is very near to the original size of the image, the error occurs. If I make the thumbnail much smaller than the original image, the error will go away. The error can also go away if we set optimize=False, or decrease the quality setting.

I know that it is easy to just blame the problem on the bad image file and not fix this problem. However, despite the image being bad, every image editing and viewing program that I tried was able to handle this image properly. Imgur had no problems with it. Pillow even handles it properly in most cases. Certain combinations of parameters just cause it to fail.

jaddison commented 9 years ago

I've just had to downgrade to Pillow==2.7.0 to get image manipulation working with sorl-thumbnail again. I don't think we're doing anything particularly fancy with sorl.

aclark4life commented 9 years ago

@jaddison Can you elaborate on that please? Thanks

jaddison commented 9 years ago

I'm not sure how to explain it any better. With 2.8.1, I was getting quite a few "Suspension not allowed here" log messages in my supervisor.log file, along with quite a few images that weren't getting thumbnailed. Downgrading to 2.7.0 (I'd only upgraded yesterday, and that's when the problems started) resolved the problems.

Well... most of the problems. With 2.7.0, I'm having trouble with a much smaller number of images. This one is apparently one such pain:

[attached image: community-day-186239-ca25d3]

Let me know if I can run some tests for you.

aclark4life commented 9 years ago

Thanks @jaddison. Just trying to document details in hopes we can find the issue. I hate regressions :disappointed:

aclark4life commented 9 years ago

Since there have been so many comments, are we still talking about "Automatic scaling of Image.MAXBLOCK" or do we need a new issue?

jaddison commented 9 years ago

I have no idea! Tell me how to find out for you? :)

wiredfool commented 9 years ago

Set ImageFile.MAXBLOCK to something large and bigger than the filesize of any of your thumbnails, like 1M. Then run your tests.
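
For instance (just spelling out the suggestion above; 1 MB is the value wiredfool names):

from PIL import ImageFile

# Make the encode buffer comfortably larger than any expected output file.
ImageFile.MAXBLOCK = 1024 * 1024  # 1 MB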

paregorios commented 9 years ago

FWIW I'm seeing a very similar scenario to what @Apreche describes above (open, get icc_profile, save thumbnail with profile and optimize). Python 2.7.8. Pillow 2.8.1 (also saw it in 2.8.0).

In my case, increasing quality to 95 made no difference. Nor did removing optimize and/or progressive. It's saving the icc_profile that causes the trouble (but only on some images that I've downsized to 128x128 pixels).

UPDATE: with the latest pilkit everything works fine in my case (including the ICC profile), unless (for some images) I set quality=80 instead of letting the defaults in the stack happen. The error is getting re-raised by pilkit when it tests MAXBLOCK and finds out it's already bigger than the max value it calculates, so it would appear that something down in PIL/Pillow is throwing the error for reasons other than MAXBLOCK size in my case.

Apreche commented 9 years ago

Yes, I can confirm that I'm seeing similar results as @paregorios.

I don't know much about how the internals of PIL(low) work, but is MAXBLOCK related to the space allocated for the entire image file, or just the image data portion? Could it be that although MAXBLOCK is big enough for the image data, the additional metadata of the icc_profile puts it over the top? Perhaps adding the size of the icc_profile, and any other metadata like EXIF, to MAXBLOCK would result in it being large enough?

paregorios commented 9 years ago

@Apreche It's my impression that pilkit's fix assumes that MAXBLOCK is related to the entire size of the image file, otherwise it wouldn't try to accommodate the size of the exif header. I see no attempt to address icc_profile size there, but I tried forcing it to in a local copy (with a guess at the prefix length), but got no joy. It's possible that something else involving image size and icc profile is making the old code dragons in the basement grumpy.

wiredfool commented 9 years ago

MAXBLOCK is the size of the buffer that we're passing to libjpeg. Generally, libjpeg does things in a chunked manner. Exif info, other metadata, compression of pixels, all of it can be done in successive calls into the library. But there are certain things that have to be emitted in one shot, like each chunk of metadata (which is why the exif size shows up), each line of the image, and in some cases, the whole compressed image. There may be others, as this thread is showing.
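
Pulling together the constraints named in this thread, a conservative per-save buffer estimate might look something like the sketch below (the marker overhead, the treatment of ICC data, and the quality factor are assumptions assembled from the comments above, not confirmed Pillow behaviour):

from PIL import ImageFile

def conservative_bufsize(im, exif=b"", icc_profile=b"", quality=75,
                         optimize=False, progressive=False):
    # At least one full scanline (the RawEncode.c workaround in ImageFile.py).
    bufsize = max(ImageFile.MAXBLOCK, im.size[0] * 4)
    # Each metadata marker must fit in a single chunk (block plus marker overhead).
    bufsize = max(bufsize, len(exif) + 5, len(icc_profile) + 5)
    # Optimized/progressive saves can emit the whole compressed image at once.
    if optimize or progressive:
        factor = 2 if quality >= 95 else 1
        bufsize = max(bufsize, factor * im.size[0] * im.size[1])
    return bufsize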