AcademySoftwareFoundation / OpenImageIO

Reading, writing, and processing images in a wide variety of file formats, using a format-agnostic API, aimed at VFX applications.
https://openimageio.readthedocs.org
Apache License 2.0

[BUG] [Python] Reading certain TIFF files crashes the Python interpreter #2547

Closed wgergely closed 7 months ago

wgergely commented 4 years ago

Describe the bug

Using the Python library, I'm having issues reading certain TIFF files. For instance, trying to load https://we.tl/t-ZZsr6n9Jcv crashes OpenImageIO and the Python interpreter.

To Reproduce

I'm using the Python bindings with Python 2.7, and I tried building OpenImageIO against both an older and the latest libtiff.


import OpenImageIO as oiio

# Each of the following results in a crash with exit code 3221225477
path = './scan_20160801_0004.tif'
buf = oiio.ImageBuf(path)

buf = oiio.ImageBuf()
buf.reset(path, 0, 0)

i = oiio.ImageInput(path)
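
For reference, the exit code in the comment above is easier to recognise in hexadecimal. Assuming this was run on Windows (the platform mentioned later in this thread), it matches the status code for an access violation, which points to a crash in native code rather than a Python exception. A quick check, where the constant name is only a label for illustration:

```python
# Exit code reported for the crash above.
exit_code = 3221225477

# 0xC0000005 is the Windows NTSTATUS value for an access violation.
STATUS_ACCESS_VIOLATION = 0xC0000005

print(hex(exit_code))                        # 0xc0000005
print(exit_code == STATUS_ACCESS_VIOLATION)  # True
```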

Expected behaviour

Perhaps this is a more generic point - is there any way of avoiding a crash and catching errors without pulling down the Python interpreter if anything goes south? I would expect an Exception instead of a crash.

I came across some cases with FFmpeg, PNG ICC profiles, or, like in this case, a TIFF with an exotic parameter that crashes OpenImageIO. I'm calling OpenImageIO from inside a Maya plug-in, so this also takes down the whole Maya session. Ouch! OpenImageIO does an amazing job, but I don't know how to catch these exceptions and avoid the Python interpreter crash.
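
A crash in native code, unlike a Python exception, cannot be caught with try/except inside the same process. One possible way to protect a host application is to probe the file in a short-lived child interpreter first and only load it in-process if the child survives. The sketch below is an assumption-laden illustration, not an OpenImageIO feature: `can_open_safely` and the probe script are hypothetical, it assumes Python 3 (for `subprocess.run`), and it assumes the OpenImageIO module is importable by the child interpreter. Inside a host such as Maya, `sys.executable` may point at the host binary, so a standalone interpreter would likely need to be passed as `python`.

```python
import subprocess
import sys

# Hypothetical probe script run in the child process. It exits with 0 only
# if OpenImageIO opened the file; any other exit code means failure or crash.
_PROBE = """
import sys
import OpenImageIO as oiio
inp = oiio.ImageInput.open(sys.argv[1])
if inp is None:
    sys.exit(1)
inp.close()
sys.exit(0)
"""


def can_open_safely(path, python=sys.executable, timeout=60):
    """Return True only if a child interpreter opened `path` without crashing."""
    try:
        result = subprocess.run(
            [python, "-c", _PROBE, path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The interpreter could not be started, or the probe hung.
        return False
    return result.returncode == 0
```

The cost is one extra process launch per file, so this is probably only worth it for files from untrusted or unusual sources.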

Million thanks for your help in advance!

Platform information:

wgergely commented 4 years ago

I think I found a solution, apologies if this is obvious!

My mistake was that I didn't - likely still don't - fully understand the difference between ImageInput.create() and ImageInput.open(). I now see that if I pass a format (not a filename!) to ImageInput.create() I can use valid_file to check the image:


import OpenImageIO

source = 'path/to/invalid_image.tiff'
i = OpenImageIO.ImageInput.create('tif')
if not i.valid_file(source):
    raise RuntimeError('source is invalid')

Does this seem like the right thing to do?

lgritz commented 4 years ago

Well, it shouldn't be crashing in the first place. Must be a bug somewhere. I am investigating.

lgritz commented 4 years ago

create() just creates a reader capable of reading a file of that format.

open() actually opens the file and reads the header. There is also a static version of open that does both of these in one step.

Still, there is likely a bug somewhere. It should never crash (ideally), even if the file is invalid.

lgritz commented 4 years ago

So what you suggest seems like a decent workaround for the moment -- because it appears that the valid_file method is able to detect that there's something wrong, I guess, but there's a bug lurking in the open function for a certain variety of invalid file? I'll know more soon.

lgritz commented 4 years ago

I'm not able to reproduce any problems with your image. The tiff file looks ok to me. (Though it's enormous and inexplicably uses no compression.)

Can you show us a (minimal) complete python program that exhibits the crash or uncaught exception, and any error messages that it prints?

In your second example, the message where you say "I think I found a solution"... I'm not sure if you were saying (a) that valid_file() caught a problem and you threw that exception rather than crash? Or, (b) the file read correctly and completely when you did it this way?

wgergely commented 4 years ago

Many thanks for taking a look. Good to know you couldn't reproduce the problem, I suspect there's some defect with this particular file, or something else is going on that is not related to OpenImageIO. I'm sorry for the bother if so.

I was seeing a complete interpreter crash as soon as I tried to read the image headers - but using valid_file addresses my issue, so I'll close this. I have only one small suggestion: it might be a good idea to make it clearer in the documentation that ImageInput.create() takes a format as an argument.

Many thanks again!

lgritz commented 4 years ago

Re-opening, I'm not sure we're done with this.

I'm still a little confused... is valid_file returning true, and then you are able to read the file successfully? Or is valid_file returning false, and then you are successfully avoiding reading the file and thus avoiding crashing?

create() can either take a format name -- in which case it will make an ImageInput that is capable of reading that file type only. OR it can take a filename, in which case it will try to find a type of ImageInput that is able to successfully open the file (first by trying the one that seems to be implied by the file extension, but if that fails, it will try EVERY image reader it can until it finds one that can read the file). Glancing at the docs now, I do see that this is not explained clearly, so I will try to improve the explanation.

So assuming that you ARE able to read the file when you do create(formatname)/open(filename), but it crashes when you do create(filename)/open(filename):

If create("tif") then open("foo.tif") works, that means the file is fine, and the TIFF reader is fine.

If, then, create("foo.tif") is failing... how could that happen? Well, what if somehow it was not guessing properly that this was a TIFF file, and therefore going through the procedure of trying every possible ImageInput type. And suppose one of them (not TIFF!) was broken, or had some weird bug that was crashing when encountering this file (as opposed to just erroring because it was the wrong file type)? That would explain this behavior, though I'm not sure yet how we would be falling into this case.
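
To make this hypothesis concrete, here is a toy model of the lookup described above for create(filename): try the reader implied by the extension, then fall back to every other reader. It is a simplified sketch, not OIIO's actual implementation. `find_reader`, the `readers` registry, and the fake reader functions are all hypothetical, and a Python exception stands in for the native crash.

```python
import os


def find_reader(filename, readers, ext_to_format):
    """Return the name of the first reader that accepts `filename`, or None.

    `readers` maps a format name to a callable that returns True if it can
    open the file; `ext_to_format` maps a file extension to a format name.
    """
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    preferred = ext_to_format.get(ext)
    # 1. Try the reader implied by the file extension.
    if preferred in readers and readers[preferred](filename):
        return preferred
    # 2. Otherwise try every remaining reader in turn.
    for name, can_open in readers.items():
        if name != preferred and can_open(filename):
            return name
    return None


def tiff_reader(filename):
    return False  # pretend the TIFF reader rejects this file


def buggy_reader(filename):
    raise RuntimeError("simulated crash in an unrelated reader")


readers = {"tiff": tiff_reader, "buggy": buggy_reader}
ext_to_format = {"tif": "tiff", "tiff": "tiff"}

try:
    find_reader("foo.tif", readers, ext_to_format)
    outcome = "no crash"
except RuntimeError as err:
    outcome = "crashed: {}".format(err)
print(outcome)  # crashed: simulated crash in an unrelated reader
```

In this model the failure only shows up when the preferred reader declines the file, which is why the file extension alone would not predict it.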

Do you build OIIO from source? If so, would you be willing to do a few special builds and experiments to help me track this down?

lgritz commented 4 years ago

Aside: https://github.com/OpenImageIO/oiio/pull/2551 for my proposed new wording about create. Would this have helped to clarify?

wgergely commented 4 years ago

Of course, I'm happy to help if I can. I'm on Windows 10 x64 and set up to build using Visual Studio 2015 / cmake.

Hopefully, this answers your question:

ImageInput.create('path/to/bad.tiff') # results in an interpreter crash
ImageInput.open('path/to/bad.tiff') # ditto, results in an interpreter crash

i = ImageInput.create('tif')
i.open('path/to/bad.tiff') # this also crashes

However, if I do


i = ImageInput.create('tif')
i.valid_file('path/to/bad.tiff') # returns False without a crash

I checked the image I sent you against libtiff's tiffinfo and it is not recognised (!); however, it does open in Photoshop. It could be that it is corrupted in some way that OpenImageIO doesn't like. I'm sorry, I'm out of my depth here!

Re #2551 - it is much clearer, thank you.

lgritz commented 4 years ago

Oh, that's interesting. For me, it does look like a valid TIFF file, and OIIO reads it just fine.

Do you know precisely which libtiff you are using?

Try this:

oiiotool --help

Can you paste back here the output you get near the bottom, where it says

Input formats supported: ...
Output formats supported: ...
Dependent libraries: ...

??

I want to try to build against exactly the same libtiff you are using and see if I can get closer to reproducing.
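
If it is easier to collect this programmatically, the three lines can be picked out of the help text with a few lines of Python. This is a minimal sketch: `extract_build_info` and `oiiotool_build_info` are hypothetical helpers, and the code assumes `oiiotool` is on the PATH and that each label starts a line of the help output. Long lists may wrap onto continuation lines, which this sketch ignores.

```python
import subprocess

LABELS = (
    "Input formats supported:",
    "Output formats supported:",
    "Dependent libraries:",
)


def extract_build_info(help_text):
    """Map each label to the text that follows it on the same line."""
    info = {}
    for line in help_text.splitlines():
        stripped = line.strip()
        for label in LABELS:
            if stripped.startswith(label):
                info[label] = stripped[len(label):].strip()
    return info


def oiiotool_build_info():
    """Run `oiiotool --help` and extract the build information lines."""
    result = subprocess.run(
        ["oiiotool", "--help"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return extract_build_info(result.stdout)
```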

wgergely commented 4 years ago

Here we go - I rebuilt OpenImageIO RB-2.2.1 (FYI, building the tools from the master branch (v2.2.2) fails on Windows). libtiff is built from source, although my results were the same when I used the vcpkg-supplied version.

Input formats supported: bmp, cineon, dds, dpx, ffmpeg, fits, gif, hdr, heif, ico, iff, jpeg, jpeg2000, null, openexr, openvdb, png, pnm, psd, raw, rla, sgi, socket, softimage, targa, tiff, webp, zfile

Output formats supported: bmp, dpx, fits, gif, hdr, heif, ico, iff, jpeg, jpeg2000, null, openexr, png, pnm, rla, sgi, socket, targa, tiff, webp, zfile

Dependent libraries: FFMpeg Lavf58.29.100, gif_lib 5.1.4, libheif 1.5.1, jpeg-turbo 2.0.4/jp62, OpenJpeg 2.3.1, null 1.0, IlmBase , OpenVDB 7.0.0abi7, libpng 1.6.37, libraw 0.19.0-Beta1, LIBTIFF Version 4.1.0, Webp 1.1.0

OIIO 2.2.1 built sse2, running on 8 cores 31.9GB sse2,sse3,ssse3,sse41,sse42,avx,avx2,fma,f16c,popcnt,rdrand

lgritz commented 7 months ago

Certainly my fault, but this has languished for a long time. Since I was never able to reproduce and there hasn't been more traffic on this ticket, I'm going to close the issue for now.

Before I do, just in case it's still a problem for you, let me tell you the next steps I would try:

  1. See if you can reproduce a problem with the file in the absence of the Python layer. For example,

      iconvert problem.tif out.tif

    does that also fail? (It would need to open the file and read the whole thing in order to do its job.)

  2. If that does crash, then I would build OIIO in debug mode, and run the same command using a debugger and see if we can find out precisely WHERE it is crashing. That might be the clue we need to solve this.
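
Step 1 can also be scripted, which makes it easier to tell an ordinary error from a crash. A minimal sketch, assuming Python 3 and that `iconvert` is on the PATH: `classify_run` is a hypothetical helper, and its crash heuristic (a negative return code on POSIX, or a code of 0xC0000000 and above on Windows) is an assumption for illustration, not something OIIO defines.

```python
import subprocess
import sys


def classify_run(cmd):
    """Run `cmd` and return 'ok', 'error', or 'crash' based on its exit code."""
    code = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode
    if code == 0:
        return "ok"
    # POSIX: a negative code means the process was killed by a signal.
    # Windows: severe status codes such as 0xC0000005 are >= 0xC0000000.
    if code < 0 or code >= 0xC0000000:
        return "crash"
    return "error"


# Self-check using the current interpreter as a harmless command.
print(classify_run([sys.executable, "-c", "pass"]))  # ok

# Intended usage for step 1:
# classify_run(["iconvert", "problem.tif", "out.tif"])
```

A result of "error" would suggest the reader rejected the file cleanly, while "crash" would be the case worth taking to a debugger as in step 2.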

If this is no longer a problem, or you don't care, you can just leave this alone. If it's still something that's bugging you and you want to try these ideas, feel free to re-open this issue and I promise I'll give it another shot to give you a hand tracking it down.