AcademySoftwareFoundation / OpenImageIO

Reading, writing, and processing images in a wide variety of file formats, using a format-agnostic API, aimed at VFX applications.
https://openimageio.readthedocs.org
Apache License 2.0
1.98k stars 597 forks source link

[FEATURE REQUEST]Read as little as possible from image during 'open' call #2902

Open DougRogers opened 3 years ago

DougRogers commented 3 years ago

When scanning a drive for images, I am interested in gathering all images that are larger than a specific size. During the open call, more info than just the header is read in, slowing the call significantly. Calling open on thousands or millions of images is very time-consuming.

For example, in

auto in = ImageInput::open (filename);
if (! in)
    return;
const ImageSpec &spec = in->spec();
int xres = spec.width;
int yres = spec.height;

the TGA processing reads in the thumbnail data.

I would like to have all pixel processing, even the thumbnails, removed from the open call and just have the minimal spec information read so that this call is performant.

lgritz commented 3 years ago

This is an interesting topic, and it's not obvious what the right approach is, so I am definitely interested in feedback from you and anybody else reading.

So, just as background, a common theme in OIIO's APIs is that the whole image can be separated into the pixel data itself, and everything else is stored in ImageSpec. The idea is that the result of a full read cycle is an ImageSpec + a pixel buffer, and so when you write an image, you just pass an ImageSpec and a pixel buffer. The specific thing we tried to avoid was having to read item after item through separate API calls, have the user code carefully tend to those separate pieces of data, and pass them though a succession of special API calls when writing or copying one image to another.

So for that reason, I historically had hesitated to make a separate call to read the thumbnail, or to write it. And it's definitely a priority for me that the simple "read an image from one file, write it out to a different file" be really, really simple, a small number of simple API calls.

But I also see that there are times when you don't want all that expense. And thumbnails aren't the only case like this: ICC profiles can be quite big. PTex files have a big set of information giving the connectivity of the mesh. OpenEXR just merged a patch that adds (enormous!) metadata that associates a manifest of object names with integer "object ID" images. So I think this issue will keep popping up in various forms and that there is merit to addressing it intelligently.

So I've been mulling this over for a few days, and I can think of a few strategies we might take here. These are not necessarily mutually exclusive and are in no particular order:

Maybe there are other approaches, too. I would love feedback on this matter.

lgritz commented 3 years ago

Another one I thought of:

fpsunflower commented 3 years ago

There's definitely lots of options, and not all of them are mutually exclusive.

The hint to "skip big metadata" seems like the easiest path forward to me, though it is fairly limited to this particular use case. If you wanted to later get the metadata for a file you had already opened this way, you would need to close it and re-open it "for real" right? The definition of what counts as "big" would also be up to the individual file formats to honor, which could be tricky to homogenize across the format plugins, and hard for users of the library to reason about.

If we had the luxury of rethinking the API some more I would probably argue for making all metadata load on demand (while still keeping provisions for round-tripping everything properly when re-saving files). But this seems like it would be really hard to retrofit on top of existing code like you said.

The issue of how the manage thumbnails specifically seems a bit different to me and worth thinking about separately. We should have some way of addressing the thumbnail in a file as similarly as possible to other pixel data. I'm not sure that using the subimage concept works well - don't some formats support both thumbnails and subimages (exr?). This could get confusing for a user wanting to (say) loop over the subimages in a file, not realizing one of them is the thumbnail. A hint could be given in the subimage's spec (or main file spec) but you would have to know to check for it.

So maybe its worth adding a specific "thumbnail image" concept in the API? Similar to subimages, but addressed as their own thing. That would be an API change though, and break any existing code that manipulates thumbnails via the metadata API.

This is a tricky one!

lgritz commented 3 years ago

If you wanted to later get the metadata for a file you had already opened this way, you would need to close it and re-open it "for real" right? The definition of what counts as "big" would also be up to the individual file formats to honor, which could be tricky to homogenize across the format plugins, and hard for users of the library to reason about.

Yeah, I am assuming that you would need to close and re-open. This features is for the common case when you only need the basics (think: iinfo *.tif, you just want the list of files and resolutions).

One simple unambiguous dividing line is the hard coded fields of ImageSpec (the things that tell you the "shape" of the image, like resolution, nchannels, etc.) versus everything that would go in extra_attribs. In fact, it could correspond to the ImageSpec::copy_dimensions call?

If we had the luxury of rethinking the API some more I would probably argue for making all metadata load on demand (while still keeping provisions for round-tripping everything properly when re-saving files). But this seems like it would be really hard to retrofit on top of existing code like you said.

Maybe not so hard? Most existing code asks for ImageInput::spec() and then pulls out fields from there. We could make the implementation of spec() read the whole thing (un-lazify it), which would make existing code work as-is. And add a new call for retrieving an individual piece of metadata, which would load upon being called (if it wasn't already loaded). So the only programs that have to change are the ones that want to selectively pull things out.

The issue of how the manage thumbnails specifically seems a bit different to me and worth thinking about separately. We should have some way of addressing the thumbnail in a file as similarly as possible to other pixel data.

Thumbnails may indeed be a special case. I'm not averse to adding a single ImageInput call "give me thumbnail", and ImageOutput call "here is the thumbnail". It could even pass ImageBuf's around, very convenient. The only downside to this is that old programs would drop them on the floor, unless updated to make the extra calls. But do we care? Almost none of the formats we use in VFX support thumbnails (OpenEXR does not, BTW).

Grepping... Currently, Targa and PSD are our only format readers that put thumbnails in the metadata! (RAW should, but it's an unimplemented "FIXME".) And we don't have a PSD writer. So the only thing that could get lost at this moment is an app (not part of OIIO, because we would fix that) reading Targa and then writing it back out, losing the thumbnail. Why would you read a Targa and then write a Targa? You're not transcoding. Maybe you're modifying the image itself? But if you do that, the thumbnail is very likely no longer correct anyway.

So maybe thumbnails deserve their own thing. But that does nothing to solve things like the ID manifest. Though maybe that's not a real problem, either, considering they probably aren't present in any file that the reader wouldn't want that info, I don't know.

Shootfast commented 3 years ago

I would definitely vote for the opt-out of metadata approach, either via the "no big things" or "list of things to ignore" mode. Being at the begining of the pipeline I want everything possible to come along for the ride, but I can see how people at the tail end would just want the pixels.

Almost none of the formats we use in VFX support thumbnails (OpenEXR does not, BTW).

https://github.com/AcademySoftwareFoundation/openexr/blob/master/src/lib/OpenEXR/ImfPreviewImageAttribute.h ? Unless you meant within OIIO, in which case yes, our OpenEXR plugin doesn't currently support reading or writing thumbnails.

The only downside to this is that old programs would drop them on the floor, unless updated to make the extra calls. But do we care?

I would love support for both reading, writing, and autogenerating thumbnails from within OIIO, and I don't think older apps not having the API support would matter too much. Nobody is currently screaming about the lack of support. A separate API for the thumbnails makes sense to me (definitely not treated as a subimage).

DougRogers commented 3 years ago

Another option is to do what FBX does. You specify everything you wish to read upfront via flags.

    if (GetFBXImporter()->IsFBX())
    {
        FbxIOSettings *ioSettings = GetFBXSDKManager()->GetIOSettings();

        IOS_REF.SetBoolProp(IMP_FBX_CONSTRAINT, false);
        IOS_REF.SetBoolProp(IMP_FBX_CONSTRAINT_COUNT, false);
        IOS_REF.SetBoolProp(IMP_FBX_MATERIAL, true);
        IOS_REF.SetBoolProp(IMP_FBX_TEXTURE, true);
        IOS_REF.SetBoolProp(IMP_FBX_LINK, true);
        IOS_REF.SetBoolProp(IMP_FBX_SHAPE, true);
        IOS_REF.SetBoolProp(IMP_FBX_GOBO, false);
        IOS_REF.SetBoolProp(IMP_FBX_ANIMATION, true);
        IOS_REF.SetBoolProp(IMP_FBX_GLOBAL_SETTINGS, true);
        IOS_REF.SetBoolProp(IMP_FBX_CURRENT_TAKE_NAME, true);
    }

  lStatus = GetFBXImporter()->Import(GetFBXScene());

FbxIOSettings Class Reference

lgritz commented 3 years ago

@shootfast Right you are! I had forgotten that OpenEXR has a thumbnail option. I guess OIIO's lack of it for the past 13 years has not gotten anybody upset enough to even point out that I never got around to it.

@DougRogers Yeah, that's akin to the configuration hint about what to read. I think the correct default is to read everything, but an exclusion list makes sense, and there can be an alias for "exclude everything but the dimensional info".

So I think this is an initial plan for what seems to be resonating (both with you guys as well as in my own head):

For later, we can contemplate more lazy behavior. One thing I was thinking is that since all user-side access to the metadata is through the spec() call, we could add an optional parameter to spec() that says "I want it all" or "just the basics". The default would be "all", preserving old behavior. Calling with "all" would do the extra work if not already done (it still would only happen once), and the initial open would only provide the basics. Of course, any other API calls would call spec(all) internally if their operations needed to know the rest. This, too, is changing the call signatures, so would not be able to be backported. But I think it is straightforward.

DougRogers commented 3 years ago

Of course, any other API calls would call spec(all) internally if their operations needed to know the rest. This, too, is changing the call signatures, so would not be able to be backported. But I think it is straightforward.

Perhaps you could add an optional spec config argument that defaults to 'all' to maintain compatibility.

lgritz commented 3 years ago

@DougRogers Yes, exactly what I had in mind.

nrusch commented 3 years ago

I'm a little late to the party here, but I like the idea of an optional "exclusion" list, and I'm curious what you're thinking in terms of API.

My assumption after reading through this thread is that exclusions would be specified as names. If so, it may be a good idea (or even necessary) to support pattern matching of some kind, especially if you're doing something like examining a large number of images of unknown provenance (similar to the original issue description), that, for whatever reason, may use inconsistent metadata keys.

That use-case then made me wonder if the additional option would actually make more sense as an include filter (or if both include and exclude modes could be supported), although I still think pattern-matching would be important in that scenario. Am I off in the weeds here?

lgritz commented 3 years ago

The most general approach could be maybe "thing1,thing2,thing3" or "all,-thing4,-thing5" so it covers both inclusion and exclusion. But is a fine-grained control really needed?

I'm still mulling this over in my head and I even started sketching a prototype last night and now I am less sure of some of this. We already mentioned thumbnails as deserving their own API calls and no longer being part of the metadata. There are very few other cases in very few formats where there might be a concrete performance benefit to excluding particular pieces of metadata. I can see it being a big bummer for all the readers to be fully plumbed with extra complexity and regex checking for every metadata item. We wouldn't want to slow down the usual case of wanting everything for the sake of slightly speeding up the very rare case of wanting a specific subset of things. I think the relatively common case (once we have excluded thumbnails) is wanting nothing beyond the dimensional information. Maybe that's all we should do for now, get the first stage done, and then see what appetite there is for more?

So after some fooling around with it last night and sleeping on it, now I'm thinking like this:

DougRogers commented 3 years ago

I would add some items that should be part of the minimal set:

lgritz commented 3 years ago

If you look at the ImageSpec definition, there are hard fields for most of those things -- resolution, channels, data type, tile size, whether it's a deep file, etc. Then everything else, all the "arbitrary metadata" gets shoved in the extra_attribs list.

ImageSpec even has two copy methods, operator= which copies everything (including a deep copy of the whole extra_attribs paramlist), and copy_dimensions() which omits extra_attribs and the channel names and is very inexpensive.

So that's a pretty natural dividing line.

lgritz commented 3 years ago

It starts to get muddy if you want to ask for the presence or size of a piece of expensive metadata (the metametadata?), but don't want the data itself. It's a lot less clear how often (across items and file formats) you would be able to know those things without incurring most of the expense of just getting the whole thing.

DougRogers commented 3 years ago

I would exclude those from the basic data and just include that type of data if requested.

lgritz commented 3 years ago

There are several COMMON cases:

We've got those covered already.

We also have a concrete proposal for these less common but plausible cases:

We can dismiss the cases where you want to exclude a subset of the small cheap things, because there's no benefit to adding extra logic to skip them. You would never be able to measure the perf gain, and you might even come out behind if the cost of checking item by item was more than just copying the small items.

With thumbnails disposed of, how many cases of large/expensive metadata are there? I can think of only a few cases -- ICC profiles maybe? Name manifests in OpenEXR objectID files? Mesh connectivity in PTex files? Anything else? It's not clear how often there exists any file that has more than one of them, and if there are, when you would want one of those but not the other, and if you did, whether there would be any measurable perf benefit to being that fine-grained.

You have to also understand that for many of the file formats, the underlying reader library may have already gone to the expense of reading the whole header up front as soon as it opens, so imagining that we are saving significant expense by not asking for a few items is closing the barn door after the horse is long gone.

DougRogers commented 3 years ago

You could have several levels of data.

1) Simple, size and spec items 2) Med. Thumbnails 3) Image 4) large meta-data

lgritz commented 3 years ago

I had forgotten about this, but in looking at imageio.h, I just realized that the ImageInput already has separate spec() and a spec_dimensions() calls, with the latter returning an ImageSpec copy that only has the important fields filled in. It seems like without breaking the interface, we could easily make it so that most of the metadata is not read until spec() is called or until some other internal functionality needs other metadata. If all the user does is open() and then get spec_dimensions(), it would never worry about most of the metadata. So the simple 2-stage lazy metadata reading is already half plumbed and could even be backported without breaking APIs.

nrusch commented 3 years ago

All great points. In that light, the "K.I.S.S." changes you outlined make good sense to me.

alister-chowdhury commented 3 years ago

Just because I haven't seen anyone mention it yet as a pretty big usecase: Generally when streaming in tiles (especially from many different textures), I don't believe there is any need at all for reading any metadata beyond the layout (planar/separate if relevant), format, dimension and mip/subimage count.

So much so, that in things I've set up to deal with mipmapping, actually attempt to clear as much arb metadata as possible (SHA-1, Comments, Commandline etc) to prevent any additional reading during rendering.

DougRogers commented 3 years ago

TIFF is reading a lot of data during open().

TIFFInput::readspec() is reading decoding xmp data

// Search for an XML packet containing XMP (IPTC, Exif, etc.)
int xmlsize         = 0;
const void* xmldata = NULL;
if (TIFFGetField(m_tif, TIFFTAG_XMLPACKET, &xmlsize, &xmldata)) {
    // std::cerr << "Found XML data, size " << xmlsize << "\n";
    if (xmldata && xmlsize) {
        std::string xml((const char*)xmldata, xmlsize);
        decode_xmp(xml, m_spec);
    }
}
MrKepzie commented 2 years ago

Bumping this - I also encountered PNGs that are very slow to open due to XMP metadata. Here's a sample file attached HUD_LowerLeft.png.zip