AcademySoftwareFoundation / OpenImageIO

Reading, writing, and processing images in a wide variety of file formats, using a format-agnostic API, aimed at VFX applications.
https://openimageio.readthedocs.org
Apache License 2.0
1.97k stars 597 forks source link

[BUG] 2.5.10.1 significantly slower than 2.5.9.0 #4375

Closed p0las closed 2 months ago

p0las commented 2 months ago

Describe the bug

both compiled in exact same way on windows 10 with the same dependencies. oiiotool -- version takes 20s+ in 2.5.10.1 (and all newer including 2.6 builds)

it is likely that we compile the newer versions incorrectly and maybe we were meant to update something that we didn't. We went from a very old 2.4 to the latest and noticed the slowdown. then we compiled a lot of versions to find the one that introduced the issue. 2.5.9 is the last usable for us. everything newer is very slow on windows 10. Looking at the release notes of 2.5.10 there is nothing obvious that changed that could explain the slowness.

OpenImageIO version and dependencies

OIIO 2.5.9.0 | unknown arch?
    Build compiler: MSVS 1938 | C++14/199711
    HW features enabled at build: sse2
Dependencies: Boost 1.80.0, BZip2 1.0.8, DCMTK NONE, FFmpeg 4.4, fmt 9.1.0, Freetype 2.12.1, GIF NONE, Libheif 1.13.0, libjpeg-turbo 2.1.4, LibRaw NONE, OpenColorIO 2.1.2, OpenCV NONE, OpenEXR
    3.1.5, OpenJPEG NONE, PNG 1.6.38, Ptex NONE, Ptex NONE, Robinmap, TBB NONE, TIFF 4.4.0, WebP NONE, ZLIB 1.2.13
OIIO 2.5.10.1 | unknown arch?
    Build compiler: MSVS 1938 | C++14/199711
    HW features enabled at build: sse2
Dependencies: Boost 1.80.0, BZip2 1.0.8, DCMTK NONE, FFmpeg 4.4, fmt 9.1.0, Freetype 2.12.1, GIF NONE, Libheif 1.13.0, libjpeg-turbo 2.1.4, LibRaw NONE, OpenColorIO 2.1.2, OpenCV NONE, OpenEXR
    3.1.5, OpenJPEG NONE, PNG 1.6.38, Ptex NONE, Ptex NONE, Robinmap, TBB NONE, TIFF 4.4.0, WebP NONE, ZLIB 1.2.13

To Reproduce

Steps to reproduce the behavior:

  1. compile 2.5.10 or newer on windows 10 using vcpkg

cmake -DCMAKE_TOOLCHAIN_FILE=c:/DEV/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows -DCMAKE_PREFIX_PATH=c:\DEV\vcpkg\installed\x64-windows -DUSE_PYTHON=0 -DUSE_QT=0 -DOIIO_BUILD_TESTS=0 ..

cmake --build . --config Release --clean-first

  1. oiiotool --version
lgritz commented 2 months ago

That's very unexpected!

HELP WANTED!!!

I don't have access to a Windows machine. Can anybody else reproduce this?

Two things I would recommend as the next steps to investigate:

  1. I know it's a bit of trouble, but there are only 20 commits between 2.5.9 and 2.5.10, and 7 of them are docs, administrative, or only change the CI (so can't be responsible). Can somebody bisect that span and determine exactly which commit is the one where things slowed down?

  2. Can anybody run this in a profiler and identify where this time is being spent?

The fact that this happens with oiiotool --version means it must be initialization of something static, since it's not reading or writing any images at all.

None of the commits between 2.5.9 and 2.5.10 look like they should slow anything down, so I'm really confused.

p0las commented 2 months ago

after updating vspkg and recompiling with more recent dependencies it seems to be fast again.

this one is good:

OIIO 2.5.10.1 | unknown arch?
    Build compiler: MSVS 1938 | C++14/199711
    HW features enabled at build: sse2
Dependencies:  , Boost 1.85.0, BZip2 1.0.8, DCMTK NONE, FFmpeg 6.0, Freetype NONE, GIF NONE, Libheif NONE, libjpeg-turbo 3.0.3, LibRaw NONE, OpenColorIO 2.2.1, OpenCV NONE, OpenEXR 3.2.4,
    OpenJPEG NONE, PNG NONE, Ptex NONE, Ptex NONE, Robinmap, TBB NONE, TIFF 4.6.0, WebP NONE, ZLIB 1.3.1
lgritz commented 2 months ago

I'd like to hold this open a bit longer, in the hopes that we can understand what's going on.

You're still using OIIO 2.5.10.1, but your dependencies changed from

Boost 1.80.0, BZip2 1.0.8, DCMTK NONE, FFmpeg 4.4, fmt 9.1.0, Freetype 2.12.1, GIF NONE, Libheif 1.13.0, libjpeg-turbo 2.1.4, LibRaw NONE, OpenColorIO 2.1.2, OpenCV NONE, OpenEXR 3.1.5, OpenJPEG NONE, PNG 1.6.38, Ptex NONE, Ptex NONE, Robinmap, TBB NONE, TIFF 4.4.0, WebP NONE, ZLIB 1.2.13

to (my highlighting changes):

Boost 1.85.0, BZip2 1.0.8, DCMTK NONE, FFmpeg 6.0, Freetype NONE, GIF NONE, Libheif NONE, libjpeg-turbo 3.0.3, LibRaw NONE, OpenColorIO 2.2.1, OpenCV NONE, OpenEXR 3.2.4, OpenJPEG NONE, PNG NONE, Ptex NONE, Ptex NONE, Robinmap, TBB NONE, TIFF 4.6.0, WebP NONE, ZLIB 1.3.1

whew, that's a lot of changes.

I am still a little suspicious of OpenColorIO here, because we do have quite a bit of behind-the-scenes changes in logic depending on what version of OCIO we encounter -- 2.1 and 2.2 may be quite different. So that's still my leading candidate, but it's purely intuition, not because I have any convincing evidence.

lgritz commented 2 months ago

Also, regarding OCIO... do you by any chance have an OCIO config in your environment?

I do recall at some point discovering that for certain OCIO versions and certain old OCIO configs with certain LUTs that are expensive to invert, our startup inventory and categorization of the color spaces got to be quite expensive. I think we fixed that, but I don't recall precisely when that occurred.

p0las commented 2 months ago

yes. I had a simple OCIO v1 config. config-redshift-aces_cg.zip

a lot of dependencies in the first version were old as I re-used an existing dev environment. I probably should have updated the dependencies but it was meant to be a quick job :-).

lgritz commented 2 months ago

Aha, ok, I think (without strong evidence) that this is probably another case of the combination of some known inefficiencies when using older OCIO 2.x with certain expensive (LUT-based) OCIO 1.x configs.

Even so, many seconds is too long, and if somebody can reproduce it in a meaningful way, isolate to a particular commit (if it changed at some point), or make any headway on exactly where it's spending the time, we'd certainly still like to look at it to ensure it's really fixed and doesn't recur. Let's reopen the issue if that occurs.

But if @p0las is now fine after their upgrade and nobody else can reproduce, I think we should move on since we're close to a new release that dramatically brings up the minimum OCIO version that we're using. So will close the issue now.