AcademySoftwareFoundation / OpenImageIO

Reading, writing, and processing images in a wide variety of file formats, using a format-agnostic API, aimed at VFX applications.
https://openimageio.readthedocs.org
Apache License 2.0

[BUG] Sony ARW medium and small compression 15 bit per sample files are unsupported #4361

Open lingyukongt opened 3 months ago

lingyukongt commented 3 months ago

Describe the bug

Importing a Sony ARW Medium or Small compression file with 15 bits per sample is unsupported in OIIO.

Other Sony ARW Medium or Small compression files which are not 15 bits per sample open correctly and are supported.

OpenImageIO version and dependencies

oiiotool --buildinfo
OIIO 2.5.12.0 | MacOS/ARM
    Build compiler: Apple clang 15.0 | C++14/201402
    HW features enabled at build: neon
Dependencies: Boost 1.85.0, BZip2 1.0.8, FFmpeg 6.0, fmt 10.2.1, Freetype
    2.13.2, GIF 5.2.2, Libheif 1.17.6, libjpeg-turbo 3.0.3, LibRaw 0.21.2,
    OpenColorIO 2.3.2, OpenEXR 3.2.4, OpenVDB NONE, PNG 1.6.43, pugixml 1.14,
    pybind11 2.12.0, Python 3.12.3, Robinmap, TBB 2021.12.0, TIFF 4.6.0, WebP
    1.4.0, ZLIB 1.2.12

To Reproduce

Steps to reproduce the behavior:

  1. Download the ARW file
  2. Try to convert to a different format with OIIO
  3. The error shown below appears.
  4. Expected: the output file is written.
oiiotool /Users/topazlabs/Downloads/A1A04973.ARW -o ~/Downloads/arw.png
oiiotool ERROR: read : Could not open file "/Users/topazlabs/Downloads/A1A04973.ARW", Unsupported file format or not RAW file
Full command line was:
> oiiotool /Users/topazlabs/Downloads/A1A04973.ARW -o /Users/topazlabs/Downloads/arw.png

Evidence

https://www.dropbox.com/scl/fi/y19jqq4q3rh1wv7810048/A1A04973.ARW?rlkey=pnrdm0oo0hlpbkep1goczk1da&st=ij4br4bn&dl=0

lgritz commented 2 months ago

We rely on libraw to decode that file type. I think that maybe this means that libraw itself doesn't handle that bit depth for that file?

I don't think there's a lot we can directly do until libraw supports it. Maybe you should file an issue against that project?

LibRaw commented 2 months ago

What version of LibRaw do you use? The Sony Small/Medium Pseudo-RAW format has been supported since the 202403 snapshot, published in our GitHub repo on Mar 29, 2024. Please upgrade if you're using an older version.

Here is the sample above, processed with dcraw_emu -T -w: https://www.dropbox.com/scl/fi/wm2ef2atvy8korg1nn32z/A1A04973.ARW.tiff?rlkey=4yt4so6osoevy74m129ka9p1d&dl=0

lgritz commented 2 months ago

Hi, @LibRaw , thanks so much for chiming in.

According to the log posted above, the OP was using LibRaw 0.21.2. I'm not sure exactly how that corresponds to the snapshot you refer to.

LibRaw commented 2 months ago

LibRaw 0.21.0 was released in 2022. At the time of this release, this format did not yet exist in Sony cameras.

0.21.xx releases are bugfix-only: no new cameras or formats are added, in order to preserve the API/ABI.

Our release policy is described in detail in the project description on GitHub and on the project homepage.

lingyukongt commented 2 months ago

Initial reproduction testing was done with Homebrew's oiiotool, which is version 2.5.12.0 with LibRaw 0.21.2 as you said, but our application uses OIIO 2.4.11.0 internally with LibRaw 20240710. The initial post used the latest OIIO from Homebrew because it also reproduced the failure to save, and our build system does not build oiiotool.

I'm not sure if a crash related to this was fixed between OIIO versions, but here is part of the crash report for reference.

Thread 17 Crashed:: QThread
0   libsystem_platform.dylib               0x18714e230 _platform_memmove + 144
1   libOpenImageIO.2.4.11.dylib            0x10d69c378 OpenImageIO_v2_4::convert_pixel_values(OpenImageIO_v2_4::TypeDesc, void const*, OpenImageIO_v2_4::TypeDesc, void*, int) + 240
2   libOpenImageIO.2.4.11.dylib            0x10d7b50f8 OpenImageIO_v2_4::RawInput::read_native_scanline(int, int, int, int, void*) + 1220
3   libOpenImageIO.2.4.11.dylib            0x10d693290 OpenImageIO_v2_4::ImageInput::read_native_scanlines(int, int, int, int, int, void*) + 180
4   libOpenImageIO.2.4.11.dylib            0x10d6933ac OpenImageIO_v2_4::ImageInput::read_native_scanlines(int, int, int, int, int, int, int, void*) + 180
5   libOpenImageIO.2.4.11.dylib            0x10d692ba4 OpenImageIO_v2_4::ImageInput::read_scanlines(int, int, int, int, int, int, int, OpenImageIO_v2_4::TypeDesc, void*, long long, long long) + 1328
6   libOpenImageIO.2.4.11.dylib            0x10d695da0 OpenImageIO_v2_4::ImageInput::read_image(int, int, int, int, OpenImageIO_v2_4::TypeDesc, void*, long long, long long, long long, bool (*)(void*, float), void*) + 1684

If a crash related to this was fixed we can try to update to OIIO 2.5 internally, but this will take us some time.

LibRaw commented 2 months ago

@lingyukongt Please make sure you use the matching LibRaw public header files (libraw.h) when building the wrapper you use (OpenImageIO):

LibRaw snapshots are NOT binary (ABI) compatible unless such compatibility is declared (e.g. within a single major version: 0.21.0, 0.21.1, 0.21.x). The internal layout may change without notice, and our C++ API does not isolate callers from this.

Alternatively, use the LibRaw C API, which provides such isolation.
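
One way a wrapper could guard against this kind of header/library mismatch is to compare the version compiled in from the headers with the version the loaded library reports. The sketch below is self-contained and does not include LibRaw. `make_version` is a hypothetical helper modeled on the packing used by `LIBRAW_MAKE_VERSION`, and `same_abi_family` encodes the assumption, taken from the comment above, that only versions sharing major and minor numbers are ABI compatible. In a real build the two inputs would presumably come from `LIBRAW_VERSION` and `LibRaw::versionNumber()`.

```cpp
// Hypothetical helper: pack a version as (major << 16) | (minor << 8) | patch,
// modeled on LIBRAW_MAKE_VERSION.
constexpr int make_version(int major, int minor, int patch)
{
    return (major << 16) | (minor << 8) | patch;
}

// Assumption: ABI compatibility holds only when major and minor match
// (e.g. 0.21.0 and 0.21.2), so the patch byte is shifted away before comparing.
constexpr bool same_abi_family(int header_version, int runtime_version)
{
    return (header_version >> 8) == (runtime_version >> 8);
}
```

A caller would pass the header version as the first argument and the runtime version as the second, and refuse to decode (or log a warning) when the result is false. A purely numeric check like this cannot distinguish two snapshots that happen to report the same version number, so it is a safety net rather than a substitute for building against the right headers.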

LibRaw commented 2 months ago

Just looked into raw.imageio/rawinput.cpp, read_native_scanline (HEAD/master fetched from github)

The cause is identified: read_native_scanline reads the imgdata.rawdata.raw_image pointer without first checking it for NULL.

But raw_image is valid for Bayer images only, while Sony YCC/Pseudo-RAW is a 3-color image, so color3_image or color4_image is populated instead: https://www.libraw.org/docs/API-datastruct-eng.html#libraw_rawdata_t

So, please update OpenImageIO to support 3/4 color images. Also, checking imgdata.rawdata.raw_image against NULL is strongly recommended.
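
The buffer selection described above might look roughly like the following. This is only a sketch, not OpenImageIO's actual code: `RawBuffers` is a stand-in that mirrors the three integer pointers of `libraw_rawdata_t` so the example compiles without LibRaw, and `RawLayout`, `detect_layout`, and `values_per_pixel` are hypothetical names. It assumes that at most one of the pointers is non-null after unpacking.

```cpp
#include <cassert>

// Stand-in for the integer buffers of LibRaw's libraw_rawdata_t (not the
// real type). Assumption: at most one pointer is non-null after unpacking.
struct RawBuffers {
    unsigned short* raw_image = nullptr;          // Bayer: 1 value per pixel
    unsigned short (*color3_image)[3] = nullptr;  // full color: 3 per pixel
    unsigned short (*color4_image)[4] = nullptr;  // full color: 4 per pixel
};

enum class RawLayout { None, Bayer, Color3, Color4 };

// Decide which buffer holds the pixels instead of assuming raw_image.
RawLayout detect_layout(const RawBuffers& b)
{
    if (b.raw_image)    return RawLayout::Bayer;
    if (b.color3_image) return RawLayout::Color3;
    if (b.color4_image) return RawLayout::Color4;
    return RawLayout::None;  // nothing decoded: report an error, do not read
}

// Values stored per pixel for a layout. Callers must handle None first.
int values_per_pixel(RawLayout layout)
{
    assert(layout != RawLayout::None);
    switch (layout) {
    case RawLayout::Bayer:  return 1;
    case RawLayout::Color3: return 3;
    default:                return 4;  // only Color4 remains
    }
}
```

A reader would call `detect_layout` once after unpacking and turn `RawLayout::None` into a regular read error rather than a crash.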

lgritz commented 2 months ago

We're closing in on releasing 3.0 (which is currently still called "2.6" in the master branch), and 2.5 is the supported release family. So 2.4 is 2 years old and we're no longer investigating problems in it. We haven't issued any patches for 2.4 for quite some time now, and since that predated any 2024 era libraw releases or snapshots, I would consider that untested and not a combination I would recommend.

@libraw:

I see now, 0.21.2 dates from January, so that is not the snapshot and thus doesn't contain support for the format in question. The new snapshot would be labelled 0.22.x, right? We do CI test against libraw master nightly, in addition to tagged releases, so I am confident that we build and run against your latest snapshots.

lgritz commented 2 months ago

@antond-weta Calling your attention to @LibRaw's comments above about Bayer images, since you seem to be currently working in our libraw code.

lingyukongt commented 2 months ago

Thanks for the investigation guys

LibRaw commented 2 months ago

@lgritz Please take a look at our update policy: https://www.libraw.org/#updatepolicy (the same text is on the GitHub page, but I do not know how to link to specific sections there).

Our snapshots are carefully tested against normal use cases (not specially crafted files, e.g. from fuzzers, but normal files from users' cameras). These snapshots are published as the master branch and are not tagged (it does not make sense, see below). After publication, only very minor fixes are added (including ones inspired by OSS-Fuzz and similar).

So, LibRaw/master is generally OK for end-user cases: no unknown data sources (e.g. fuzzed or specially crafted files), just files from the user's camera. It is not recommended for services open to everyone (e.g. processing uploads from anonymous users).

LibRaw commented 2 months ago

BTW, null pointer access is a good reason for an immediate fix (and CVE)

lgritz commented 2 months ago

We CI test against a range of libraw releases, as well as whatever the current master is.

But we don't supply pre-built binaries, nor strictly dictate what version of libraw that users/builders compile against (other than enforcing a minimum version, which is currently 0.18 in OIIO 2.5 but in our master was recently raised to 0.20). The judgment about which version of LibRaw to build against, or any feature vs reliability tradeoffs in that decision, is up to the downstream builder of OIIO. I think most of them just use whatever is the latest tagged release provided by their OS or packaging system of choice. So in general, I would only expect the latest snapshots to be used by people who make a proactive decision to build them from source and understand the tradeoffs.

lgritz commented 2 months ago

@LibRaw I really appreciate your input here, thanks.

LibRaw commented 2 months ago

This null-pointer-access problem is not Sony/YCC specific. There are multiple non-bayer/full-color raw formats, e.g: Linear DNG, Kodak YCC, Kodak RGB, fast-load DNG, Sony ARQ, Sony YCC, Canon sRAW, Nikon Small RAW.

Such files are processed correctly by LibRaw itself and then fail on the null-pointer access in the OIIO layer.
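
For illustration, a scanline read that fails cleanly for any of these layouts could be sketched as follows. This is not the code in rawinput.cpp: `RawBuffers` is a stand-in for the relevant pointers of `libraw_rawdata_t`, `copy_scanline` is a hypothetical function, and the sketch assumes tightly packed rows of `width` pixels, which may not match LibRaw's actual row pitch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the relevant pointers of libraw_rawdata_t (not the real type).
struct RawBuffers {
    const unsigned short* raw_image = nullptr;
    const unsigned short (*color3_image)[3] = nullptr;
    const unsigned short (*color4_image)[4] = nullptr;
};

// Copy row `y` into `out` as interleaved values. Returns false, leaving `out`
// empty, instead of dereferencing a null pointer when no buffer is available.
// Assumption: rows are tightly packed, `width` pixels per row.
bool copy_scanline(const RawBuffers& b, int width, int y,
                   std::vector<unsigned short>& out)
{
    assert(width > 0 && y >= 0);
    const std::size_t first = static_cast<std::size_t>(y) * width;
    out.clear();
    if (b.raw_image) {  // Bayer: one value per pixel
        out.assign(b.raw_image + first, b.raw_image + first + width);
        return true;
    }
    if (b.color3_image) {  // full color: three values per pixel
        for (int x = 0; x < width; ++x)
            out.insert(out.end(), b.color3_image[first + x],
                       b.color3_image[first + x] + 3);
        return true;
    }
    if (b.color4_image) {  // full color: four values per pixel
        for (int x = 0; x < width; ++x)
            out.insert(out.end(), b.color4_image[first + x],
                       b.color4_image[first + x] + 4);
        return true;
    }
    return false;  // no decoded buffer: let the caller report an error
}
```

The design choice here is that a missing buffer becomes an ordinary failure return, so a non-Bayer file on an unprepared code path produces an error message rather than a crash.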

lgritz commented 2 months ago

Understood, thanks. I'm hoping this is something that @antond-weta will roll into his next set of changes.

antond-weta commented 2 months ago

I'll take a look, thanks!