jcelaya / hdrmerge

HDR exposure merging
http://jcelaya.github.io/hdrmerge/

Compressed/Uncompressed DNGs #191

Open fanckush opened 4 years ago

fanckush commented 4 years ago

Hello.

So... I'm lost again :) (as usual). I've collected some scattered information about how HDRMerge writes DNGs:

Source 1:

@DavidOliver as for darktable, i will be looking into rawspeed support for hdrmerge-produced DNG's in the coming ${timeframe}. Sad thing: we were pretty sure hdrmerge could produce uncompressed DNG's, but it does not. I do have a patch, but apparently it too needs changes in rawspeed.

Originally posted by @LebedevRI in https://github.com/LibRaw/LibRaw/issues/39#issuecomment-242189469

Source two: https://discuss.pixls.us/t/hdrmerge-status-update/2009/14 https://discuss.pixls.us/t/hdrmerge-status-update/2009/15

Source 3: #86

Source 4: from HDRMerge website:

In the last revision of the DNG SDK, version 1.4, Adobe introduced the possibility of encoding the data as 16-, 24- and 32-bit floating point numbers, instead of the usual 16-bit integers. In this way, the dynamic range that can be represented with such an encoding is vastly increased. Furthermore, the floating point encoding dedicates the same number of levels to each exposure step.


What is going on here?!

  1. If floating-point DNGs are what HDRMerge currently uses (16-, 24-, 32-bit), then what is the "normal" DNG type? Just 16-bit int? Is that the only difference? Also, what is meant by "the floating point encoding dedicates the same number of levels to each exposure step"?
  2. Does a floating-point DNG inherently mean that it's compressed?
  3. Can someone point me to a good article/wiki that explains the fundamental difference between compressed and uncompressed DNGs (not just that one is smaller than the other)?
  4. What does it mean to normalize the raw image data? Does it refer to the mathematical way we normalize vectors, or does it simply mean to scale the luminance values to perfectly fit into the selected bit depth (the available space)?

I hope my questions make sense. I am confused because, looking at ImageStack::compose, I noticed that HDRMerge doesn't take the selected bit depth into consideration... so after some digging I ended up here :)

LebedevRI commented 4 years ago

Source 1:

@DavidOliver as for darktable, i will be looking into rawspeed support for hdrmerge-produced DNG's in the coming ${timeframe}. Sad thing: we were pretty sure hdrmerge could produce uncompressed DNG's, but it does not. I do have a patch, but apparently it too needs changes in rawspeed.

Originally posted by @LebedevRI in LibRaw/LibRaw#39 (comment)

I'm reasonably sure RawSpeed now supports all the various kinds of uncompressed DNGs. As for hdrmerge DNGs (compressed floating-point), support was contributed in https://github.com/darktable-org/rawspeed/pull/17

What is going on here?!

  1. If floating-point DNGs are what HDRMerge currently uses (16-, 24-, 32-bit), then what is the "normal" DNG type? Just 16-bit int?

What is a normal DNG? (There is no such thing.) If you consider the DNGs that are produced by cameras, then yes, I'd say 16-bit int (Uncompressed or Lossless JPEG) is the most prevalent.

is that the only difference?

Again, depends on the baseline for the comparison.

also what is meant by "the floating point encoding dedicates the same number of levels to each exposure step"?

To me that reads as a very poor attempt to explain what floating point is, as compared to scalar integer.

  2. Does a floating-point DNG inherently mean that it's compressed?

Compression and data type/bit depth are separate fields in the TIFF/EXIF metadata (BitsPerSample, SampleFormat, Compression; see https://www.adobe.com/content/dam/acom/en/products/photoshop/pdfs/dng_spec_1.4.0.0.pdf).
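
For illustration, a minimal sketch (not HDRMerge code) that reads those three tags independently with libtiff. It assumes the image data of interest sits in the first IFD; real DNGs often keep the raw data in a SubIFD, so treat it as a toy:

```cpp
// Sketch: Compression, BitsPerSample and SampleFormat are independent tags.
#include <tiffio.h>
#include <cstdint>
#include <cstdio>

int main(int argc, char** argv) {
    if (argc < 2) return 1;
    TIFF* tif = TIFFOpen(argv[1], "r");
    if (!tif) return 1;
    uint16_t compression = 0, bitsPerSample = 0, sampleFormat = 0;
    TIFFGetField(tif, TIFFTAG_COMPRESSION, &compression);     // 1 = none, 7 = JPEG, 8 = Deflate, ...
    TIFFGetField(tif, TIFFTAG_BITSPERSAMPLE, &bitsPerSample); // 16, 24, 32, ...
    TIFFGetField(tif, TIFFTAG_SAMPLEFORMAT, &sampleFormat);   // 1 = unsigned int, 3 = IEEE float
    printf("Compression=%u BitsPerSample=%u SampleFormat=%u\n",
           (unsigned)compression, (unsigned)bitsPerSample, (unsigned)sampleFormat);
    TIFFClose(tif);
    return 0;
}
```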

  3. Can someone point me to a good article/wiki that explains the fundamental difference between compressed and uncompressed DNGs (not just that one is smaller than the other)?

I'm not sure what you want to see there. What are the differences between a vacuum-packed pillow and an unpacked one?

  4. What does it mean to normalize the raw image data?

You have pixel values from 0.0f up to 16384.0f (e.g., assuming 14-bit input), and you divide each pixel's value either by the maximal value (16384.0f), so that the entire data is in the 0.0f..1.0f range, or by the white level (which may be lower than the maximal pixel value), so that the good pixels are in the 0.0f..1.0f range and clipped pixels have values above 1.0f.
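
As a sketch of that (hypothetical code, not HDRMerge's actual implementation; `whiteLevel` is whatever the camera metadata reports):

```cpp
#include <cstdint>
#include <cstddef>
#include <vector>

// Normalize raw samples so that "good" pixels land in 0.0f..1.0f.
// Dividing by the white level (rather than the format maximum) leaves
// clipped pixels above 1.0f, which keeps them identifiable downstream.
std::vector<float> normalize(const std::vector<uint16_t>& raw, float whiteLevel) {
    std::vector<float> out(raw.size());
    for (size_t i = 0; i < raw.size(); ++i)
        out[i] = raw[i] / whiteLevel;
    return out;
}
```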

Does it refer to the mathematical way we normalize vectors?

Well, kinda.

or does it simply mean to scale the luminance values to perfectly fit into the selected bit depth (the available space)

There is no mathematical need to normalize floating-point values. Enter e.g. 17892 and 17892/16384 (1.092041015625) into https://www.h-schmidt.net/FloatConverter/IEEE754.html: as can be seen, that division only changes the exponent.
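
The same check can be done in a few lines of C++ (a sketch decomposing the IEEE-754 bit fields; not from this thread):

```cpp
#include <bit>
#include <cstdint>
#include <cstdio>

// Print the IEEE-754 binary32 fields of a value.
static void dumpBits(float f) {
    uint32_t u = std::bit_cast<uint32_t>(f);  // C++20
    std::printf("%16.12f  exponent=%3u  mantissa=0x%06x\n",
                f, (u >> 23) & 0xFFu, u & 0x7FFFFFu);
}

int main() {
    dumpBits(17892.0f);            // exponent field 141 (2^14), mantissa 0x0bc800
    dumpBits(17892.0f / 16384.0f); // exponent field 127 (2^0),  mantissa 0x0bc800
    return 0;
}
```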

I hope my questions make sense. I am confused because, looking at ImageStack::compose, I noticed that HDRMerge doesn't take the selected bit depth into consideration... so after some digging I ended up here :)

I really hope all internal calculations are done in 32-bit floating point, which would explain why it doesn't take a "bit depth" param.

heckflosse commented 4 years ago

@fanckush

also what is meant by "the floating point encoding dedicates the same number of levels to each exposure step"?

In int encoding you have two values for the first step (0 and 1), two values for the second step (2 and 3), four values for the third step (4, 5, 6, 7), eight values for the fourth step, and so on.

In float encoding you have the same number of values for each step, which also means that in 16-bit float encoding you have fewer distinct values in the 2048..65535 range compared to 16-bit int, but more distinct values in the 0..1024 range compared to 16-bit int. For the range 1024..2048 the number of distinct values is the same for 16-bit int and 16-bit float.

https://en.wikipedia.org/wiki/Half-precision_floating-point_format#Precision_limitations_on_decimal_values_in_[0,_1]
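
Those counts are easy to verify by brute force. A sketch (not from this thread) that enumerates every positive binary16 bit pattern and tallies how many distinct values land in each of the ranges above:

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

// Decode a raw IEEE-754 binary16 bit pattern to double.
static double halfToDouble(uint16_t h) {
    int exp = (h >> 10) & 0x1F;
    int man = h & 0x3FF;
    if (exp == 0)  return std::ldexp(man / 1024.0, -14);  // zero/subnormal
    if (exp == 31) return man ? NAN : INFINITY;           // NaN/Inf
    return std::ldexp(1.0 + man / 1024.0, exp - 15);      // normal
}

int main() {
    int low = 0, mid = 0, high = 0;
    for (uint32_t bits = 0; bits < 0x8000; ++bits) {      // positive patterns only
        double v = halfToDouble(static_cast<uint16_t>(bits));
        if (std::isnan(v) || std::isinf(v)) continue;
        if (v < 1024.0)        ++low;   // 16-bit int has 1024 values here
        else if (v < 2048.0)   ++mid;   // 16-bit int has 1024 values here too
        else if (v <= 65535.0) ++high;  // 16-bit int has 63488 values here
    }
    std::printf("[0,1024): %d  [1024,2048): %d  [2048,65535]: %d\n", low, mid, high);
    // Prints 25600, 1024 and 5120: more values than int below 1024,
    // the same in 1024..2048, far fewer above 2048.
    return 0;
}
```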

fanckush commented 4 years ago

@heckflosse what is meant here by "step" and "value"?

Everything I saw in HDRMerge so far uses 16-bit int, which personally makes sense to me because I find int linear and precise across its range.

Update: I was wrong, HDRMerge uses a float datatype 👍

LebedevRI commented 4 years ago
  • but what's the problem with int? isn't int encoding linear? one value per step?

That's precisely the problem. If you have an unsigned char, you can't put anything in it other than 0, 1, 2, ..., 253, 254, 255. So if after HDR merging you end up with 1.768, you have to round, which kills the data. Also, what if you somehow end up with 270? Again, you round (and clamp) and get huge artifacts.
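
A tiny illustration of both failure modes (hypothetical values; unsigned char used just to keep the numbers small):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>

int main() {
    float merged1 = 1.768f;  // sub-integer detail gets rounded away...
    float merged2 = 270.0f;  // ...and out-of-range values must be clamped
    uint8_t a = static_cast<uint8_t>(std::lround(merged1));
    uint8_t b = static_cast<uint8_t>(std::min(std::lround(merged2), 255L));
    std::printf("%.3f -> %u, %.1f -> %u\n", merged1, a, merged2, b); // 1.768 -> 2, 270.0 -> 255
    return 0;
}
```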

KevinJW commented 4 years ago

Storage encoding (linear, gamma encoded, logarithmic) is distinctly different from the representation (integer, floating point, fixed point etc).

@heckflosse is describing what happens with a linearly encoded integer representation for photographic stops, i.e. for each doubling in light, how many in-between values you can represent.

In rough terms (ignoring noise, black level, etc.): if the lowest stop is the difference between 0 and 1 (a range of 1, ~1 bit of precision), then the next stop is 2 times that (2 bits); double again and you have your next stop (3 bits). Each stop has more values representing its range, so as you keep doubling the range you can represent finer and finer differences, but you quickly run out of range due to the exponential increase each time.

In a floating point representation, you typically have the same number of available values per stop, but for the same storage space you lower your maximum precision.

E.g. for 16-bit half you have 11 bits of precision for normal values, which means that if you want to represent integer values you start to have issues above 2048, as you need more bits to count all possible integers; in that circumstance integer is the right storage form.

However, if you care about the smaller values, and you want to represent more than just integer values, then half can be better, as you will not need to round the values to the nearest integer.

In some circumstances, instead of trading off integer vs. float, people might choose a 'log' representation for the data. This can behave quite similarly to the float case, but you can play with the scaling to better represent the range and precision you need.
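
A toy sketch of such a log encoding (purely illustrative, not any standardized log curve; `minVal`/`maxVal` are assumed bounds):

```cpp
#include <cmath>
#include <cstdint>

// Toy log2 encoding: map [minVal, maxVal] onto 16-bit codes so that every
// doubling of the input (one stop) gets the same number of codes.
constexpr double minVal = 0.001;   // assumed lower bound (must be > 0)
constexpr double maxVal = 65535.0; // assumed upper bound

uint16_t logEncode(double x) {     // expects minVal <= x <= maxVal
    double t = (std::log2(x) - std::log2(minVal)) /
               (std::log2(maxVal) - std::log2(minVal));
    return static_cast<uint16_t>(std::lround(t * 65535.0));
}

double logDecode(uint16_t code) {
    double t = code / 65535.0;
    return std::exp2(std::log2(minVal) + t * (std::log2(maxVal) - std::log2(minVal)));
}
```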

kmilos commented 4 years ago

More about the different encoding schemes for HDR and the trade-offs @KevinJW mentions can be found here.

It seems to suggest that OpenEXR at 48 bits/pixel (16-bit half float per channel) strikes a good compromise, which is also what HDRMerge can use when storing into the DNG container (with DEFLATE lossless compression on top for size savings).