gabrieleilertsen / hdrcnn

HDR image reconstruction from a single exposure using deep CNNs
https://computergraphics.on.liu.se/hdrcnn/
BSD 3-Clause "New" or "Revised" License

10bit HDR #13

Closed guoyejun closed 6 years ago

guoyejun commented 6 years ago

Hi,

to reconstruct the image and then show it on a 10-bit display, what should I do based on the model output (float values)?

for example:

```python
y_gamma = np.power(np.maximum(y_predict, 0.0), ??)  # is gamma correction needed here?
out = np.minimum(y_gamma, 1.0)                      # clip to [0, 1]
final_output = np.uint16(out * 1023)                # 10-bit range, 1023 = 2^10 - 1
```

thanks

gabrieleilertsen commented 6 years ago

Hi. The simplest case would be to simply choose an exposure, clip the brightest pixels, and perform gamma correction,

y_display = 1024*min(1, (s*y_predict)^(1/g)),

where s is an exposure scaling and g is the gamma of your display (I guess the 10-bit display needs gamma correction, or can it be fed linear values?). Since an HDR display is also limited in its dynamic range, you can use more sophisticated tone-mapping operators if you want to display more of the dynamic range of the image. For example, you can use the pfstools command-line HDR applications, with the pfstmo tone-mapping library.
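As a sketch of that mapping in numpy (the function name `to_10bit` and the particular values of `s` and `g` are placeholders to be tuned for the actual display; note the maximum 10-bit code is 1023, not 1024):

```python
import numpy as np

def to_10bit(y_predict, s=0.5, g=2.2):
    """Map linear HDR predictions to 10-bit display codes.

    s is an exposure scaling and g is the display gamma; both are
    assumptions here, chosen per scene and per display.
    """
    y = np.power(np.maximum(s * y_predict, 0.0), 1.0 / g)  # exposure + gamma correction
    y = np.minimum(y, 1.0)                                 # clip the brightest pixels
    return np.round(y * 1023.0).astype(np.uint16)          # quantize to codes 0..1023

# example: values above 1/s saturate at code 1023
codes = to_10bit(np.array([0.0, 0.5, 2.0, 100.0]), s=0.5, g=2.2)
```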

guoyejun commented 6 years ago

thanks, i'll check whether a 10-bit display needs the gamma correction.

guoyejun commented 6 years ago

and, for the meaning of the model output, my understanding is that it is just like the natural light intensity of the real world, right? thanks.

gabrieleilertsen commented 6 years ago

It is supposed to be linear luminance, yes, but without absolute calibration. That is, if the actual physical luminance is L cd/m^2, then the output is y_predict = s*L for some unknown scaling s.

guoyejun commented 6 years ago

thanks, and can I assume that most values of the model output are within [0.0, 1.0]? Or, at least, that the model output for the training data is within [0.0, 1.0]? thanks.

gabrieleilertsen commented 6 years ago

The range [0, 1] corresponds to the LDR pixels. Saturated regions, where reconstructed information has been inferred, will be > 1.

guoyejun commented 6 years ago

i see, so we don't even know the largest possible value of the model output?

back to my original 10-bit issue, from another perspective: there are 10-bit videos which do not depend on the display (gamma corrected or not), so i'm wondering how to convert the model output to R10G10B10, which is like the equivalent of 10-bit video.

maybe i can first try with s=1 and g=2: y = 1024*min(1, (s*y_predict)^(1/g))

any suggestion? thanks a lot.

gabrieleilertsen commented 6 years ago

No, the maximum value will differ between scenes, just as in real-world scenes.

If you use scaling s=1, you will clip all the reconstructed information so that the result is the same as the original input LDR image. So you probably want to use s<1. For example, you could anchor a certain percentile to 1, e.g. as s=1/percentile(y_predict, 99) which means that 1% of the pixels will be clipped.
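The percentile anchoring could be sketched like this in numpy (the helper name `exposure_scale` and the toy pixel values are illustrative; the 99th percentile is just one possible choice):

```python
import numpy as np

def exposure_scale(y_predict, pct=99):
    """Anchor the given percentile of the prediction to 1.0,
    so that roughly (100 - pct)% of the pixels will be clipped."""
    return 1.0 / np.percentile(y_predict, pct)

# toy linear predictions; real output would be an HxWx3 float array
y_predict = np.array([0.1, 0.5, 1.0, 4.0, 16.0])
s = exposure_scale(y_predict)
scaled = np.minimum(s * y_predict, 1.0)  # exposure-adjusted and clipped
```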

The most versatile solution would be to apply some tone-mapping before converting to 10 bits. For example, in pfstmo mentioned above you could use the display adaptive TMO (pfstmo_mantiuk08) for adapting the HDR to a certain display (you don't explicitly need to know which display, but can use a generic 'typical' display you wish to target, with a certain dynamic range etc.).