DY112 / LSMI-dataset

[ICCV'21] Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm under Mixed Illumination
https://www.dykim.me/projects/lsmi

The blacklevel confusion of Sony sensor & Missed saturation clipping after wb #12

Closed shuwei666 closed 1 year ago

shuwei666 commented 1 year ago

I found that the black level of the Sony camera, read from the .arw raw file, is 512. However, your code (in 2_pre_process_data.py) seems to use another, manually set value of 128.

(screenshot attached)

Could you explain this discrepancy?

Thanks!

DY112 commented 1 year ago

Hi @shuwei666, thank you for your interest in our work. First, I had long forgotten that I had inserted that exceptional code. ~~Probably, when I wrote the code, I didn't realize that it is natural for some pixels to become negative after black level subtraction, and I may have used that value (128) to prevent that. It is correct to follow the metadata value (512) for Sony cameras as well.~~ (I explain why I used the 128 black level in a later comment.) I'll check whether there are any errors in the data preprocessing, visualization, or overall code when using the 512 black level, and then push a fix.
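As a side note, the negative-value issue mentioned above is easy to demonstrate with a tiny numpy sketch (the pixel values here are hypothetical): dark pixels below the black level go negative after subtraction, which is why pipelines usually clamp at zero.

```python
import numpy as np

# Hypothetical dark pixels near a sensor's black level.
pixels = np.array([100., 120., 130., 512., 4000.])

black_level = 128
subtracted = pixels - black_level        # dark pixels go negative
print(subtracted)                        # [-28.  -8.   2. 384. 3872.]
clipped = np.clip(subtracted, 0, None)   # clamp negatives to zero
print(clipped)                           # [  0.   0.   2. 384. 3872.]
```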

shuwei666 commented 1 year ago

Thanks Dongyoung!

In addition, I noticed that your code may have a slight mistake after the black level correction. It appears that you do not apply saturation clipping after white balancing, which results in bright pink colors, as shown below. Since multi-illuminant estimation is pixel-based, this oversight could affect the final performance.

(screenshot attached)

and here is the added code:

```python
img_wb = img / illum_map
img_wb = img_wb.clip(0, SATURATION)
```
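To illustrate why the clip matters (with hypothetical values; SATURATION stands in for the sensor white level): a fully saturated pixel divided by an illuminant map whose R and B chromaticities are below 1 overshoots the white level in those two channels, which is exactly what produces the pink highlights.

```python
import numpy as np

SATURATION = 4095.0  # hypothetical white level

# Fully saturated (clipped) sensor pixel, shape (1, 1, 3) for R, G, B.
img = np.array([[[SATURATION, SATURATION, SATURATION]]])
# Hypothetical per-pixel illuminant chromaticity, normalized so G = 1.
illum_map = np.array([[[0.8, 1.0, 0.7]]])

img_wb = img / illum_map
print(img_wb)            # R and B overshoot SATURATION -> pink highlight

img_wb_clipped = img_wb.clip(0, SATURATION)
print(img_wb_clipped)    # back to neutral [4095, 4095, 4095]
```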

Furthermore, I observed that your code in '2_pre_process_data.py' only masks the Macbeth chart in the training set, but not in the validation and test sets. This could lead to unfair comparisons with other methods; for example, a simple max-RGB method may also yield good results.

Anyway, thanks for your great work in the multi-illuminant estimation field. I hope these minor tips help make it better!

DY112 commented 1 year ago

I appreciate your attention to detail. However, if you train the network on an unclipped dataset, you can simply clip the network output to the known white_level value, as you did in your example. If we instead train the network with the clipped image as GT, as you suggest, then we can use the network output as is.

As for color chart masking, I understand that some classical algorithms can take advantage of the color chart, so the comparison may not be fair. However, if we only compare deep learning models, the assumption is that they learn from the training data, where the color chart is not visible, so masking only the training data should be fine.

I will add the caveats about saturation clipping and color chart masking to the README.

Thanks again!

DY112 commented 1 year ago

Hi again @shuwei666, I visualized the Sony raw data with some simple code, and now I can see why I set the black level to 128 exceptionally for the Sony data. When the image is converted to a tiff file using DCRAW, rather than keeping the original RAW image, the Galaxy and Nikon data show no change in pixel values, but the Sony pixel values are scaled by roughly 1/4. Therefore, 128 (the original RAW (.ARW) black level 512, scaled by 1/4) must be used instead of 512 to process the tiff correctly.

```python
import rawpy
import cv2
import numpy as np

raw = rawpy.imread('Place208_1.arw')
tiff_file = 'Place208_1.tiff'

white_level = raw.white_level
black_level = raw.black_level_per_channel[0]
wb_kernel = raw.camera_whitebalance
wb_kernel = (np.array(wb_kernel) / wb_kernel[1])[0:3]

raw_data = cv2.imread(tiff_file, cv2.IMREAD_UNCHANGED).astype(float)

black_subtracted = np.clip(raw_data - black_level, 0, white_level - black_level)

white_balanced = black_subtracted * wb_kernel[None, None, :]
white_balanced = np.clip(white_balanced, 0, white_level)
auto_bright = white_balanced / white_balanced.max()

gamma_corrected = np.power(auto_bright, 1 / 2.2)

print('tiff file min/max : ', raw_data.min(), raw_data.max())           # 109, 4157
print('arw file min/max : ', raw.raw_image.min(), raw.raw_image.max())  # 438, 16628
print('White level: ', white_level)
print('Black level: ', black_level)
print('White balance kernel: ', wb_kernel)
print('Raw data min/max: ', raw_data.min(), raw_data.max())
print('Black subtracted min/max: ', black_subtracted.min(), black_subtracted.max())
print('White balanced min/max: ', white_balanced.min(), white_balanced.max())

# resize to 1/4
gamma_corrected = cv2.resize(gamma_corrected, (0, 0), fx=0.25, fy=0.25)
cv2.imwrite('Place208_1.png', (gamma_corrected * 255).astype(np.uint8))
```
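The 1/4 scaling can be sanity-checked arithmetically from the printed min/max values (109–4157 in the tiff vs. 438–16628 in the .arw), and it also matches the exceptional 128 black level (512 / 4):

```python
# Quick arithmetic check of the 1/4 scaling reported above.
arw_max, tiff_max = 16628, 4157
arw_min, tiff_min = 438, 109

print(tiff_max * 4)   # 16628, equal to arw_max
print(tiff_min * 4)   # 436, close to arw_min (438)
print(512 / 4)        # 128.0, the black level that works on the tiff
```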

shuwei666 commented 6 months ago

Hi again,

This Sony sensor seems a bit odd. I noticed it is advertised as a 14-bit sensor, but in that case its black level should be around 512. However, according to your analysis, 128 is what yields correct output, which suggests you might have saved in a 12-bit mode. In other words, the effective bit depth of this Sony sensor's RAW data is actually 12 bits, right?

DY112 commented 6 months ago

I don't know what a 12-bit saving mode is. Is it a mode of DCRAW, or of the Sony camera? As I explained in my last answer, the RAW data (the values from the .arw file) is 14-bit, but after we convert the .arw to tiff using DCRAW, the tiff data becomes 12-bit (scaled by 1/4).

shuwei666 commented 6 months ago

You can check the mode if you have the Sony Alpha 9 camera.

It seems that the Sony Alpha 9 is operating in a 12-bit depth mode, since a black level of 128 and a saturation point of 4095 correspond to a 12-bit range.

Another possibility is that the camera might not truly have the 14-bit depth it advertises. A histogram analysis (255 bins) of the camera's raw output reveals that pixel values fall predominantly in the lower bins, up to approximately bin 50 (50 * 64 = 3200, close to the 12-bit maximum of 4095). Conversely, higher values occur far less often. This pattern suggests that the effective bit depth achieved by the camera is closer to 12 bits, contrary to the claimed 14 bits.
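The bin arithmetic above works out as follows (assuming a 255-bin histogram over the nominal 14-bit range, so each bin spans roughly 64 raw codes):

```python
# Histogram-bin arithmetic for the effective bit depth argument.
full_range = 2 ** 14          # 16384 codes if the sensor were truly 14-bit
n_bins = 255
bin_width = full_range / n_bins
print(bin_width)              # ~64.25 raw values per bin

top_occupied_bin = 50
print(top_occupied_bin * 64)  # 3200: most pixel values fall below this
print(2 ** 12 - 1)            # 4095: the 12-bit ceiling
```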

That may be the reason why you tried a black level of 512 but it failed!

Bingo!

(histogram screenshots attached)