martin-marek / hdr-plus-swift

📸Night mode on any camera. Based on HDR+.
https://burst.photo
GNU General Public License v3.0

[Feature Request] - Support for non-Bayer CFAs #2

Open SZim92 opened 2 years ago

SZim92 commented 2 years ago

Would love to see support for X-Trans based sensors in this program and in hdr-plus-pytorch, although I realize it is a bit further down the priority list.

Users may also be interested in seeing support for Quad Bayer, RYYB, Nonacell, and Foveon X3 (among others).

martin-marek commented 2 years ago

Hmm. Are any of those sensor types supported by DNG?

My idea for this app was to make it support as many cameras as possible without any loss of image quality during image reading / writing. This is not possible with processed image formats (JPEG, PNG, ...) nor with libraries like libraw (you lose too much image quality during reading / writing). DNG is a standardized format that solves these problems.

So I definitely want to stick with DNG. The question is: can I do anything to make DNG work with these sensor types?

Btw, hdr-plus-pytorch is based on libraw and thus achieves much lower image quality than hdr-plus-swift.

martin-marek commented 1 year ago

The latest version (1.2) should have basic support for X-Trans – can you try it out?

The output resolution is not going to be optimal (because the mosaic pattern is 6x6 instead of 2x2) but this could be improved over time. For now, you should at least get the same noise reduction as with Bayer sensors.

Alex-Vasile commented 1 year ago

Re: Foveon sensors.

The SD Quattro (normal and H) as well as the DP Quattros can write to DNG files (some examples from the SD Quattro: https://www.dpreview.com/opinion/9977248883/sigma-shoots-dng-raw). These use a 4-1-1 sensor (B-G-R), so each output pixel has its own blue pixel on the sensor, but the red and green are shared. So alignment will have to be handled differently for the different layers.

The other cameras, e.g. the DP Merrill, will only write .x3f files, but there is a third-party tool to convert them into DNG format (https://github.com/Kalpanika/x3f). These use a 1-1-1 sensor, so each output pixel has full RGB data in the raw. These will have to be treated essentially the same as an already demosaiced raw (i.e. #7 will need to be fixed to work rather than throw an error).

martin-marek commented 1 year ago

I already tested Fuji X-Trans .RAF files. In Burst Photo 1.2, these files should work, including conversion, alignment, and merging.

@Alex-Vasile is right that demosaiced raw files are currently not supported.

Alex-Vasile commented 1 year ago

> I already tested Fuji X-Trans .RAF files. In Burst Photo 1.2, these files should work, including conversion, alignment, and merging.
>
> @Alex-Vasile is right that demosaiced raw files are currently not supported.

For the record, they aren't "demosaiced", since the point of the sensor is to avoid demosaicing and its issues in the first place.

Which part of the code depends on them being non-demosaiced?

martin-marek commented 1 year ago

Currently, when reading / writing DNG files, we only load / store data into the "stage1" image – i.e. the mosaiced image.

It would be nice to only work with demosaiced images – this would improve the output resolution. The DNG SDK can demosaic images, but I believe it's too slow (since it runs on the CPU). Ideally, Burst Photo could read both "stage1" images (and demosaic them on the GPU) and "stage2" images, which are already demosaiced.

Alignment+merging would then need to be slightly modified, but that's not too much work.

Lastly, we would need to save stage2 images using the DNG SDK.

All of this is doable, I'm just not sure how much work it would take.
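
For reference, here is a minimal Swift sketch of that read path. All names below (RawImage, demosaicOnGPU, prepareForAlignment) are hypothetical placeholders, not the app's actual API:

```swift
import Metal

// Hypothetical types/functions to illustrate the proposed flow;
// not the actual hdr-plus-swift API.
enum RawImage {
    case stage1(MTLTexture)   // mosaiced CFA data straight from the DNG
    case stage2(MTLTexture)   // already-demosaiced RGB data
}

// Placeholder for a GPU demosaic pass (would dispatch a Metal compute kernel).
func demosaicOnGPU(_ mosaiced: MTLTexture, device: MTLDevice) -> MTLTexture {
    // ... encode and dispatch a demosaic kernel here ...
    return mosaiced
}

// Pick the right path per image: demosaic stage1 data on the GPU,
// pass stage2 data through unchanged.
func prepareForAlignment(_ raw: RawImage, device: MTLDevice) -> MTLTexture {
    switch raw {
    case .stage1(let texture):
        return demosaicOnGPU(texture, device: device)
    case .stage2(let texture):
        return texture
    }
}
```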

chris-rank commented 1 year ago

With support for the Bayer pattern and Fuji X-Trans, I think you have already covered >99% of cameras out there (mobile phones excluded). Foveon and other exotic formats are very rare. To improve the quality of X-Trans, you may have a look at the X-Trans pattern. When you reduce the mosaic pattern size to 3x3, the resulting pattern is not fully symmetric, but it should be suitable for conversion to gray-scale and thus for alignment of the tiles. This should improve the output resolution without fundamental changes to your app.
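
For illustration, a rough CPU-side Swift sketch of that idea, assuming a flat mosaic buffer; the function name and layout are illustrative only, and a real implementation would be a Metal kernel:

```swift
// Average each 3x3 block of the X-Trans mosaic into one grayscale value,
// producing a lower-resolution image that is only used for tile alignment.
func xtransToGrayscale(mosaic: [Float], width: Int, height: Int) -> [Float] {
    let outW = width / 3, outH = height / 3
    var gray = [Float](repeating: 0, count: outW * outH)
    for y in 0..<outH {
        for x in 0..<outW {
            var sum: Float = 0
            for dy in 0..<3 {
                for dx in 0..<3 {
                    sum += mosaic[(y * 3 + dy) * width + (x * 3 + dx)]
                }
            }
            gray[y * outW + x] = sum / 9   // mean of the 3x3 block
        }
    }
    return gray
}
```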

martin-marek commented 1 year ago

Thanks @chris-rank, great suggestion! I've just implemented the change.

chris-rank commented 1 year ago

I think I have an idea how already demosaiced DNGs (RGB pixels) and Foveon 1-1-1 RGB data may be easily processed by the app. It would only require remapping the data to a texture with 2x2 the resolution. For each 2x2 pixel block in that texture, we would store the R, G, and B values as well as the mean of the three values as the 4th pixel value. From that point on, the processing would (hopefully) be identical to the standard process. After the merging, another remapping back to the original file structure would be required. The main difficulty I see is detecting all these "special" cases where the re-mapping should apply, e.g. detecting different mobile phones and the Foveon sensor from the input DNGs.
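
For clarity, a minimal CPU-side Swift sketch of that remapping, assuming an interleaved RGB input buffer; names and layout are illustrative, and the real implementation would run as a Metal kernel:

```swift
// Expand each RGB pixel into a 2x2 block of a single-channel "mosaic-like"
// buffer: R, G, B, plus the mean of the three as the 4th value.
func remapRGBToMosaic(rgb: [Float], width: Int, height: Int) -> [Float] {
    let outW = width * 2, outH = height * 2   // 2x resolution in each dimension
    var out = [Float](repeating: 0, count: outW * outH)
    for y in 0..<height {
        for x in 0..<width {
            let r = rgb[(y * width + x) * 3 + 0]
            let g = rgb[(y * width + x) * 3 + 1]
            let b = rgb[(y * width + x) * 3 + 2]
            out[(y * 2 + 0) * outW + (x * 2 + 0)] = r
            out[(y * 2 + 0) * outW + (x * 2 + 1)] = g
            out[(y * 2 + 1) * outW + (x * 2 + 0)] = b
            out[(y * 2 + 1) * outW + (x * 2 + 1)] = (r + g + b) / 3
        }
    }
    return out
}
```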

martin-marek commented 1 year ago

Sounds great! I think the read_image function in dng_utils/dng_sdk_wrapper.cpp should be able to return mosaic_pattern_width = 1 when mosaic_info == NULL or negative->fMosaicInfo->fCFAPatternSize.h == 1. Then the align_and_merge function in align.swift could handle this as a special case, similar to Fuji X-Trans.

I could have easily missed something but I guess something similar should work.
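
As a sketch of that dispatch on the returned pattern width (names are hypothetical; the actual code in align.swift may differ):

```swift
// Hypothetical mapping from mosaic_pattern_width to a handling mode;
// illustrative only, not the actual code in align.swift.
enum MosaicKind {
    case demosaiced   // mosaic_pattern_width == 1 (Foveon 1-1-1, phone DNGs)
    case bayer        // mosaic_pattern_width == 2
    case xtrans       // mosaic_pattern_width == 6
}

func mosaicKind(forPatternWidth width: Int) -> MosaicKind? {
    switch width {
    case 1: return .demosaiced
    case 2: return .bayer
    case 6: return .xtrans
    default: return nil   // unsupported CFA layout
    }
}
```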

martin-marek commented 1 year ago

An alternative approach might be to store all textures as .rgba16Float instead of .r16Float (i.e. have 4 channels per pixel so the texture resolution would halve in each dimension). I think Metal can work faster with a single rgba pixel than four r pixels, if we always work with all four channels together.
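
For reference, a short sketch of that packed layout using the standard Metal API (function and variable names are illustrative):

```swift
import Metal

// One .rgba16Float texture at half resolution in each dimension,
// instead of a .r16Float texture at full resolution.
func makePackedTexture(device: MTLDevice, mosaicWidth: Int, mosaicHeight: Int) -> MTLTexture? {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .rgba16Float,   // 4 channels per texel
        width: mosaicWidth / 2,      // each texel covers a 2x2 mosaic block
        height: mosaicHeight / 2,
        mipmapped: false)
    descriptor.usage = [.shaderRead, .shaderWrite]
    return device.makeTexture(descriptor: descriptor)
}
```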

Alex-Vasile commented 1 year ago

> An alternative approach might be to store all textures as .rgba16Float instead of .r16Float (i.e. have 4 channels per pixel so the texture resolution would halve in each dimension). I think Metal can work faster with a single rgba pixel than four r pixels, if we always work with all four channels together.

If R, G, B, and A are constantly queried and used together, then storing them sequentially in memory should result in a speedup, since the CPU should have an easier time predicting what to load into memory and will require fewer lookups. (But you'll have to profile to be sure.)

chris-rank commented 1 year ago

The .rgba16Float textures are faster in the case where the 4 channels are processed independently. I actually convert the Bayer textures to rgba at a certain point, as the Fourier transform becomes faster then. Metal supports SIMD instructions, which can do the same calculation on a float4 vector for all 4 entries...
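
To illustrate the per-channel SIMD arithmetic: Metal's float4 behaves analogously to Swift's SIMD4&lt;Float&gt; shown below (the values are made up):

```swift
import simd

// One multiply applies the same operation to all four channels at once.
let pixel  = SIMD4<Float>(0.20, 0.40, 0.60, 0.40)   // e.g. r, g, b, mean
let gain   = SIMD4<Float>(repeating: 1.5)
let scaled = pixel * gain                            // elementwise multiply
```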

martin-marek commented 1 year ago

Great. So would it make sense to load both RGB (demosaiced / Foveon 1-1-1) and Bayer RAW images as .rgba? (Ignoring X-Trans for now).

Alex-Vasile commented 1 year ago

> Great. So would it make sense to load both RGB (demosaiced / Foveon 1-1-1) and Bayer RAW images as .rgba? (Ignoring X-Trans for now).

I think that might need a bit more thought: