gopro / cineform-sdk

The GoPro® CineForm video codec SDK.
Apache License 2.0

Bayer format description #49

Open tthbaltazar opened 4 years ago

tthbaltazar commented 4 years ago

Hi! I've been trying to use the encoder with a Bayer image. Other than the very vague description "Raw Bayer 12-bits per component, packed line of 8-bit then line a 4-bit reminder" (CFHDTypes.h:145), I can't figure out what the format is.

Is it two pixels combined as two bytes of MSBs, then one byte containing the 4-bit LSBs for the two? Or is it the MSBs for all the pixels, then the LSBs for all the pixels? I've even tried one byte of MSBs followed by one byte with the remaining 4 bits.

I've spent hours trying different formats and I'm always getting a cropped image with different parts overlaid in colors. Google doesn't turn up anything since this is not a common thing.

tthbaltazar commented 4 years ago

I've made a few examples. I'm using ffmpeg to put the stream into a mov, and DaVinci Resolve to view it.

This is what it's supposed to look like. This isn't demosaiced. It's a lossless JPEG; you can see the Bayer pixels when you zoom in. [image: bw-bayer]

When using interleaved 12 bit it looks like this: [image: 12-interleaved]

When separating the 4 bits to the end of the line it looks like this: [image: 8-then-2]

When using two bytes per pixel it looks like this: [image: 16-bit]

dnewman-gpsw commented 4 years ago

Look at ConvertBYR5ToFrame16s() in frame.c:5473. All the MSBs, then all the LSBs.

    outB = (uint8_t *)uncompressed_chunk;
    outB += srcrow * srcwidth * 4 * 3/2; //12-bit
    outN = outB;
    outN += srcwidth * 4;

where outB is the bytes, and outN the nibbles.
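To make the "all the MSBs, then all the LSBs" layout concrete, here is a minimal sketch of packing one row that way. The function names are hypothetical and not part of the SDK. The nibble order inside each nibble byte (even sample in the low half) is an assumption that the thread does not confirm, as is the order in which the Bayer samples are arranged within a row. In the quoted snippet one row holds srcwidth * 4 samples, which gives the srcwidth * 4 * 3/2 byte stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes per packed row: one MSB byte per sample plus half a byte of LSBs. */
static size_t packed_row_bytes(size_t samples)
{
    return samples + samples / 2;
}

/* Pack one row of 12-bit samples: all high bytes first, then the low nibbles.
 * ASSUMPTION: the even sample's nibble sits in the low half of each nibble
 * byte. The thread does not state the nibble order. */
static void pack_row_12(const uint16_t *src, size_t samples, uint8_t *dst)
{
    uint8_t *out_b = dst;            /* like outB: the 8-bit MSB bytes   */
    uint8_t *out_n = dst + samples;  /* like outN: the packed nibbles    */
    size_t i;

    assert(samples % 2 == 0);
    for (i = 0; i < samples; i++)
        out_b[i] = (uint8_t)((src[i] >> 4) & 0xFF);
    for (i = 0; i < samples; i += 2)
        out_n[i / 2] = (uint8_t)((src[i] & 0x0F) | ((src[i + 1] & 0x0F) << 4));
}

/* Inverse of pack_row_12, useful for checking a round trip. */
static void unpack_row_12(const uint8_t *src, size_t samples, uint16_t *dst)
{
    const uint8_t *in_b = src;
    const uint8_t *in_n = src + samples;
    size_t i;

    assert(samples % 2 == 0);
    for (i = 0; i < samples; i++) {
        uint8_t nib = (i % 2 == 0) ? (uint8_t)(in_n[i / 2] & 0x0F)
                                   : (uint8_t)(in_n[i / 2] >> 4);
        dst[i] = (uint16_t)((in_b[i] << 4) | nib);
    }
}
```

If the decoded image shows fine-grained noise rather than gross misalignment, swapping the two nibble halves is the first thing to try.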

tthbaltazar commented 4 years ago

Now I'm confused as to how Cineform refers to width. Bayer images don't have color channels. Why is the width multiplied by 4 as if there were 4 channels? Should I reduce my resolution by 4 when calling CFHD_PrepareEncoderPool? Am I supposed to make a single pixel from RGGB?

dnewman-gpsw commented 4 years ago

Ultimately Bayer is encoded as four difference channels, something like G1+G2, R-G', B-G' and G1-G2; this extracts redundancy and improves compression. I don't remember the resolution setup for BYR4/5 pixels, and unfortunately that is not in the sample code. However, there are only two options to try: the full native resolution of the monochrome/sensor image, or half that horizontally and vertically (the red or blue channel resolution).
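As an illustration of the "something like" decomposition described above, the following sketch turns one RGGB quad into four channels and back. It is not the codec's actual transform: the type and function names are hypothetical, and taking G' as the floor of the average of the two greens is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Four channels derived from one RGGB quad (hypothetical layout). */
typedef struct {
    int32_t gsum;   /* G1 + G2 */
    int32_t rg;     /* R - G'  */
    int32_t bg;     /* B - G'  */
    int32_t gdiff;  /* G1 - G2 */
} quad_channels;

static quad_channels quad_forward(int32_t r, int32_t g1, int32_t g2, int32_t b)
{
    quad_channels c;
    int32_t g_avg;

    assert(r >= 0 && g1 >= 0 && g2 >= 0 && b >= 0);
    g_avg = (g1 + g2) / 2;  /* ASSUMPTION: G' is the floor average of greens */
    c.gsum = g1 + g2;
    c.rg = r - g_avg;
    c.bg = b - g_avg;
    c.gdiff = g1 - g2;
    return c;
}

static void quad_inverse(quad_channels c,
                         int32_t *r, int32_t *g1, int32_t *g2, int32_t *b)
{
    int32_t g_avg = c.gsum / 2;

    /* gsum + gdiff = 2*G1 and gsum - gdiff = 2*G2, so both halves are exact. */
    assert((c.gsum + c.gdiff) % 2 == 0);
    *g1 = (c.gsum + c.gdiff) / 2;
    *g2 = (c.gsum - c.gdiff) / 2;
    *r = c.rg + g_avg;
    *b = c.bg + g_avg;
}
```

In smooth image regions the three difference channels stay close to zero, which is the redundancy the comment refers to.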

Re: "Am I supposed to make a single pixel from RGGB?" I'm not sure what you are asking here.

tthbaltazar commented 4 years ago

When using half resolution, the assert fails at IsFrameTransformable in encoder.c:1601. The same happens with a fourth of the resolution, and with only half the width. The resolution for the monochrome image is 4048x3044. All my previous example images were in this resolution. This is also set in the mov headers.

olafmatt commented 4 years ago

For writing CineForm RAW files I have been using CFHD_PIXEL_FORMAT_BYR4 for many years now. This is basically "monochrome" 16-bit little-endian integer. Just treat your Bayer pattern image as a single monochrome file and tell the encoder the real size of it. And make sure your image width is an integer multiple of 16 (which seems to be the case looking at your numbers).

As for that 12-bit format, it wouldn't surprise me (looking at your screenshots) if that is a "planar" format, in that it first stores a plane of the higher 8 bits and then a plane of the (packed) leftover lower 4 bits.
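A rough sketch of preparing such a buffer is below: it checks the multiple-of-16 width rule and writes each sample as 16-bit little-endian regardless of host byte order. The helper names are hypothetical. Whether the encoder expects 12-bit sensor data shifted up to the full 16-bit range is not stated in this thread, so the shift is left as a parameter and any particular value is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Row pitch in bytes for a single-plane 16-bit frame, or 0 if the width
 * does not satisfy the multiple-of-16 rule mentioned in the thread. */
static size_t byr4_pitch(size_t width)
{
    if (width == 0 || width % 16 != 0)
        return 0;
    return width * 2;
}

/* Copy sensor samples into dst as 16-bit little-endian values.
 * shift: left shift applied to every sample. ASSUMPTION: the right value
 * (for example 4 for 12-bit data) depends on what the encoder expects. */
static void fill_byr4(const uint16_t *sensor, size_t width, size_t height,
                      unsigned shift, uint8_t *dst)
{
    size_t pitch = byr4_pitch(width);
    size_t x, y;

    assert(pitch != 0);
    assert(shift < 16);
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            uint16_t v = (uint16_t)(sensor[y * width + x] << shift);
            dst[y * pitch + 2 * x]     = (uint8_t)(v & 0xFF);  /* low byte first */
            dst[y * pitch + 2 * x + 1] = (uint8_t)(v >> 8);
        }
    }
}
```

For the 4048x3044 frame from this thread, byr4_pitch(4048) gives 8096 bytes per row, and the full sensor width and height would be passed to the encoder unchanged.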

tthbaltazar commented 4 years ago

I managed to get CFHD_PIXEL_FORMAT_BYR4 working. Thanks! I still need to figure out how to specify the white balance.

olafmatt commented 4 years ago

The white balance goes as a float[4] into the metadata using the TAG_WHITE_BALANCE tag. Those four values represent R, G1, G2 and B scales. However, internally the SDK then only uses the value for G1 and ignores G2. The scales can also be used to adjust the overall white level (in case your white level is not full-scale) by just rolling that scaling into each of the white-balance scales.
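A small sketch of building that float[4] follows. The function name and parameters are hypothetical, and interpreting "rolling that scaling in" as multiplying every gain by full_scale / white_level is an assumption. The call that attaches the array to the metadata under TAG_WHITE_BALANCE is deliberately left out, since its signature is not shown in this thread.

```c
#include <assert.h>

/* Fill wb[4] in R, G1, G2, B order.
 * ASSUMPTION: a sensor whose white level is below full scale is compensated
 * by multiplying every gain by full_scale / white_level. */
static void make_white_balance(float r_gain, float g_gain, float b_gain,
                               float white_level, float full_scale,
                               float wb[4])
{
    float level_scale;

    assert(white_level > 0.0f && white_level <= full_scale);
    level_scale = full_scale / white_level;
    wb[0] = r_gain * level_scale;  /* R                                        */
    wb[1] = g_gain * level_scale;  /* G1: the value the SDK is said to use     */
    wb[2] = g_gain * level_scale;  /* G2: reportedly ignored, kept equal to G1 */
    wb[3] = b_gain * level_scale;  /* B                                        */
}
```

Keeping G2 equal to G1 costs nothing and avoids surprises if a later SDK version starts reading it.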

The color matrix is a float[3][4] set with TAG_COLOR_MATRIX. This one is a 3x3 matrix with an extra column specifying a black-level offset. The order of calculations inside the SDK is: take out source transfer curve -> apply matrix (with white balance rolled in) -> add black level (offsets also multiplied by black-level, it seems) -> put transfer curve back on.
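The order of operations described here can be sketched roughly as follows. This is an illustration, not the SDK's code: the function names are hypothetical, the transfer curve is passed in as a pair of callbacks because the actual curve is not given in this thread, and the extra scaling of the offsets mentioned above is not modeled.

```c
#include <assert.h>

typedef float (*curve_fn)(float);

/* Placeholder curve for experiments; a real pipeline would supply the
 * source transfer curve and its inverse here. */
static float identity_curve(float v)
{
    return v;
}

/* out = M3x3 * in + offset, where the offset is the fourth column. */
static void apply_color_matrix(const float m[3][4], const float in[3],
                               float out[3])
{
    int i;
    for (i = 0; i < 3; i++)
        out[i] = m[i][0] * in[0] + m[i][1] * in[1] + m[i][2] * in[2] + m[i][3];
}

/* Remove the curve, apply the matrix and offset, put the curve back on. */
static void process_pixel(const float m[3][4], curve_fn decode,
                          curve_fn encode, const float in[3], float out[3])
{
    float lin[3], mixed[3];
    int i;

    assert(decode != 0 && encode != 0);
    for (i = 0; i < 3; i++)
        lin[i] = decode(in[i]);
    apply_color_matrix(m, lin, mixed);
    for (i = 0; i < 3; i++)
        out[i] = encode(mixed[i]);
}
```

With identity_curve for both callbacks the function reduces to the plain matrix-plus-offset step, which is convenient for checking a matrix in isolation.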

olafmatt commented 4 years ago

One more thing while we're talking about Bayer RAW... there is a bug in the code that makes debayer crash for files that are more than 8K wide. In Codec/DemoasicFrames.cpp:4661 there is a fix labeled "could not handle 4.5K RAW". But the fix just moved the problem to > 8K.

A simple workaround might be to replace that fixed-size buffer[] with a dynamically allocated one using 'alloca(bayer_pitch * sizeof(unsigned short));', but I'm aware of the fact that even this might fail for some ridiculously large frames that might exceed what alloca() can do.
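For frames too large for alloca(), a heap allocation is the usual alternative. The sketch below is not a patch against the SDK: the function names are hypothetical, and bayer_pitch is treated as an element count only because the alloca() expression above multiplies it by sizeof(unsigned short).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a line buffer of bayer_pitch unsigned shorts on the heap.
 * Returns NULL on a zero size, on size overflow, or when malloc fails. */
static unsigned short *alloc_line_buffer(size_t bayer_pitch)
{
    if (bayer_pitch == 0 || bayer_pitch > SIZE_MAX / sizeof(unsigned short))
        return NULL;
    return (unsigned short *)malloc(bayer_pitch * sizeof(unsigned short));
}

/* Example use: clear one line's worth of samples, then release the buffer.
 * Returns 0 on success and -1 if the buffer could not be allocated. */
static int clear_one_line(size_t bayer_pitch)
{
    unsigned short *buffer = alloc_line_buffer(bayer_pitch);

    if (buffer == NULL)
        return -1;
    memset(buffer, 0, bayer_pitch * sizeof *buffer);
    assert(buffer[bayer_pitch - 1] == 0);
    free(buffer);
    return 0;
}
```

Unlike alloca(), a failed malloc() can be detected and reported, at the cost of an allocation per call unless the buffer is cached and reused across frames.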