Possible bug in parametric curve handling inside cmsPipeline?

amyspark commented 2 years ago

Hi again,

I'm writing some code to generate an YCbCr ICC profile. For V2, in the YCbCr -> XYZ direction, I need to pack the full transformation in the LUT, as follows:

Demo

```cpp #include #include #include #define RESOLUTION 24 // Source: Tooms (2015), table 11.1, p.192 constexpr cmsCIExyY d65 = {0.3127, 0.3290, 1.0}; // Inverse OETF curve // // Source: basic algebra on ITU-R BT.601-7, ss. 2.6.4 constexpr std::array rec601ParametersInv = {1.0 / 0.45, 1.0 / 1.099, 0.099 / 1.099, 1.0 / 4.5, 0.081}; // OETF curve // // Source: ITU-R BT.601-7, ss. 2.6.4 const std::array rec601Parameters = {0.45, 1.099, 0, 4.5, 0.018, -0.099, 0}; void log(cmsContext ctx, unsigned int errorCode, const char *msg) { std::cerr << "context " << ctx << " error: " << errorCode << " (" << msg << ")" << std::endl; } cmsInt32Number sample(const cmsUInt16Number In[], cmsUInt16Number Out[], void *cargo) { cmsPipelineEval16(In, Out, reinterpret_cast(cargo)); return TRUE; } // From the V4 profile above EXTRACT: // - cmsSigMediaWhitePointTag // - any of the cmsSigRedTRCTag, cmsSigGreenTRCTag, cmsSigBlueTRCTag // - cmsSigChromaticAdaptationTag // and use with the YCbCr profile cmsHPROFILE createBaseRec709Profile(cmsContext ctx) { // Elle Stone's prequantized sRGB primaries const cmsCIExyYTRIPLE sRGBPrimariesPreQuantized = {{0.639998686, 0.330010138, 1.0}, {0.300003784, 0.600003357, 1.0}, {0.150002046, 0.059997204, 1.0}}; auto toneCurveInv = cmsBuildParametricToneCurve(ctx, 4, rec601ParametersInv.data()); const std::array curves = {toneCurveInv, toneCurveInv, toneCurveInv}; return cmsCreateRGBProfileTHR(ctx, &d65, &sRGBPrimariesPreQuantized, curves.data()); } void setupMetadata(cmsContext ctx, cmsHPROFILE profile) { cmsMLU *description = cmsMLUalloc(ctx, 1); cmsMLUsetASCII(description, "en", "US", "ITU-R BT.601-7 YCbCr ICC V2 profile"); cmsWriteTag(profile, cmsSigProfileDescriptionTag, description); } int main() { cmsSetLogErrorHandlerTHR(nullptr, log); auto ctx = cmsCreateContext(nullptr, nullptr); auto baseProfile = createBaseRec709Profile(ctx); cmsSaveProfileToFile(baseProfile, "srgb.icc"); auto yCbrProfile = cmsCreateLab2Profile(&d65); setupMetadata(ctx, yCbrProfile); // Strict transformation between YCbCr and XYZ cmsSetDeviceClass(yCbrProfile, cmsSigColorSpaceClass); cmsSetColorSpace(yCbrProfile, cmsSigYCbCrData); cmsSetPCS(yCbrProfile, cmsSigXYZData); cmsSetHeaderRenderingIntent(yCbrProfile, INTENT_PERCEPTUAL); // The YCbCr -> XYZ conversion goes as follows: auto yCbrPipeline = cmsPipelineAlloc(ctx, T_CHANNELS(TYPE_YCbCr_16), T_CHANNELS(TYPE_XYZ_16)); // 0. Dummy curves for the "gamma"-corrected YCbCr. auto pipeline1_M = cmsStageAllocToneCurves(ctx, T_CHANNELS(TYPE_YCbCr_16), nullptr); // 1. Chrominance channels are [-0.5, 0.5]. Adjust. // The offset is pre-applied before the transform. const std::array identity = {{1, 0, 0, 0, 1, 0, 0, 0, 1}}; const std::array offset_ycbcr_to_rgb = {0, -0.5, -0.5}; auto yCbrOffset = cmsStageAllocMatrix(ctx, T_CHANNELS(TYPE_YCbCr_16), T_CHANNELS(TYPE_YCbCr_16), identity.data(), offset_ycbcr_to_rgb.data()); // 2. YCbCr -> normalized R'G'B. Source: Wolfram Alpha, inverted matrix. const std::array ycbcr_to_rgb = {{0.916602, 1.46156, 0.0287004, 1.04248, -0.744473, -0.358755, 1., 4.93315e-17, 1.772}}; auto yCbrMatrix = cmsStageAllocMatrix(ctx, T_CHANNELS(TYPE_YCbCr_16), T_CHANNELS(TYPE_RGB_16), ycbcr_to_rgb.data(), NULL); // 2. Normalized R'G'B -> linear RGB. Source: ITU-R BT.601-7 ss. 2.6.4 auto trc = reinterpret_cast(cmsReadTag(baseProfile, cmsSigRedTRCTag)); const std::array gamma = {trc, trc, trc}; auto pipeline1_B = cmsStageAllocToneCurves(ctx, T_CHANNELS(TYPE_RGB_16), gamma.data()); // 3. Linear RGB -> XYZ. const std::array rgb_to_xyz = {{0.4124, 0.3576, 0.1805, 0.2126, 0.7152, 0.0722, 0.0193, 0.1192, 0.9505}}; auto pipeline1_C = cmsStageAllocMatrix(ctx, T_CHANNELS(TYPE_RGB_16), T_CHANNELS(TYPE_XYZ_16), rgb_to_xyz.data(), nullptr); // Assemble the YCbCr -> XYZ pipeline. cmsPipelineInsertStage(yCbrPipeline, cmsAT_END, yCbrOffset); cmsPipelineInsertStage(yCbrPipeline, cmsAT_END, yCbrMatrix); cmsPipelineInsertStage(yCbrPipeline, cmsAT_END, pipeline1_B); // M = OETF cmsPipelineInsertStage(yCbrPipeline, cmsAT_END, pipeline1_C); // Matrix = RGB -> XYZ auto lut1 = cmsStageAllocCLut16bit(ctx, RESOLUTION, T_CHANNELS(TYPE_YCbCr_16), T_CHANNELS(TYPE_XYZ_16), nullptr); cmsStageSampleCLut16bit(lut1, &sample, yCbrPipeline, 0); // This LUT is then saved to the profile auto p = cmsPipelineAlloc(ctx, T_CHANNELS(TYPE_YCbCr_16), T_CHANNELS(TYPE_XYZ_16)); cmsPipelineInsertStage(p, cmsAT_END, pipeline1_M); // A = dummy curves // The CLUT is needed because AtoB0 in v2 can only pack a CLUT. cmsPipelineInsertStage(p, cmsAT_END, lut1); // CLUT = YCbr -> XYZ cmsPipelineInsertStage(p, cmsAT_END, cmsStageDup(pipeline1_M)); // B = dummy curves cmsWriteTag(yCbrProfile, cmsSigAToB0Tag, p); // The XYZ -> YCbCr conversion goes as follows: auto yCbrPipeline2 = cmsPipelineAlloc(ctx, T_CHANNELS(TYPE_XYZ_16), T_CHANNELS(TYPE_YCbCr_16)); // 0. Dummy curves for the gamma-uncorrected YCbCr. auto pipeline2_M = cmsStageAllocToneCurves(ctx, 3, NULL); // 1. XYZ -> Linear RGB. const std::array xyz_to_rgb = {{3.2406, -1.5372, -0.4986, -0.9689, 1.8758, 0.0415, 0.0557, -0.2040, 1.0570}}; auto pipeline2_C = cmsStageAllocMatrix(ctx, T_CHANNELS(TYPE_XYZ_16), T_CHANNELS(TYPE_RGB_16), xyz_to_rgb.data(), NULL); // 2. Linear RGB -> Normalized R'G'B. Source: ITU-R BT.601-7, ss. 2.6.4 auto trcI = cmsBuildParametricToneCurve(ctx, 5, rec601Parameters.data()); const std::array gamma_i = {trcI, trcI, trcI}; cmsStage *pipeline2_B = cmsStageAllocToneCurves(ctx, 3, gamma_i.data()); // 3. Normalized R'G'B -> YCbCr. Source: ITU-R BT.601-7, ss. 3.3 const std::array rgb_to_ycbcr = { {0.299, 0.587, 0.114, 0.701 / 1.402, -0.507 / 1.402, -0.114 / 1.402, -0.299 / 1.772, -0.587 / 1.772, 0.886 / 1.772}}; // 4. Chrominance channels are [-0.5, 0.5]. Adjust. // The offset is applied after the transform, so no additional matrix is // needed. const std::array offset_i = {0, 0.5, 0.5}; auto pipeline2_Matrix = cmsStageAllocMatrix(ctx, T_CHANNELS(TYPE_RGB_16), T_CHANNELS(TYPE_YCbCr_16), rgb_to_ycbcr.data(), offset_i.data()); cmsPipelineInsertStage(yCbrPipeline2, cmsAT_END, pipeline2_C); // Matrix = XYZ -> RGB cmsPipelineInsertStage(yCbrPipeline2, cmsAT_END, pipeline2_B); // M = OETF^-1 cmsPipelineInsertStage(yCbrPipeline2, cmsAT_END, pipeline2_Matrix); auto lut2 = cmsStageAllocCLut16bit(ctx, RESOLUTION, T_CHANNELS(TYPE_XYZ_16), T_CHANNELS(TYPE_YCbCr_16), NULL); cmsStageSampleCLut16bit(lut2, &sample, yCbrPipeline2, 0); auto p2 = cmsPipelineAlloc(ctx, T_CHANNELS(TYPE_XYZ_16), T_CHANNELS(TYPE_YCbCr_16)); cmsPipelineInsertStage(p2, cmsAT_END, pipeline2_M); // B = dummy cmsPipelineInsertStage(p2, cmsAT_END, lut2); // CLUT = R'G'B' -> YCbr cmsPipelineInsertStage(p2, cmsAT_END, cmsStageDup(pipeline2_M)); // A = dummy cmsWriteTag(yCbrProfile, cmsSigBToA0Tag, p2); // Bradford transform from D65 (BT.709-6) to D50 (ICC 4.3) auto bradford = cmsReadTag(baseProfile, cmsSigChromaticAdaptationTag); cmsWriteTag(yCbrProfile, cmsSigChromaticAdaptationTag, bradford); if (!cmsMD5computeID(yCbrProfile)) std::cerr << "Failed MD5 computation" << std::endl; if (!cmsSaveProfileToFile(yCbrProfile, "krita_bt601-7_ycbcr_v2.icc")) std::cerr << "CANNOT WRITE PROFILE" << std::endl; } ```

However, doing so completely breaks the gamut of the profile as viewed in Krita's Color Space Selector. I've narrowed it down to the type 4 parametric curve I use for the linearization step after the YCbCr conversion; if I replace it by another type, e.g. cmsBuildGamma(ctx, 2.2), the problem disappears. This also doesn't occur in V4 profiles, where this step can be stored explicitly in the M curves of the AtoB0 tag.

@mm2 I'm not sure what is going on here, would you take a look?

amyspark commented 2 years ago

UPD: I worked around it by replacing the parametric curve by a tabulated version, e.g.

Example

```cpp auto trc = [ctx, baseProfile]() { auto trc = reinterpret_cast(cmsReadTag(baseProfile, cmsSigRedTRCTag)); std::array test{}; for (auto i = 0; i < 1024; i++) { cmsFloat32Number x = 1.0f/ 1024.0f * (cmsFloat32Number)i; test[i] = cmsEvalToneCurveFloat(trc, x); } return cmsBuildTabulatedToneCurveFloat(ctx, test.size(), test.data()); }(); ```

I believe this is definitely a bug, though not sure where inside LCMS.

mm2 commented 2 years ago

A quite complicated code, I see. May I ask for a small snippet that demonstrates the issue? I mean, a number across a parametric curve type 4 that results in something not good. Thanks

amyspark commented 2 years ago

I mean, a number across a parametric curve type 4 that results in something not good.

Based on my workaround, this issue seems to happen only in conjunction with a complete pipeline. I tried keeping only the gamma-RGB-XYZ part and the result's gamut is correct, though.

amyspark commented 2 years ago

@mm2 I generated two separate profiles; one with the parametric curve swapped for the tabulated version, the other without. I sampled both pipelines with a 24x24x24 CLUT on both 16bit and floating point.

The first diverging point I found was:

(lldb) fr s 9
frame #9: 0x0000000100003c1e testClut`sample(In=0x00007ffeefbff160, Out=0x00007ffeefbfef60, cargo=0x00007ffeefbff1f0) at test.cpp:19:9
   16       cmsPipelineEvalFloat(In, test, x->b);
   17  
   18       if (Out[0] != test[0] || Out[1] != test[1] || Out[2] != test[2])
-> 19           throw std::runtime_error("FAIL!");
   20  
   21       return TRUE;
   22   }
(lldb) fr v
(const cmsFloat32Number *) In = 0x00007ffeefbff160
(cmsFloat32Number *) Out = 0x00007ffeefbfef60
(void *) cargo = 0x00007ffeefbff1f0
(pipelines *) x = 0x00007ffeefbff1f0
(cmsFloat32Number [3]) test = ([0] = 0.111345083, [1] = 0.222674906, [2] = 0.0371099412)
(lldb) parray 3 In
(const cmsFloat32Number *) $0 = 0x00007ffeefbff160 {
  (const cmsFloat32Number) [0] = 0
  (const cmsFloat32Number) [1] = 0
  (const cmsFloat32Number) [2] = 0
}
(lldb) parray 3 Out
(cmsFloat32Number *) $1 = 0x00007ffeefbfef60 {
  (cmsFloat32Number) [0] = 0.00772106508
  (cmsFloat32Number) [1] = 0.17367819
  (cmsFloat32Number) [2] = 0
}

YCbCr [0, 0, 0] is mapped to XYZ [0.00772106508, 0.17367819, 0] but XYZ [0.111345083, 0.222674906, 0.0371099412] in the CLUT version.

For 16-bit, the first diverging point is also [0, 0, 0]:

(cmsUInt16Number [3]) test = ([0] = 7297, [1] = 14593, [2] = 2432)
(lldb) parray 3 In
(const cmsUInt16Number *) $0 = 0x00007ffeefbff180 {
  (const cmsUInt16Number) [0] = 0
  (const cmsUInt16Number) [1] = 0
  (const cmsUInt16Number) [2] = 0
}
(lldb) parray 3 Out
(cmsUInt16Number *) $1 = 0x00007ffeefbff080 {
  (cmsUInt16Number) [0] = 506
  (cmsUInt16Number) [1] = 11382
  (cmsUInt16Number) [2] = 0
}

Let me know if you need the complete profile generation code for each instance.

mm2 commented 2 years ago

Not sure about which code are debugging. Anyway if you compare 3D LUT interpolation versus a math expression, the results hardly depends on how you build the LUT. Keep in mind LUT's are interpolated linearly (tetrahedral or trilinear) and that means the indexing space should be as much linear as possible and the domain should be as wide as possible. Linear because the output will be linearly interpolated inside the cube of near nodes, and wide domain because you need to use as many nodes as possible. This is the reason of pre and post linearization tables, to provide linear space for 3D LUT.

Example: a function f(x) = 2x -3 works fine in a 3D LUT, a function f(x) = 2x^2 does not because is is not linear, but using a linearization table t(x) = x^2 things got simplified,

mm2 commented 2 years ago

Not further comments so I close the issue

amyspark commented 2 years ago

Hi @mm2, I did not comment further because I don't know how else to demonstrate that there is a bug with how LittleCMS processes parametric curves as part of V2 pipelines. I already attached how to generate a profile demonstrating the bug, and how to work around it.

mm2 commented 2 years ago

OK let's reopen it

mm2 / Little-CMS

Possible bug in parametric curve handling inside cmsPipeline? #301