mm2 / Little-CMS

A free, open source, CMM engine. It provides fast transforms between ICC profiles.
https://www.littlecms.com
MIT License
549 stars 174 forks source link

Vector and matrix documentation fixes, and about curves #377

Closed ppaalanen closed 10 months ago

ppaalanen commented 1 year ago

I thought I'd contribute the few things I had to ask Marti by email in the past into the LittleCMS documentation (published as PDFs), but I cannot seem to find the sources for it.

How does one contribute to the docs?

I'd mainly clarify the layout of vectors and matrices in various places, as some API uses column-major and other API uses row-major matrices. I also noticed that some public functions from lcms2.h are not documented at all.

mm2 commented 1 year ago

I am not adding the word source file anymore because I found it was abused by selling it as a book. Please send me any improvements you would like in the documentation and I will be glad to review and add them.

BTW, some functions are not documented on purpose. Because the use is restricted or are not intended to be called by general API but rather by plugins.

ppaalanen commented 1 year ago

Oh dear. Ok.

Should API restricted to plugins be in lcms2_plugin.h and never in lcms2.h?

The undocumented functions are cmsGetToneCurveParametricType() and cmsGetToneCurveParams() so we can get a clue of what a cmsToneCurve contains. These look like the opposite of cmsBuildParametricToneCurve(), so it's hard to imagine how they would be unstable API. Are they incomplete though? It seems to me there can be types of cmsToneCurve where calling those two functions would be incorrect to begin with, and I don't know when those two can be called. Do we need cmsIsToneCurveMultisegment() returning false first? And if multi-segment is true, then there is no API to introspect that? That is, no getter matching cmsBuildSegmentedToneCurve()? How do we tell if a curve was defined as tabulated instead of parametric, and how many LUT elements does it have in that case? All this would be nice to find in the documentation.

We have a pipeline optimizer plugin (Weston, FOSS) that attempts to collapse consecutive curves into a single curve if possible, and forcing everything into LUT is not the best thing - two pure power functions would simply result in one new power function. We never use LittleCMS to apply color pipelines, instead we extract the operations in the plugin and program OpenGL shaders or Linux KMS (display controller) to execute the operations more efficiently. Hence, being able to introspect all stages of a pipeline in the plugin is essential. If we cannot do that, I understand we could add our own pipeline stage types, so we would need to replace all opaque stages with home-brewn ones. We do have the worst case fallback of sampling a whole pipeline into a 3D LUT, but obviously that has precision problems that may become prohibitive.

In the plugin API documentation:

In the public API documentation:

Other things I just noticed:

If some of these are defined by the ICC spec and you do not want to repeat the definition, then a reference to the ICC spec would be in order. It is hard to guess which API follows ICC spec and which API is your own.

Btw. I have seen and I think honored the "none of your business" comments in the headers. If more things are like that, then they would be good to mark so. Otherwise I just assume that they have been forgotten to document but are free to use based on which header they reside in.

butcherg commented 1 year ago

I'm also thinking through how to use LittleCMS in a GPU pipeline (vkdt). The current public API let's me get at the matrices, and build a LUT from a TRC, but there may be room for more interface...

@hanatos...

mm2 commented 1 year ago

@ppaalanen Thank you so much for your detailed explanation. I guess It will take me some days to add all this but is nice to have a list of the missing parts. I guess I could add much of what you are asking for on next weekend.

ppaalanen commented 1 year ago

@mm2, wonderful, thank you. I'll be happy to review them.

mm2 commented 1 year ago

@ppaalanen Ok, since this is a lot of work I have split it in several parts. 1st part, already done, is about cleaning the API for accessing tone curve segments. I have removed cmsGetToneCurveParams, which anyway was not documented, 03020d0fdd0f8a79f275657ad1ff757fe2cc4192 and added a new function cmsGetToneCurveSegment which hopefully solves most of the issues on iterating segments and getting parametric curve settings. 88b4c773d06368f1ecdd4b4ba1b75e1c6694531d

I am attaching here the new manual with this new function documented. It is a draft for new version. LittleCMS2.16 API.pdf

Next week I will review all VEC3 API. Then the stages. Regards

ppaalanen commented 1 year ago

Thanks, cmsGetToneCurveSegment looks good!

ppaalanen commented 1 year ago

We have Weston code about to land, using the new function to print pipeline information: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1245

The next step is using the new function for more than just debug.

The use of the function is gated by a build-time check for now.

ppaalanen commented 1 year ago

@mm2, is it expected that cmsGetToneCurveSegment returns no segments for a certain type of tabulated curves?

We have the following workaround to print such curves: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1297/diffs?commit_id=625a74d3692fd55d7cfc189fb6a1d89329e74e87

I see this case is called "limited precision". It has implicit domain of [0.0, 1.0] I presume. Is this a distinction you want to keep, or should cmsGetToneCurveSegment perhaps return a segment for this anyway?

I don't think it makes a big difference to us, it's just a bit more code in the user to account for this case. Future docs would be really nice to explain this.

mm2 commented 1 year ago

@ppaalanen, do you have a sample profile with such curves? I use empty curves for identity in built-in profiles, should not happen on valid file-based ones.

ppaalanen commented 1 year ago

Hmm, it seems to come from VCGT, but currently we manufacture ICC files with VCGT using LittleCMS ourselves, so I'm not sure if we do something wrong there.

This is one file that should exhibit it: test-vcgt.icm.zip

We don't load VCGT into video card directly, instead we put it as part of the pipeline so we don't rely on video cards to apply it. (The Wayland compositor is the only one who can load anything to a video card anyway, and nothing can bypass the compositor's color management, so this is safe.)

Essentially we get the curves by

    const cmsToneCurve * const *vcgt_curves;
...
    vcgt_curves = cmsReadTag(hProfile, cmsSigVcgtTag);

and I suspect that's where they come from.

mm2 commented 1 year ago

Yep, in this case the tone curve you get is a "limited precision" one. Fast to evaluate in 16 bits but with no segments or functions. So it has no curve at all, only a coarse preview in 16 bits, hence the "segments==0"

This is what manual says:

A 16 bit simplification of the *whole* curve is kept for optimization purposes. and cmsEvalToneCurve16: This function is significantly faster than cmsEvalToneCurveFloat, since it uses a pre-computed 16-bit lookup table.

Those curves are intended to be uploaded to the graphics card, not sure if concatenating them is a good idea.

ppaalanen commented 1 year ago

Those curves are intended to be uploaded to the graphics card, not sure if concatenating them is a good idea.

I know, and yes. We want to be using the display hardware color pipeline for a lot more than just VCGT, and VCGT will usually not even exist. We do need to use the VCGT if someone sets up a display profile with one, because aside from the Wayland compositor, there is literally nothing that could load it anywhere. It seems a waste to reserve a display hardware color pipeline stage just for VCGT. It wouldn't do anything significantly different than we do.

Curve segments can contain tabulated material as well, so VCGT does not seem like it needs to be special on its own right. All the higher precision information was already destroyed when an ICC file with a VCGT was created, and the file is all we have. Going forward, I don't really see why one would want to use a VCGT when nothing can bypass the compositor color management. I can see why VCGT is useful in the X11 world, as the display server does no color management at all.

Looking at it from the library API point of view, there seems to be two kinds of cmsToneCurve that need explicitly different handling. It's not a problem, just a bit more typing.

Tables loaded into monitors instead of video cards may be a whole another thing, I'm not criticizing them.

mm2 commented 10 months ago

Should be fixed in 2.16

ppaalanen commented 9 months ago

Thanks!

The cmsStageAllocMatrix() doc looks good. Just one thing raised my eye-brow:

Offset[]: Points to a vector of [Cols], NULL if no offset is to be applied.

The matrix is of size [Rows, Cols], so it has Rows number of rows, and Cols number of columns. The layout in memory is row-major. That's all fine. But how is it possible for the offset vector to have Cols elements, should it not be Rows?

Is the operation not out = matrix * in + offset?

If it is, then in must have Cols elements, and both offset and out must have Rows elements. This is not affected by the memory layout used by the matrix.

Or is that a very subtle way to say that it computes out = matrix * (in + offset) instead? Would be nice to spell out the formula. If it helps in any way, I have written a disambiguation.


Other observations:

mm2 commented 9 months ago

Thanks for pointing out 02162409c42065e865bc0dacb3fc6f9b15711b05 should fix all those issues.

ppaalanen commented 9 months ago

cmsStageAllocMatrix looks good!

cmsStageAllocCLut16bit: the table layout talks about bytes, but I think it should talk about 16-bit words / shorts, right?

The wording "the dimension ... varies least/most rapidly" is not very familiar to me, but I think I understood it right. If d is the index of dimension, i_d is the element index along dimension d, n_d is the length of the dimension, e is the index of an element in the output vector, and L is the length of the output vector, then the output vector element table[f] comes from

f(i_0, i_1, ..., i_d, e) = e + L * (i_0 + i_1 * n_1 + i_2 * n_1 * n_2 + i_3 * n_1 * n_2 * n_3 + ... )
= e + L * (i_0 + n_1 * (i_1 + n_2 * (i_2 + n_3 * (i_3 + ... ))))

I presume all cmsStageAllocCLut*() functions follow the same pattern, only table element type changes, but that would be nice to mention.

_cmsMAT3per(): are you sure it computes r = a * b? It seems to compute r = b * a to me.

_cmsMAT3solve() doc looks good, though I did not check the code matches.

_cmsMAT3eval() seems to be computing r = a^T * v, not r = a * v?