Traverse-Research / ispc-downsampler

Image downsampler using a Lanczos filter implemented in ISPC
Other
11 stars 1 forks source link

Perform `degamma` and `gamma` conversions on user request #32

Open MarijnS95 opened 1 year ago

MarijnS95 commented 1 year ago

Fixes #25

This crate can't assume that the input and output is linear, nor did it correct for that in the test example where the output from stb_image is clearly nonlinear (it doesn't state this in the "docs", but is visible from not linearizing JPG and PNG inputs and applying a gamma of 1/2.2 when converting HDR to LDR). While we could request users to pre- correct for this and return linear output to them, it is more efficient to do it within the downsampling algorithm that already runs over all the pixels, and (more importantly!) requiring these parameters in the input forces the caller to think about it.

Unfortunately this has a massive performance regression of 150% (±40ms to ±107ms):

Downsample `square_test.png` using ispc_downsampler
                    time:   [106.94 ms 107.05 ms 107.18 ms]
                    change: [+146.31% +146.79% +147.27%] (p = 0.00 < 0.05)
                    Performance has regressed.

Will need to dissect if this is caused by the pointer refactor or purely the extra ALU overhead.

MarijnS95 commented 1 year ago

Will need to dissect if this is caused by the pointer refactor or purely the extra ALU overhead.

The pointer refactor and additional if checks only seem to account for ±5ms, the rest is purely pow() overhead. And that makes sense: I made the remark above that it "is more efficient to do it within the downsampling algorithm that already runs over all the pixels", but this algorithm visits and applies gamma correction to every pixel so many times with the kernel that it becomes much more inefficient.

The alternatives are either preprocessing the input (but this doesn't seem to be too fast either... :confused:) or simply documenting that the input and output will be linearized and let the user deal with this (if they need to). (We should still have this as a separate step to make our bench and test do the correct thing)

MarijnS95 commented 4 months ago

PR #48 seems to have added unorm and snorm texture formats in addition to the already existing Srgb(a) format. It is yet unused, but that will be the right enum to key the request for (de)linearization off going forward!

MarijnS95 commented 4 months ago

In that sense I do think that this is more so a bug now in that we expose the Srgb(a) vs Rgb(a)U/Snorm capability to the caller, but don't enact on it.