AndreGuo / ITMLUT

Official PyTorch implementation of "Redistributing the Precision and Content in 3D-LUT-based Inverse Tone-mapping for HDR/WCG Display" in CVMP2023 (SIGGRAPH European Conference on Visual Media Production).
Mozilla Public License 2.0

Hello, I have had this question for a long time: since a LUT is inherently a look-up process, why is it differentiable (i.e. why can gradients back-propagate through it)? #3

Open hhkyhhxbc opened 1 month ago

hhkyhhxbc commented 1 month ago

Hoping you can resolve this doubt of mine ^_^

AndreGuo commented 3 weeks ago

Q:

As a "look-up" process, how LUT is differentiable (back-propagatable)?

A:

Take the most commonly used trilinear interpolation as an example: the output value of a given pixel is a linear combination of the values stored at the 8 adjacent LUT vertices, so the gradient with respect to each vertex is simply its interpolation weight.
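A minimal sketch of this idea (the 2×2×2 LUT size and the query point are illustrative, not from the repo): interpolate one point trilinearly from a LUT whose vertex values require gradients, then check that each vertex's gradient is exactly its interpolation weight.

```python
import torch

# Illustrative example: a tiny 2x2x2 LUT with learnable vertex values.
lut = torch.arange(8, dtype=torch.float32).reshape(2, 2, 2).requires_grad_(True)

# One query point inside the cell, with fractional coordinates (r, g, b).
r, g, b = 0.25, 0.50, 0.75

# Trilinear interpolation: a linear combination of the 8 vertices,
# each weighted by the product of per-axis fractional distances.
out = 0.0
for i in (0, 1):
    for j in (0, 1):
        for k in (0, 1):
            w = ((r if i else 1 - r)
                 * (g if j else 1 - g)
                 * (b if k else 1 - b))
            out = out + w * lut[i, j, k]

out.backward()
# d(out)/d(lut[i,j,k]) is exactly the interpolation weight w,
# e.g. grad at vertex (0,0,0) is (1-r)(1-g)(1-b) = 0.09375.
print(lut.grad)
```

Since the output is linear in the vertex values, the backward pass needs no approximation: the weights themselves are the gradients, which is why a LUT is trainable end-to-end.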

You can refer to the CUDA implementation of trilinear interpolation, where lines 241-280 show how the gradient is computed and lines 197-239 show how the weights are generated.

Of course, torch.nn.functional.grid_sample in PyTorch implements the same thing (see here); you can also refer to its source code to see how a 'LUT is differentiable'.
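A hedged usage sketch of grid_sample for this purpose (the 17³ LUT size and batch shapes below are just illustrative): when the input is 5D, mode='bilinear' actually performs trilinear interpolation, and gradients flow back into the LUT entries.

```python
import torch
import torch.nn.functional as F

# Treat a 3D LUT as an (N, C, D, H, W) volume: 3 output channels,
# a 17^3 grid (a common 3D-LUT resolution, chosen here for illustration).
lut = torch.randn(1, 3, 17, 17, 17, requires_grad=True)

# A batch of query pixels; grid coordinates are normalized to [-1, 1].
grid = torch.rand(1, 1, 1, 4096, 3) * 2 - 1

# For 5D inputs, mode='bilinear' means trilinear interpolation.
out = F.grid_sample(lut, grid, mode='bilinear', align_corners=True)

# Gradients reach every sampled LUT vertex, so the LUT is trainable.
out.sum().backward()
print(lut.grad.shape)  # same shape as the LUT itself
```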

And the weights are derived by solving linear equations. Other interpolation schemes, e.g. tetrahedral interpolation, derive their weights differently from trilinear interpolation; see eq. 2 & 3 in Tetrahedral Interpolation on Regular Grids.
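For intuition, a hedged sketch of tetrahedral interpolation for one ordering of the fractional coordinates (the vertex path below is the standard r ≥ g ≥ b case; the full derivation of the weights is the linear system in the paper's eq. 2 & 3):

```python
import torch

# Four cell vertices on the tetrahedron for the case r >= g >= b
# (other orderings of r, g, b select a different vertex path).
v = {k: torch.tensor(float(i), requires_grad=True)
     for i, k in enumerate(['000', '100', '110', '111'])}

r, g, b = 0.8, 0.5, 0.2  # fractional coordinates, r >= g >= b

# For this ordering the solved weights reduce to successive differences.
out = ((1 - r) * v['000'] + (r - g) * v['100']
       + (g - b) * v['110'] + b * v['111'])

out.backward()
weights = [v[k].grad.item() for k in ['000', '100', '110', '111']]
print(weights)  # the four interpolation weights, summing to 1
```

As with trilinear interpolation, the output is linear in the four vertex values, so the gradient at each vertex is again exactly its weight; only the way the weights are derived differs.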
