hhkyhhxbc opened 1 month ago
As a "look-up" process, how is a LUT differentiable (back-propagatable)?
Take the most commonly used trilinear interpolation for example: the output value of a specific pixel is a linear combination of the values at the 8 adjacent LUT vertices, so the gradient with respect to each vertex is simply its linear weight.
You can refer to the CUDA implementation of trilinear interpolation, where lines 241-280
indicate how the gradient is computed and lines 197-239
show how the weights are generated.
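To make the "gradient is just the weight" point concrete, here is a minimal NumPy sketch (not the repo's CUDA code) of trilinear interpolation inside a single LUT cell. Because the output is linear in the 8 vertex values, the backward pass just hands each vertex its own weight:

```python
import numpy as np

# Trilinear interpolation at one query point inside a single LUT cell.
# fx, fy, fz are the fractional offsets within the cell; the 8 weights
# multiply the 8 surrounding vertices, so d(out)/d(vertex_i) = w_i.
def trilinear_weights(fx, fy, fz):
    return np.array([
        (1 - fz) * (1 - fy) * (1 - fx),  # vertex (0, 0, 0)
        (1 - fz) * (1 - fy) * fx,        # vertex (0, 0, 1)
        (1 - fz) * fy * (1 - fx),        # vertex (0, 1, 0)
        (1 - fz) * fy * fx,              # vertex (0, 1, 1)
        fz * (1 - fy) * (1 - fx),        # vertex (1, 0, 0)
        fz * (1 - fy) * fx,              # vertex (1, 0, 1)
        fz * fy * (1 - fx),              # vertex (1, 1, 0)
        fz * fy * fx,                    # vertex (1, 1, 1)
    ])

w = trilinear_weights(0.25, 0.5, 0.75)
vertices = np.arange(8.0)   # toy values for the 8 surrounding LUT entries
out = w @ vertices          # forward pass: linear in the vertex values
grad = w                    # backward pass: gradient w.r.t. the vertices
print(w.sum())              # the 8 weights always sum to 1.0
```

Since the forward pass is a dot product between fixed weights and the learnable LUT entries, back-propagation through the look-up is just this dot product's gradient.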
Of course, torch.nn.functional.grid_sample
in PyTorch implements the same thing (see here); you can also refer to its code to see how a 'LUT is differentiable'.
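As a quick sanity check of the grid_sample route (assuming PyTorch is installed; the 2x2x2 "LUT" and the query point are made up for illustration), autograd recovers exactly the trilinear weights:

```python
import torch
import torch.nn.functional as F

# A tiny one-channel 2x2x2 "LUT"; grid_sample treats it as a 3D volume
# of shape (N, C, D, H, W). Vertex (d, h, w) stores 4d + 2h + w.
lut = torch.arange(8, dtype=torch.float32).reshape(1, 1, 2, 2, 2)
lut.requires_grad_(True)

# One query point at normalized coords (x, y, z) = (0.5, -0.5, 0.0).
# Grid shape is (N, D_out, H_out, W_out, 3), coords in [-1, 1].
grid = torch.tensor([[[[[0.5, -0.5, 0.0]]]]])

# mode='bilinear' on 5-D input performs trilinear interpolation.
out = F.grid_sample(lut, grid, mode='bilinear', align_corners=True)
out.sum().backward()

print(out.item())            # exact for this multilinear toy data: 3.25
print(lut.grad.sum().item()) # gradients are the weights, which sum to 1.0
```

Each entry of `lut.grad` is the trilinear weight of the corresponding vertex, which is exactly what the CUDA backward kernel computes by hand.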
The weights themselves are derived by solving linear equations. Other interpolation schemes, e.g. tetrahedral interpolation, derive their weights differently from trilinear interpolation; see eq. 2 & 3
in Tetrahedral Interpolation on Regular Grids.
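For comparison, here is a hedged sketch of the tetrahedral variant (a hypothetical helper, not the paper's code): the cube is split into six tetrahedra by the ordering of the fractional coordinates, so only 4 vertices get nonzero weights, and the weights are differences of the sorted fractions:

```python
# Tetrahedral interpolation inside one LUT cell: pick the tetrahedron
# from the ordering of (fx, fy, fz), then weight 4 vertices instead of
# the 8 used by trilinear interpolation.
def tetrahedral(cube, fx, fy, fz):
    """cube[z][y][x] holds the 8 vertex values; fx, fy, fz in [0, 1]."""
    # Walk from corner (0,0,0) toward (1,1,1), stepping along the axis
    # with the largest fractional coordinate first.
    order = sorted([(fx, (0, 0, 1)), (fy, (0, 1, 0)), (fz, (1, 0, 0))],
                   reverse=True)
    corners = [(0, 0, 0)]
    for _, step in order:
        corners.append(tuple(c + s for c, s in zip(corners[-1], step)))
    # Each vertex's weight is a difference of consecutive sorted fractions.
    fracs = [1.0] + [f for f, _ in order] + [0.0]
    value = 0.0
    for i, (z, y, x) in enumerate(corners):
        value += (fracs[i] - fracs[i + 1]) * cube[z][y][x]
    return value

# Affine data is reproduced exactly, just as with trilinear interpolation.
cube = [[[x + 2 * y + 4 * z for x in (0, 1)] for y in (0, 1)] for z in (0, 1)]
print(tetrahedral(cube, 0.3, 0.6, 0.9))  # ~5.1
```

The forward pass is again a fixed weighted sum of LUT vertices, so the gradient with respect to each of the 4 touched vertices is its weight, exactly as in the trilinear case.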
Hoping someone can clear up this confusion of mine ^—^