Zerg-Overmind / GaussianFlow

GaussianFlow: Splatting Gaussian Dynamics for 4D Content Creation
https://zerg-overmind.github.io/GaussianFlow.github.io/

The gradients of means3D are all zero. #4

Closed ZezhouCheng closed 4 months ago

ZezhouCheng commented 4 months ago

Thank you very much for sharing the implementation! It is very useful. I found one issue when training the Gaussian splatting model with the optical flow regression loss: it seems that the gradients of means3D are all zero.

rendervar = {
    'means3D': means3D,
    'colors_precomp': rgb_colors,
    'rotations': rotations,
    'opacities': torch.sigmoid(logit_opacities),
    'scales': torch.exp(log_scales),
    'means2D': torch.zeros_like(means3D, requires_grad=True, device="cuda")
}
im, radii, depth, alpha, means2D, conic2D, conic2D_inv, gs_per_pixel, weight_per_gs_pixel, x_mu = Renderer(raster_settings=camera_params)(**rendervar)

If we run

loss = torch.sum(im)
loss.backward()
print(means3D.grad.sum())

the output is 22097.9277, which is the sum of all the gradients of means3D.

However, if we run

loss = torch.sum(means2D)
loss.backward()
print(means3D.grad.sum())

the output is 0. This is also true for other variables such as conic2D, conic2D_inv, gs_per_pixel, weight_per_gs_pixel, x_mu.
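
For reference, each output can be probed separately with torch.autograd.grad. This is just a sketch assuming the render call above (with torch already imported and means3D requiring grad), not code from the repo:

# Probe which renderer outputs carry gradient back to means3D.
for name, out in [('im', im), ('means2D', means2D), ('conic2D', conic2D),
                  ('weight_per_gs_pixel', weight_per_gs_pixel)]:
    if not out.requires_grad:
        print(f'{name}: does not require grad')
        continue
    (g,) = torch.autograd.grad(out.sum(), means3D,
                               retain_graph=True, allow_unused=True)
    print(f'{name}:', None if g is None else g.abs().sum().item())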

Do you have any suggestions on this issue?

Zerg-Overmind commented 4 months ago

Hi, I'm also replying here in case someone cannot see our emails. It is expected that the gradient flowing from conic2D, conic2D_inv, gs_per_pixel, and weight_per_gs_pixel back to means3D is 0: conic2D and conic2D_inv are the 2D covariance matrix and its inverse, which depend only on rotation and scaling, not on translation/location like means3D; gs_per_pixel is a tensor of indices, so it carries no gradient to any other variable; and weight_per_gs_pixel holds the alpha*T of each tracked Gaussian, so it only relates to opacity and the like, not to the Gaussian dynamics (translation, rotation, and scaling).
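
As a toy illustration of that argument (my own example, not the rasterizer's actual backward): a quantity built only from a rotation and per-axis scales carries no gradient back to a mean/translation parameter.

import torch

mean2 = torch.zeros(2, requires_grad=True)               # stands in for a translation/location
theta = torch.tensor(0.3, requires_grad=True)            # rotation angle
scale = torch.tensor([1.5, 0.5], requires_grad=True)     # per-axis scales

R = torch.stack([torch.stack([torch.cos(theta), -torch.sin(theta)]),
                 torch.stack([torch.sin(theta),  torch.cos(theta)])])
cov2D = R @ torch.diag(scale ** 2) @ R.T                 # Sigma = R S S^T R^T, no mean involved

g = torch.autograd.grad(cov2D.sum(), mean2, allow_unused=True)[0]
print(g)   # None: the covariance never uses the mean, so no gradient flows back to it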

ZezhouCheng commented 4 months ago

This is true. It seems the gradient from proj_means_2D (dL_proj_2D) is not used here.

ZezhouCheng commented 4 months ago

Adding the following code to Line 406 (here) solves my issue:

// Backpropagate dL/d(proj_2D) to the 3D mean through the full projection
// matrix and the perspective division (same chain rule as for dL_dmean2D).
glm::vec3 dL_dmean3;
dL_dmean3.x = (proj[0] * m_w - proj[3] * mul1) * dL_proj_2D[idx*2+0] + (proj[1] * m_w - proj[3] * mul2) * dL_proj_2D[idx*2+1];
dL_dmean3.y = (proj[4] * m_w - proj[7] * mul1) * dL_proj_2D[idx*2+0] + (proj[5] * m_w - proj[7] * mul2) * dL_proj_2D[idx*2+1];
dL_dmean3.z = (proj[8] * m_w - proj[11] * mul1) * dL_proj_2D[idx*2+0] + (proj[9] * m_w - proj[11] * mul2) * dL_proj_2D[idx*2+1];
// Accumulate into the existing gradient of the 3D means rather than overwriting it.
dL_dmeans[idx] += dL_dmean3;
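
For anyone who wants to sanity-check the formula, here is a small PyTorch sketch of my own (assuming a generic row-major 4x4 full projection matrix P applied as hom = P @ [x, y, z, 1], followed by perspective division) that compares the quotient-rule gradient these lines implement against autograd:

import torch

torch.manual_seed(0)
P = torch.randn(4, 4, dtype=torch.float64)                  # stand-in full projection matrix
mean3 = torch.randn(3, dtype=torch.float64, requires_grad=True)

hom = P @ torch.cat([mean3, mean3.new_ones(1)])             # homogeneous projection
proj_2D = hom[:2] / hom[3]                                  # perspective division

dL_dproj2D = torch.tensor([0.7, -1.3], dtype=torch.float64) # arbitrary incoming gradient
(g_auto,) = torch.autograd.grad(proj_2D, mean3, grad_outputs=dL_dproj2D)

# Quotient rule: d(proj_2D[i]) / d(mean3[j]) = P[i, j] / w  -  hom[i] * P[3, j] / w^2
w = hom[3]
J = P[:2, :3] / w - torch.outer(hom[:2], P[3, :3]) / (w * w)
g_manual = J.T @ dL_dproj2D

print(torch.allclose(g_auto, g_manual))                     # True

As far as I can tell, the CUDA lines above compute the same Jacobian with proj stored in the rasterizer's flattened layout, where m_w = 1/w and mul1, mul2 correspond to hom.x/w^2 and hom.y/w^2.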
Zerg-Overmind commented 4 months ago

Hi, yes, you need to add the gradient coming directly from proj_2D to dL_dmeans, in the same way as dL_dmean2D. Actually, the variable means2D is the same as proj_2D in the implementation.

ZezhouCheng commented 4 months ago

It seems the updated code still does not use dL_proj_2D. dL_dmean0 should be added to dL_dmeans[idx].

Zerg-Overmind commented 4 months ago

Hi, which version are you referring to? It should be here