Question about the rotation matrix in the paper and code.

Pbb-land commented 3 months ago

Hello, thanks for your great work! I noticed that your rotation matrix seems to have an additional transpose compared to the original 3DGS. The covariance matrix in your paper is R^T SSR, compared to RSSR^T in the original 3DGS. Also, in the prepare_scaling_rot function of the released code, you transpose the rotation matrix after obtaining R by stacking the vectors [v0, v1, v2] as described in your paper, and then set the transposed matrix as gs._rotation. So I'm wondering why you transpose the rotation matrix. Does this provide further gains compared to the original 3DGS?

Another question is, how did you optimize both the mesh vertices and the GS attributes? In the code, both rotation and scaling depend on the vertices, and the code does not use detach to cut off the gradient during computation. Does this affect the position of the vertices during backpropagation so that the mesh is not smooth or shifted from the expected position?

waczjoan commented 3 months ago

Hi, Thank you!

I'll begin by addressing the latter part of your query:

The key point to note is that both the scale and rotation are contingent upon the vertices, derived directly from them.

Now, let's delve into various scenarios and considerations:

Generally, if you possess an original mesh and wish to maintain it unchanged, one option is to set vertices_lr to 0. Consequently, the vertices remain stationary, unaffected by any gradients.
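As a rough illustration of the frozen-mesh option, here is a minimal PyTorch sketch. Only the name vertices_lr comes from the discussion above; the toy tensors, the Adam optimizer, the parameter-group layout, and the loss are assumptions made for this example and are not taken from the repository.

```python
import torch

# Toy stand-ins (hypothetical): one triangle and one extra Gaussian attribute.
vertices = torch.nn.Parameter(torch.tensor([[0.0, 0.0, 0.0],
                                            [1.0, 0.0, 0.0],
                                            [0.0, 1.0, 0.0]]))
opacity = torch.nn.Parameter(torch.zeros(1))

vertices_lr = 0.0  # a zero learning rate keeps the mesh fixed

# One parameter group per attribute, each with its own learning rate.
optimizer = torch.optim.Adam([
    {"params": [vertices], "lr": vertices_lr, "name": "vertices"},
    {"params": [opacity], "lr": 0.05, "name": "opacity"},
])

vertices_before = vertices.detach().clone()

# Dummy loss that depends on both parameters.
loss = (vertices.sum() + opacity.sum()) ** 2
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Gradients are still computed for the vertices, but the update is zero.
vertices_unchanged = torch.equal(vertices.detach(), vertices_before)
opacity_changed = opacity.item() != 0.0
```

Note that in this sketch the vertices still receive gradients; they simply are not applied because the step size is zero.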

However, in our research, we consider additional scenarios in which a mesh is absent:

Firstly, when we have an initial mesh, such as one from the FLAME model (a broadly applicable face mesh), we want the vertices to move so that the mesh aligns with the object. Similarly, when no initial mesh is available, we may try to estimate one, which is the focus of other studies. A straightforward approach is to train GS for a specified number of iterations and then construct a mesh from the Gaussian positions. For further details, please refer to: link

In this situation, when there is no pre-existing mesh or (for example) FLAME model available, but there is a mesh estimate, we aim to allow as much movement of the mesh as feasible, in the belief that this will result in a better fit.

So.... Yes, depending on the situation, we allow modification of the mesh during training. Hence, we decided not to detach any vectors. I think we could have split the code into several of these scenarios in the implementation, but after discussion we decided to leave it that way.
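To make the effect of detach concrete, here is a small PyTorch sketch on a single toy triangle. Everything in it is hypothetical: toy_loss, the choice of an edge length as the "scale", and the target values are stand-ins chosen for illustration and do not reproduce the released code.

```python
import torch

# One toy triangle whose vertices are being optimized (hypothetical values).
vertices = torch.tensor([[0.0, 0.0, 0.0],
                         [1.0, 0.0, 0.0],
                         [0.0, 2.0, 0.0]], requires_grad=True)

def toy_loss(v, detach_shape):
    # Gaussian center derived from the vertices (always differentiable here).
    center = v.mean(dim=0)
    # Stand-in shape attribute derived from the vertices: one edge length.
    src = v.detach() if detach_shape else v
    scale = torch.linalg.norm(src[1] - src[0])
    target_center = torch.tensor([1.0, 1.0, 0.0])
    return ((center - target_center) ** 2).sum() + (scale - 2.0) ** 2

# Without detach: the shape term also pulls on the vertices.
grad_attached = torch.autograd.grad(toy_loss(vertices, False), vertices)[0]
# With detach: only the center term contributes to the vertex gradient.
grad_detached = torch.autograd.grad(toy_loss(vertices, True), vertices)[0]

# Vertices 0 and 1 define the edge, so only their gradients differ.
shape_term_moves_vertices = not torch.allclose(grad_attached, grad_detached)
```

In this sketch, detaching removes the shape-related pull on the vertices but leaves the position-related gradient intact; whether that is desirable depends on which of the scenarios above applies.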

The additional rotation transpose is (only) a matter of implementation. Try commenting it out; you should get an effect similar to the one shown in image a): ship_rotation

Pbb-land commented 3 months ago

Thank you for responding so quickly!

For the latter question, I am trying to reconstruct dynamic human performers using the GaMeS representation, so I have an initial mesh and would like to optimize this mesh along with the Gaussians (like smplx+d). In this case, would it be better to add the detach operation if I want my mesh to be smooth and regular and to reduce artifacts when driving the reconstructed model?

For the rotation, I tried commenting it out and got a worse result. This makes me really confused about which of R and R^T is the real rotation matrix (where R=[v0,v1,v2] as in the code). Since I'm trying to use the method in PhysGaussian[1] to correct the view direction while driving, I'm trying to figure out the geometric meaning of R=[v0,v1,v2] and why using R^T SSR as the covariance matrix would work better than RSSR^T.

[1] PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics

piotr310100 commented 3 months ago

Hi, you are right: there is an oversight in our paper, which states that the covariance matrix is equal to R^T SSR. It should be the same as in 3DGS, i.e. RSSR^T. The formula used inside the code is the same as in 3DGS. Here is how to look at it: in the covariance formula, the matrix on the left consists of v0, v1, v2 as column vectors, and the matrix on the right consists of v0, v1, v2 as row vectors. With the notation from our paper, the matrix R consists of row vectors, and with the notation from 3DGS, of column vectors, but the covariance matrix is the same in both cases.
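Written out as a short sketch, assuming v0, v1, v2 are orthonormal and S = diag(s0, s1, s2) (the names R_col and R_row are introduced here only for illustration):

```latex
R_{\mathrm{col}} = \begin{bmatrix} v_0 & v_1 & v_2 \end{bmatrix},
\qquad
R_{\mathrm{row}} = R_{\mathrm{col}}^{\top} =
\begin{bmatrix} v_0^{\top} \\ v_1^{\top} \\ v_2^{\top} \end{bmatrix}

\Sigma
= R_{\mathrm{col}} \, S S \, R_{\mathrm{col}}^{\top}
= R_{\mathrm{row}}^{\top} \, S S \, R_{\mathrm{row}}
= \sum_{i=0}^{2} s_i^{2} \, v_i v_i^{\top}
```

The last form shows that each v_i is a principal axis of the Gaussian with variance s_i^2 along it, regardless of which of the two notations is used.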

To sum up: the real rotation matrix in the 3DGS notation, which is most likely the one you want, is R=[v0,v1,v2], where v0, v1, v2 are column vectors calculated as stated in the paper; hence the transposition of the last two dimensions inside the code. The real covariance matrix is then decomposed as Cov=RSSR^T.
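A quick numerical check of this, as a dependency-free Python sketch. The orthonormal frame (a 30 degree rotation about z), the scales, and all helper names are hypothetical values chosen for illustration; they are not taken from the paper or the code.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

# Hypothetical orthonormal frame: the standard basis rotated 30 degrees about z.
c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
v0, v1, v2 = [c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]

R_rows = [v0, v1, v2]        # v_i stacked as rows (paper notation)
R_cols = transpose(R_rows)   # v_i as columns (3DGS notation)

S = [[3.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
SS = matmul(S, S)

# Cov = R S S R^T with column vectors (3DGS form).
cov_3dgs = matmul(matmul(R_cols, SS), transpose(R_cols))
# Cov = R^T S S R with row vectors (form written in the paper).
cov_paper = matmul(matmul(transpose(R_rows), SS), R_rows)
# Row-stacked matrix used directly in the 3DGS form, i.e. the transpose dropped.
cov_swapped = matmul(matmul(R_rows, SS), transpose(R_rows))

same_covariance = close(cov_3dgs, cov_paper)
dropping_transpose_changes_it = not close(cov_3dgs, cov_swapped)

# v0 is a principal axis of the Gaussian with variance s0^2 = 9.
axis_image = matvec(cov_3dgs, v0)
```

In this sketch the two notations give the same covariance, while plugging the row-stacked matrix into the 3DGS formula rotates the ellipsoid the wrong way, which is consistent with the worse result reported above when the transpose is commented out.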

Sorry for the confusion.

waczjoan / gaussian-mesh-splatting

Question about the rotation matrix in the paper and code. #7