nkolot / GraphCMR

Repository for the paper "Convolutional Mesh Regression for Single-Image Human Shape Reconstruction"
BSD 3-Clause "New" or "Revised" License

What's the purpose of this processing on rotation matrix? #17

Closed Maqingyang closed 5 years ago

Maqingyang commented 5 years ago

Sorry, I'm new to the 3D reconstruction community. I can't figure out what the following code in class SMPLParamRegressor() is intended to do.

    def forward(self, x):
        """Forward pass.
        Input:
            x: size = (B, 1723*6)
        Returns:
            SMPL pose parameters as rotation matrices: size = (B,24,3,3)
            SMPL shape parameters: size = (B,10)
        """
        batch_size = x.shape[0]
        x = x.view(batch_size, -1)
        x = self.layers(x)
        rotmat = x[:, :24*3*3].view(-1, 24, 3, 3).contiguous()
        betas = x[:, 24*3*3:].contiguous()
        rotmat = rotmat.view(-1, 3, 3).contiguous()
        orig_device = rotmat.device
        if self.use_cpu_svd:
            rotmat = rotmat.cpu()
        U, S, V = batch_svd(rotmat)

        rotmat = torch.matmul(U, V.transpose(1,2))
        det = torch.zeros(rotmat.shape[0], 1, 1).to(rotmat.device)
        with torch.no_grad():
            for i in range(rotmat.shape[0]):
                det[i] = torch.det(rotmat[i])
        rotmat = rotmat * det
        rotmat = rotmat.view(batch_size, 24, 3, 3)
        rotmat = rotmat.to(orig_device)
        return rotmat, betas

Could you briefly explain the purpose of using svd and det? What changes are made to the rotation matrix by this processing? And is it common practice to do this with the SMPL model? It seems that HMR by Kanazawa didn't process the rotation matrix like this.

nkolot commented 5 years ago

HMR uses the axis-angle representation for rotations, i.e. it regresses a 3-dimensional axis-angle vector for each joint, which is then converted to a 3x3 rotation matrix inside the SMPL model using the Rodrigues formula.
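For readers unfamiliar with the Rodrigues formula, here is a minimal NumPy sketch of the axis-angle to rotation matrix conversion. It is only an illustration: the function name `rodrigues` is hypothetical and this is not the code used inside SMPL or HMR.

```python
import numpy as np

def rodrigues(axis_angle):
    """Convert one axis-angle 3-vector into a 3x3 rotation matrix.

    The direction of the vector is the rotation axis and its norm is
    the rotation angle in radians. (Hypothetical helper for illustration.)
    """
    axis_angle = np.asarray(axis_angle, dtype=np.float64)
    theta = np.linalg.norm(axis_angle)
    if theta < 1e-8:
        # Near-zero rotation: return the identity to avoid dividing by zero.
        return np.eye(3)
    k = axis_angle / theta
    # Skew-symmetric cross-product matrix of the unit axis k.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    # Rodrigues formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# A 90 degree rotation about the z axis.
R = rodrigues([0.0, 0.0, np.pi / 2])
```

Any 3-vector passed through this function yields a valid rotation matrix, which is why an axis-angle regressor needs no projection step.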

Here we are regressing the full 3x3 rotation matrix, but the regressed matrix won't necessarily be a valid rotation matrix. So we do an extra step of "projecting" it to the manifold of rotation matrices. The process that we are doing here is one of the possible ways of achieving this.
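The projection step can be sketched in NumPy as follows. This mirrors the logic of the quoted `forward` method but is not the repository code: `project_to_rotation` is a hypothetical name, and `np.linalg.svd` returns the already transposed factor `Vt`, whereas `batch_svd` in the snippet above returns `V`.

```python
import numpy as np

def project_to_rotation(mats):
    """Map arbitrary 3x3 matrices to valid rotation matrices.

    mats: array of shape (N, 3, 3)
    returns: array of shape (N, 3, 3), each orthogonal with determinant +1
    """
    # SVD: M = U diag(S) Vt. Dropping S gives the closest orthogonal matrix.
    U, _, Vt = np.linalg.svd(mats)
    R = U @ Vt
    # R is orthogonal, so det(R) is +1 (rotation) or -1 (reflection).
    det = np.linalg.det(R)
    # For 3x3 matrices det(-R) = -det(R), so multiplying by the
    # determinant turns a reflection into a proper rotation.
    return R * det[:, None, None]

# 24 random "regressed" matrices, one per SMPL joint.
rng = np.random.default_rng(0)
raw = rng.normal(size=(24, 3, 3))
R = project_to_rotation(raw)
```

So the SVD makes the matrix orthogonal, and the determinant fixes its sign. A common alternative for the sign fix is to flip only the last column of `U` when the determinant is negative, which gives the nearest rotation in the Frobenius norm.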

Maqingyang commented 5 years ago

Thanks for your fast reply! It's very informative. Regressing the axis-angle vector instead of the rotation matrix sounds like the natural choice to me. The axis-angle vector is always valid, which seems superior to the rotation matrix, because the rotation matrix needs an extra step to make it valid. So why did you choose the rotation matrix in your paper? Was it based on a prior assumption, or did you try both and find that the rotation matrix works better? I'm curious about the insight behind this choice. BTW, you really did a good job, and the approach is so different from previous work!

nkolot commented 5 years ago

Although it might seem simpler, regressing axis-angle representations (or, e.g., Euler angles) is very challenging.

Other works (e.g. Pavlakos et al. CVPR2018) used axis-angle but applied the loss on the rotation matrices instead of the axis-angle representations, because distance in axis-angle space is not always a good measure of the distance between the rotation matrices.
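A small example of this mismatch, as a sketch: a rotation by pi about the z axis and a rotation by pi about the negative z axis are the same rotation, yet their axis-angle vectors are 2*pi apart. The helper `rodrigues` is a hypothetical illustration, not code from any of the cited works.

```python
import numpy as np

def rodrigues(v):
    """Axis-angle 3-vector to 3x3 rotation matrix (illustrative helper)."""
    v = np.asarray(v, dtype=np.float64)
    theta = np.linalg.norm(v)
    if theta < 1e-8:
        return np.eye(3)
    k = v / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

a = np.array([0.0, 0.0, np.pi])   # rotate by pi about +z
b = -a                            # rotate by pi about -z

# Distance between the two parameter vectors: 2*pi.
dist_axis_angle = np.linalg.norm(a - b)

# Distance between the rotations they encode: zero up to rounding error.
dist_rotmat = np.linalg.norm(rodrigues(a) - rodrigues(b))
```

A loss on the axis-angle vectors would heavily penalize a prediction of `b` when the target is `a`, even though the resulting pose is identical. A loss on the rotation matrices does not have this problem.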

I tried both at some point and regressing the full 3x3 rotation matrix performed significantly better.

I would suggest reading this paper that discusses the problem of regressing 3D rotations in detail. It proposes a 6D representation of 3x3 rotations that is continuous. The hypothesis is that continuous functions are easier to regress with neural networks. I've used this 6D representation in other works and it seems to work really well.
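The link to the paper is not preserved here. Assuming the 6D representation meant is the one from Zhou et al., "On the Continuity of Rotation Representations in Neural Networks" (CVPR 2019), the mapping from 6 numbers to a rotation matrix is a Gram-Schmidt orthogonalization of two 3-vectors. A minimal sketch follows; the function name and the memory layout of the 6 values (two consecutive 3-vectors) are assumptions made for illustration.

```python
import numpy as np

def rot6d_to_rotmat(x):
    """Convert 6D vectors to rotation matrices via Gram-Schmidt.

    x: array of shape (N, 6), read as two 3-vectors a1, a2 per row
       (this layout is an assumption of this sketch).
    returns: array of shape (N, 3, 3) whose columns are b1, b2, b3.
    """
    x = np.asarray(x, dtype=np.float64).reshape(-1, 2, 3)
    a1, a2 = x[:, 0], x[:, 1]
    # First basis vector: normalize a1.
    b1 = a1 / np.linalg.norm(a1, axis=-1, keepdims=True)
    # Second basis vector: remove the b1 component from a2, then normalize.
    a2_perp = a2 - np.sum(b1 * a2, axis=-1, keepdims=True) * b1
    b2 = a2_perp / np.linalg.norm(a2_perp, axis=-1, keepdims=True)
    # Third basis vector: the cross product completes a right-handed frame.
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=-1)

# 24 random 6D outputs, one per joint.
rng = np.random.default_rng(0)
R = rot6d_to_rotmat(rng.normal(size=(24, 6)))
```

Unlike the SVD projection, this needs no decomposition, and the result is a proper rotation by construction as long as the two input vectors are not parallel.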

Maqingyang commented 5 years ago

Thanks! I will read more about it!