lixiny / bihand

✌🏻 BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks [BMVC 2020]
GNU General Public License v3.0

Question about 3D heatmap normalization. #13

Closed (imabackstabber closed 2 years ago)

imabackstabber commented 2 years ago

Great work, and the code is clear! However, I have a question about 3D heatmap normalization. The paper (page 6, just below Formula 2) says "∑_(u,v,d) H_3d(u,v,d) = 1", and I wonder why, because that could make the values of u, v, d odd (not spanning [0,1] but a more compressed range). The relevant code (net_upstream.py) with my comments is below. Your reply would be highly appreciated!

import torch
import torch.nn as nn


class IntegralPose(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, hm3d):
        """Integrate the 3D heatmap into uvd coordinates (BiHand).

        Arguments:
            hm3d {tensor (B, 21, D, H, W)}

        Returns:
            uvd {tensor (B, 21, 3)}
        """

        # Marginalize the heatmap onto each axis: (B, 21, D), (B, 21, H), (B, 21, W)
        d_accu = torch.sum(hm3d, dim=[3, 4])
        v_accu = torch.sum(hm3d, dim=[2, 4])
        u_accu = torch.sum(hm3d, dim=[2, 3])

        weightd = torch.arange(d_accu.shape[-1], dtype=d_accu.dtype, device=d_accu.device) / d_accu.shape[-1]
        # e.g. [0/32, 1/32, ..., 31/32], representing normalized positions
        weightv = torch.arange(v_accu.shape[-1], dtype=v_accu.dtype, device=v_accu.device) / v_accu.shape[-1]
        weightu = torch.arange(u_accu.shape[-1], dtype=u_accu.dtype, device=u_accu.device) / u_accu.shape[-1]

        d_ = d_accu.mul(weightd)  # element-wise product, so d_ becomes the weighted d positions
        # e.g. [0.1 * 0/32, 0.2 * 1/32, ..., 0.1 * 31/32]
        # This complies with the paper's 'Sigma_{u,v,d} Heatmap(j)(u,v,d) = 1'.
        # However, I think it should be normalized with respect to each dimension,
        # because u, v, d should be independent (or not?); only then can we make sure
        # that u, v, d each span a value range like [0,1], i.e. are normalized.
        v_ = v_accu.mul(weightv)
        u_ = u_accu.mul(weightu)

        d_ = torch.sum(d_, dim=-1, keepdim=True)
        v_ = torch.sum(v_, dim=-1, keepdim=True)
        u_ = torch.sum(u_, dim=-1, keepdim=True)

        uvd = torch.cat([u_, v_, d_], dim=-1)
        return uvd
imabackstabber commented 2 years ago

I overlooked torch.sum(); my fault. Case solved.
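
For later readers, the reasoning behind this resolution can be written out as a short derivation. It is only a sketch, assuming the heatmap is non-negative and normalized over the whole volume as the paper states, with D depth bins indexed from 0 as in the code above. The same argument applies to u and v.

```latex
\[
\sum_{u,v,d} H_{3d}(u,v,d) = 1,\quad H_{3d} \ge 0
\;\;\Longrightarrow\;\;
p(d) := \sum_{u,v} H_{3d}(u,v,d),\qquad \sum_{d=0}^{D-1} p(d) = 1,
\]
\[
\hat{d} \;=\; \sum_{d=0}^{D-1} \frac{d}{D}\, p(d) \;\in\; \left[\,0,\ \frac{D-1}{D}\,\right].
\]
```

So the joint normalization already makes each marginal sum to 1, and the final torch.sum over the weighted bins yields a coordinate that can reach any value in that range.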

lixiny commented 2 years ago

Thanks for your comments! Based on your description, I guess that you want to integrate the heatmap along each dimension separately. In that case, the 3D heatmap degenerates to 1D lixels (see I2L-MeshNet).

In this paper, we follow the soft-argmax operation in Integral Human Pose Regression, which suggests that the dimensions should not be treated as independent. Hope this helps.
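
The joint soft-argmax idea described above can be illustrated with a minimal sketch. It is written in plain Python rather than PyTorch so that it runs standalone; `joint_softmax` and `integral_uvd` are hypothetical helper names, and using a softmax to normalize is an assumption (the thread only states that the heatmap sums to 1). The heatmap is indexed as `hm[d][v][u]`, mirroring the (D, H, W) layout in the code quoted earlier.

```python
import math
import random


def joint_softmax(logits):
    """Softmax over all D*H*W cells, so the whole volume sums to 1 (assumed normalization)."""
    flat = [x for plane in logits for row in plane for x in row]
    m = max(flat)  # subtract the max for numerical stability
    total = sum(math.exp(x - m) for x in flat)
    return [[[math.exp(x - m) / total for x in row] for row in plane] for plane in logits]


def integral_uvd(hm):
    """Expected normalized (u, v, d) of a heatmap indexed as hm[d][v][u]."""
    D, H, W = len(hm), len(hm[0]), len(hm[0][0])
    u = v = d = 0.0
    for di in range(D):
        for vi in range(H):
            for ui in range(W):
                p = hm[di][vi][ui]
                u += p * ui / W
                v += p * vi / H
                d += p * di / D
    return u, v, d


random.seed(0)
D, H, W = 4, 5, 6

# A random heatmap, normalized jointly over the whole volume.
logits = [[[random.gauss(0, 1) for _ in range(W)] for _ in range(H)] for _ in range(D)]
hm = joint_softmax(logits)
total = sum(x for plane in hm for row in plane for x in row)
u, v, d = integral_uvd(hm)

# A heatmap with all mass in one cell recovers that cell's normalized position.
peak = [[[0.0] * W for _ in range(H)] for _ in range(D)]
peak[3][2][1] = 1.0
peak_uvd = integral_uvd(peak)

print(total, (u, v, d), peak_uvd)
```

Although a single normalization is shared by all three axes, each coordinate still ranges over [0, 1): a fully peaked heatmap returns the normalized index of its peak along every axis.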

imabackstabber commented 2 years ago

Thanks for your reply and help!