JimmySuen / integral-human-pose

Integral Human Pose Regression
MIT License

I think there is something wrong with steps in softargmax #28

Closed bucktoothsir closed 5 years ago

bucktoothsir commented 5 years ago
    heatmaps = heatmaps.reshape((heatmaps.shape[0], num_joints, z_dim, y_dim, x_dim))

    accu_x = heatmaps.sum(dim=2)
    accu_x = accu_x.sum(dim=2)
    accu_y = heatmaps.sum(dim=2)
    accu_y = accu_y.sum(dim=3)
    accu_z = heatmaps.sum(dim=3)
    accu_z = accu_z.sum(dim=3)

    accu_x = accu_x * torch.cuda.comm.broadcast(torch.arange(x_dim), devices=[accu_x.device.index])[0]
    accu_y = accu_y * torch.cuda.comm.broadcast(torch.arange(y_dim), devices=[accu_y.device.index])[0]
    accu_z = accu_z * torch.cuda.comm.broadcast(torch.arange(z_dim), devices=[accu_z.device.index])[0]

    accu_x = accu_x.sum(dim=2, keepdim=True)
    accu_y = accu_y.sum(dim=2, keepdim=True)
    accu_z = accu_z.sum(dim=2, keepdim=True)
    x = x / float(hm_width) - 0.5
    y = y / float(hm_height) - 0.5
    z = z / float(hm_depth) - 0.5

As the code above shows, `torch.arange(x_dim)` yields `[0, 1, 2, 3, ..., x_dim-1]`, so the maximum possible value of the summed `accu_x` is `x_dim-1` (and likewise for `accu_y` and `accu_z`). Dividing by `hm_width` therefore never reaches the upper bound, so `x = x / float(hm_width) - 0.5` is off by one; it should be changed to `x = x / float(hm_width-1) - 0.5`.
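A minimal 1-D NumPy sketch of the issue described above (illustrative, not the repository's code): with indices `0 .. x_dim-1`, the soft-argmax expectation tops out at `x_dim-1`, so dividing by the width never attains the upper bound 0.5.

```python
# Minimal 1-D sketch of the off-by-one (illustrative, not repository code).
import numpy as np

def soft_argmax_1d(heatmap, denom):
    """Expected index of a normalized heatmap, mapped via idx/denom - 0.5."""
    probs = heatmap / heatmap.sum()
    expected = (probs * np.arange(len(heatmap))).sum()
    return expected / float(denom) - 0.5

w = 64
peak_last = np.zeros(w)
peak_last[-1] = 1.0  # all probability mass on the last bin, index w-1

print(soft_argmax_1d(peak_last, w))      # (w-1)/w - 0.5 = 0.484375, short of 0.5
print(soft_argmax_1d(peak_last, w - 1))  # exactly 0.5 with the w-1 divisor
```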

lck1201 commented 5 years ago

@bucktoothsir Could you be more specific?

bucktoothsir commented 5 years ago

@lck1201 check my updates

lck1201 commented 5 years ago

Yes, you are correct. We will fix the problem in the future. However, we currently generate the ground-truth heatmap in a consistent way; see the function https://github.com/JimmySuen/integral-human-pose/blob/536a7b33cb87c15e1c9a3411a8cc0df87c4f9510/pytorch_projects/common_pytorch/loss/integral.py#L86 So there should not be big problems.
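A small illustration of this point, under the assumption that the ground-truth target uses the same `idx / w - 0.5` mapping as the forward pass: when both sides share the normalization, a perfect prediction matches the target exactly, so the off-by-one introduces no training-time error.

```python
# Illustration (assumption: the ground-truth target is built with the same
# idx / w - 0.5 mapping as the prediction).
import numpy as np

w = 64
j = w - 1  # true peak bin

# target built with the same normalization the forward pass uses
target = j / float(w) - 0.5

# a perfect prediction: all heatmap mass on bin j
heatmap = np.zeros(w)
heatmap[j] = 1.0
pred = (heatmap / heatmap.sum() * np.arange(w)).sum() / float(w) - 0.5

print(abs(pred - target))  # 0.0: the off-by-one affects both sides equally
```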

fabro66 commented 5 years ago

@lck1201 Hi~ Would it be better to modify these three lines of code as shown below?

accu_x = accu_x * torch.cuda.comm.broadcast(torch.arange(1, x_dim+1).type(torch.cuda.FloatTensor), devices=[accu_x.device.index])[0]

accu_y = accu_y * torch.cuda.comm.broadcast(torch.arange(1, y_dim+1).type(torch.cuda.FloatTensor), devices=[accu_y.device.index])[0]

accu_z = accu_z * torch.cuda.comm.broadcast(torch.arange(1, z_dim+1).type(torch.cuda.FloatTensor), devices=[accu_z.device.index])[0]

lck1201 commented 5 years ago

@fabro66 Hi, in essence the range should be [0, dim]. But I don't know whether performance will differ if it is changed to `torch.arange(1, x_dim+1)` or to `x = x / float(hm_width-1) - 0.5`. Maybe you can give it a try.
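A quick hypothetical check (not from the thread) of the coordinate ranges the three variants produce, with `w` standing in for `hm_width` / `x_dim`:

```python
# Hypothetical comparison of the value ranges each variant produces.
import numpy as np

w = 64
idx0 = np.arange(w)          # original indexing: 0 .. w-1
idx1 = np.arange(1, w + 1)   # fabro66's shift:   1 .. w

orig  = (idx0[0] / w - 0.5, idx0[-1] / w - 0.5)              # upper end falls short of 0.5
shift = (idx1[0] / w - 0.5, idx1[-1] / w - 0.5)              # lower end falls short of -0.5
fix   = (idx0[0] / (w - 1) - 0.5, idx0[-1] / (w - 1) - 0.5)  # symmetric [-0.5, 0.5]

print(orig, shift, fix)
```

Under this check, shifting the indices moves the asymmetry to the lower bound rather than removing it, while dividing by `w - 1` spans `[-0.5, 0.5]` exactly.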