Bug report

ChiHangChen commented 4 years ago

In lib/utils/transforms.py, lines 179 and 180:

        center_new[0] = center_new[0] * 1.0 / sf
        center_new[1] = center_new[1] * 1.0 / sf

It should be

        center_new[1] = center_new[1] * new_ht / ht
        center_new[0] = center_new[0] * new_wd / wd

Otherwise the landmarks will be shifted.
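To illustrate the shift, here is a minimal sketch comparing the two formulas. It assumes that `new_ht` and `new_wd` are obtained by flooring `ht / sf` and `wd / sf`, which this report does not show. The function name `rescale_center` and the sample numbers are made up for illustration.

```python
import math


def rescale_center(center, ht, wd, sf):
    """Map a center (x, y) into the resized image in two ways.

    ASSUMPTION: new_ht / new_wd are floor(ht / sf) / floor(wd / sf);
    the issue does not show how they are actually computed.
    """
    new_ht = int(math.floor(ht / sf))
    new_wd = int(math.floor(wd / sf))
    # Current code: one shared factor, which ignores the rounding above.
    by_sf = (center[0] / sf, center[1] / sf)
    # Proposed fix: the per-axis ratio the image was actually resized by.
    by_ratio = (center[0] * new_wd / wd, center[1] * new_ht / ht)
    return by_sf, by_ratio, (new_wd, new_ht)


# Hypothetical numbers: a 750 x 1000 image, sf = 2.7, center at (600, 900).
by_sf, by_ratio, new_size = rescale_center((600.0, 900.0), ht=1000, wd=750, sf=2.7)
shift = (by_sf[0] - by_ratio[0], by_sf[1] - by_ratio[1])
print(new_size, by_sf, by_ratio, shift)
```

Under this assumption the two results differ by a fraction of a pixel in the resized image, and only the per-axis ratio maps the image corner `(wd, ht)` exactly onto `(new_wd, new_ht)`.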

zongjuede commented 4 years ago

Hi, Jim, @ChiHangChen Thanks for your valuable issue. Judging from it, you seem to be an expert in heatmap encoding and decoding. Could you please help me understand some code? I have tried my best to understand it, but failed.

Could you please tell me the purpose of the "-1" in the line `new_pt = np.array([pt[0] - 1, pt[1] - 1, 1.]).T`?

    def transform_pixel(pt, center, scale, output_size, invert=0, rot=0):
        t = get_transform(center, scale, output_size, rot=rot)
        if invert:
            t = np.linalg.inv(t)
        new_pt = np.array([pt[0] - 1, pt[1] - 1, 1.]).T
        new_pt = np.dot(t, new_pt)
        return new_pt[:2].astype(int) + 1

Could you please also tell me the purpose of the "+1" in the lines `preds[:, :, 0] = (preds[:, :, 0] - 1) % scores.size(3) + 1` and `preds[:, :, 1] = torch.floor((preds[:, :, 1] - 1) / scores.size(3)) + 1`? I think the point with the highest response is missed if 1 is added to its coordinates.

    def get_preds(scores):
        """
        get predictions from score maps in torch Tensor
        return type: torch.LongTensor
        """
        assert scores.dim() == 4, 'Score maps should be 4-dim'
        maxval, idx = torch.max(scores.view(scores.size(0), scores.size(1), -1), 2)

        maxval = maxval.view(scores.size(0), scores.size(1), 1)
        idx = idx.view(scores.size(0), scores.size(1), 1) + 1

        preds = idx.repeat(1, 1, 2).float()

        preds[:, :, 0] = (preds[:, :, 0] - 1) % scores.size(3) + 1
        preds[:, :, 1] = torch.floor((preds[:, :, 1] - 1) / scores.size(3)) + 1

        pred_mask = maxval.gt(0).repeat(1, 1, 2).float()
        preds *= pred_mask
        return preds

Could you please help me? Any reply is appreciated.
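One possible reading of the "-1" and "+1" operations is that both functions work in 1-based pixel coordinates: they convert to 0-based values for the arithmetic and back to 1-based for the result. This is an interpretation, not something confirmed in this thread. The pure-Python sketch below mimics only the index arithmetic. `decode_peak_1based` and `apply_1based` are hypothetical names, and the toy scale-and-offset transform merely stands in for `get_transform`.

```python
def decode_peak_1based(heatmap):
    """Mimic the index arithmetic of get_preds on one H x W heatmap (list of rows)."""
    width = len(heatmap[0])
    flat = [v for row in heatmap for v in row]
    idx0 = max(range(len(flat)), key=flat.__getitem__)  # 0-based flat argmax
    idx = idx0 + 1                 # "+ 1": switch to a 1-based flat index
    x = (idx - 1) % width + 1      # 1-based column of the peak
    y = (idx - 1) // width + 1     # 1-based row of the peak
    return x, y


def apply_1based(pt, scale, offset):
    """Mimic transform_pixel's pattern with a toy transform (hypothetical
    stand-in for get_transform): x' = scale * x + offset on 0-based coords."""
    x0, y0 = pt[0] - 1, pt[1] - 1              # "- 1": 1-based -> 0-based
    x1 = scale * x0 + offset[0]
    y1 = scale * y0 + offset[1]
    return int(x1) + 1, int(y1) + 1            # "+ 1": back to 1-based


# Peak at 0-based (row 1, column 2) of a 3 x 4 heatmap.
hm = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.2, 0.9, 0.1],
    [0.0, 0.0, 0.3, 0.0],
]
x, y = decode_peak_1based(hm)
print(x, y)              # 1-based (column, row) of the peak
print(x - 1, y - 1)      # the same location in 0-based coordinates
print(apply_1based((1, 1), 0.25, (0, 0)))   # first pixel stays the first pixel
```

On this reading the peak is not missed: the returned coordinates name the same pixel, counted from 1 instead of 0, and the "-1" in `transform_pixel` undoes that offset before the matrix is applied.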

ChiHangChen commented 4 years ago

Hi @zongjuede, I'm not an expert; I'm also a beginner in this field. I haven't used the functions you mentioned, so maybe you can open a new issue and let the author answer your question. Sorry I couldn't be of more help.

zongjuede commented 4 years ago

OK, I will try. Thanks a lot for your reply.

MengHao666 commented 2 years ago

Hey, have you figured it out?

lhyfst commented 2 years ago

same question

HRNet / HRNet-Facial-Landmark-Detection

Bug report #63