Open PaTricksStar opened 5 years ago
@PaTricksStar, I have the same question. Also, what is the purpose of scale = scale * 1.25 in this function?
def _xywh2cs(self, x, y, w, h):
    center = np.zeros((2), dtype=np.float32)
    center[0] = x + w * 0.5
    center[1] = y + h * 0.5

    if w > self.aspect_ratio * h:
        h = w * 1.0 / self.aspect_ratio
    elif w < self.aspect_ratio * h:
        w = h * self.aspect_ratio
    scale = np.array(
        [w * 1.0 / self.pixel_std, h * 1.0 / self.pixel_std],
        dtype=np.float32)
    if center[0] != -1:
        scale = scale * 1.25

    return center, scale
@Gouiaa This is also what confuses me. @leoxiaobin Could you please answer our questions?
@PaTricksStar @leoxiaobin @wanghao14 @rafikg Have you solved this? I am confused about it.
I think it is just a hyperparameter representing the default w/h of the bounding box. Just leave it alone, or you can try emailing the author to verify.
I think it is just the way they store the bbox h and w. They divide h and w by 200, and then get h and w back in get_affine_transform by multiplying scale by 200. It is just a hyperparameter, and you could choose another number.
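To illustrate the round trip described in this comment, here is a minimal sketch. The helper names box_to_scale and scale_to_box are hypothetical and are not part of the repository; the only thing taken from the thread is the constant 200. It suggests that, as long as the same constant is used in both directions, its exact value does not change the recovered box size.

```python
import numpy as np

PIXEL_STD = 200.0  # constant from the snippet above; treated here as an arbitrary unit


def box_to_scale(w, h, pixel_std=PIXEL_STD):
    """Hypothetical helper: store a box size in units of pixel_std."""
    return np.array([w / pixel_std, h / pixel_std], dtype=np.float32)


def scale_to_box(scale, pixel_std=PIXEL_STD):
    """Hypothetical helper: recover the box size in pixels."""
    return scale * pixel_std


scale = box_to_scale(150.0, 200.0)
recovered = scale_to_box(scale)

# Assumption for illustration: a different constant, used consistently.
recovered_other = scale_to_box(box_to_scale(150.0, 200.0, 100.0), 100.0)
```

Under these assumptions, both `recovered` and `recovered_other` come back to a 150 x 200 box, which is consistent with the view that 200 only fixes the unit in which scale is stored.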
@rafikg As I said above, scale is just another representation of the bbox h and w. I think they multiply scale by 1.25 to expand the bbox, in case the bbox fits the human body too tightly, which would lead to information loss.
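A small sketch of that interpretation, assuming scale is later converted back to pixels by multiplying by 200. The function padded_box and its parameter names are hypothetical, and the aspect-ratio adjustment from _xywh2cs is left out to keep the example short.

```python
import numpy as np


def padded_box(x, y, w, h, pixel_std=200.0, padding=1.25):
    """Hypothetical helper: return the top-left corner and size of the
    crop implied by a box whose scale has been multiplied by `padding`."""
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)
    scale = np.array([w / pixel_std, h / pixel_std], dtype=np.float32) * padding
    size = scale * pixel_std           # crop size back in pixels
    top_left = center - size * 0.5     # crop stays centered on the original box
    return top_left, size


# An 80 x 160 box with its top-left corner at (100, 50).
top_left, size = padded_box(100.0, 50.0, 80.0, 160.0)
```

With these inputs the crop grows from 80 x 160 to 100 x 200 around the same center, so a margin of context is kept on every side of the person.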
https://github.com/Microsoft/human-pose-estimation.pytorch/blob/c3a30c0e1f83e73b3038b1a443becf6b4a19cf1f/lib/dataset/JointsDataset.py#L31 I reviewed the code and found pixel_std there. Does it represent the standard deviation of the human bbox area? And why do we need to normalize the bbox scale by it, and why is it set to 200?