tensorlayer / HyperPose

Library for Fast and Flexible Human Pose Estimation
https://hyperpose.readthedocs.io

generate train data for MPII #333

Open w8501 opened 3 years ago

w8501 commented 3 years ago

Could someone explain this code from hyperpose/dataset/mpii_dataset/Dataset/mpii_dataset/generate.py?

    target_list=[]
    for kpts,head_bbx in zip(kpts_list,bbx_list):
        bbx=np.array(head_bbx).copy()
        bbx[:,2]=bbx[:,2]*4
        bbx[:,3]=bbx[:,3]*4
        target_list.append({
            "kpt":kpts,
            "mask":None,
            "bbx":bbx,
            "head_bbx":head_bbx,
            "labeled":1
        })

What do bbx[:,2]=bbx[:,2]*4 and bbx[:,3]=bbx[:,3]*4 mean? bbx[:,0] is the head center x and bbx[:,1] is the head center y. So what does bbx represent?

Gyx-One commented 3 years ago

Hello! @w8501 bbx stands for bounding box: bbx[0] is the x of the box, bbx[1] is the y, bbx[2] is the width, and bbx[3] is the height. For the MSCOCO dataset, the labeled bounding box is a rectangle that contains the whole object (for pose estimation, each object is a person). For the MPII dataset, however, the labeled bounding box is the rectangle that contains only the person's head, so here we multiply the width and height of the MPII bounding box by 4 to approximate a bounding box that contains the whole person. :)
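To make the scaling concrete, here is a minimal sketch of what the two assignments do to an array of head boxes. The numbers are made up for illustration, and the [x, y, w, h] column layout is assumed from the explanation above rather than checked against the HyperPose source.

```python
import numpy as np

# Hypothetical head boxes, one row per person, assumed layout: [x, y, w, h].
head_bbx = np.array([
    [100.0, 50.0, 20.0, 24.0],
    [300.0, 80.0, 18.0, 22.0],
])

# Same operations as in the quoted snippet: copy, then scale columns 2 and 3.
bbx = head_bbx.copy()
bbx[:, 2] = bbx[:, 2] * 4  # width of every box becomes 4x the head width
bbx[:, 3] = bbx[:, 3] * 4  # height of every box becomes 4x the head height

print(bbx)
```

Only the size columns change. The x and y columns are copied through untouched, so the scaled box keeps the head box's reference point.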

w8501 commented 3 years ago

Hi! @Gyx-One Thanks for your reply. I take your point. But bbx[0] is the head center x and bbx[1] is the head center y. So the top-left point is (head center x, head center y) and the bottom-right point is (head center x + w, head center y + h), which does not contain the whole object.
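The concern can be illustrated with a small sketch that compares two readings of the same scaled box. Everything here is hypothetical: the helper names, the box values, and the two sample points are invented for illustration, and the sketch does not reproduce HyperPose's actual box handling.

```python
import numpy as np

def corners_top_left(box):
    """Return (x1, y1, x2, y2) when (x, y) is read as the top-left corner."""
    x, y, w, h = box
    return np.array([x, y, x + w, y + h])

def corners_center(box):
    """Return (x1, y1, x2, y2) when (x, y) is read as the box center."""
    x, y, w, h = box
    return np.array([x - w / 2, y - h / 2, x + w / 2, y + h / 2])

def contains(corners, point):
    """Check whether a point (px, py) lies inside the corner box."""
    x1, y1, x2, y2 = corners
    px, py = point
    return bool(x1 <= px <= x2 and y1 <= py <= y2)

# Hypothetical box: head center at (100, 50), head size 20 x 24 scaled by 4.
scaled_box = np.array([100.0, 50.0, 80.0, 96.0])

# Hypothetical keypoints: one above the head center, one to its left.
top_of_head = (100.0, 38.0)
left_shoulder = (85.0, 75.0)

as_top_left = corners_top_left(scaled_box)
as_center = corners_center(scaled_box)

print(contains(as_top_left, top_of_head), contains(as_top_left, left_shoulder))
print(contains(as_center, top_of_head), contains(as_center, left_shoulder))
```

When the head center is read as the top-left corner, the box only extends right and down from that point, so anything above or to the left of the head center falls outside it. Reading the same point as the box center covers both sample points here, although a box centered on the head would still not necessarily cover the whole body.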

Gyx-One commented 2 years ago

Hello! @w8501 Sorry for the late response! Thanks for pointing this out! I think you are correct. I'll check the issues this may cause and fix them. :)