facebookresearch / VideoPose3D

Efficient 3D human pose estimation in video using 2D keypoint trajectories

2D to 3D #84

Open fire17 opened 5 years ago

fire17 commented 5 years ago

Hello there, I've got 2D keypoints from another pose estimation program. They are COCO 2D keypoints (just like Detectron's).

I've tried using Detectron, but it's a real pain to set up (especially Caffe2 with CUDA 10.1).

Could anyone please share a valid output of infer_video? I need to find a way to save my 2D keypoints in a .npz format that will work with the rest of the process (according to your guide).

Is there a script that takes 2D keypoints as input and outputs 3D keypoints? If yes, what does the input data look like? Thank you very much! Tami

fire17 commented 5 years ago

As I mentioned, I would like to just take 2D keypoints and convert them to 3D keypoints.

I took a look at infer_video:

            boxes.append(cls_boxes)
            segments.append(cls_segms)
            keypoints.append(cls_keyps)

        np.savez_compressed(out_name, boxes=boxes, segments=segments, keypoints=keypoints, metadata=metadata)

I need to recreate these objects from my keypoints: cls_boxes, cls_segms, and cls_keyps. cls_keyps are my keypoints, and cls_boxes should be easy for me to get too, but what is cls_segms?

Can someone please give me the output of:

print("cls_boxes",cls_boxes)
print("cls_segms",cls_segms)
print("cls_keyps",cls_keyps)

Any advice on how to generate them in the correct structure would be helpful :)

THANK YOU!!!

hellozjj commented 4 years ago

Did you solve it?

tsaxena commented 4 years ago

Can you share the solution to this?

rhljajodia commented 4 years ago

Any updates on this?

rhljajodia commented 4 years ago

So, I figured it out, for anyone still wondering. The metadata from Detectron just encodes the dimensions of the video in this form:

Sample:

    data = numpy.load(<path to .npz>, allow_pickle=True)
    a = data['metadata']
    print(a)

Output:

    {'w': 640, 'h': 400}
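For anyone who wants to build such a file from their own keypoints, here is a minimal sketch, not the repository's own script. It writes the four keys used by the `np.savez_compressed` call quoted earlier in the thread. The function name `save_detectron_style_npz` is hypothetical, and the per-frame layout (a two-element list with the person entries at index 1, box rows as x1, y1, x2, y2, score, and keypoints shaped (people, 4, 17)) is an assumption about Detectron's output that should be checked against a real file. Segments are left as None, on the further assumption that the 2D-to-3D step does not need masks.

```python
import os
import tempfile

import numpy as np


def save_detectron_style_npz(out_name, keypoints_xy, width, height):
    """Write one-person-per-frame COCO keypoints to a Detectron-like .npz.

    keypoints_xy: array of shape (num_frames, 17, 2) in pixel coordinates.
    ASSUMPTION: the per-frame layout below mirrors what infer_video saves;
    compare against a real output file before relying on it.
    """
    keypoints_xy = np.asarray(keypoints_xy, dtype=np.float32)
    num_frames = keypoints_xy.shape[0]

    # Object arrays, so the nested per-frame entries survive savez_compressed.
    boxes = np.empty(num_frames, dtype=object)
    segments = np.empty(num_frames, dtype=object)  # stays None: no masks
    keypoints = np.empty(num_frames, dtype=object)

    for i, xy in enumerate(keypoints_xy):
        # Tight bounding box around the 17 keypoints of this frame.
        x1, y1 = xy.min(axis=0)
        x2, y2 = xy.max(axis=0)
        # ASSUMED box row: x1, y1, x2, y2, score (score fixed to 1.0 here).
        box = np.array([[x1, y1, x2, y2, 1.0]], dtype=np.float32)
        # ASSUMED keypoint block: (people, 4, 17) = x, y, logit, probability.
        kps = np.ones((1, 4, 17), dtype=np.float32)
        kps[0, :2, :] = xy.T
        # ASSUMED per-frame entry: index 0 = background, index 1 = person.
        boxes[i] = [[], box]
        keypoints[i] = [[], kps]

    # Same form as the Detectron metadata shown above.
    metadata = {'w': width, 'h': height}
    np.savez_compressed(out_name, boxes=boxes, segments=segments,
                        keypoints=keypoints, metadata=metadata)


# Usage with made-up keypoints (5 frames of a 640x400 video).
demo_xy = np.random.rand(5, 17, 2) * [640, 400]
demo_path = os.path.join(tempfile.mkdtemp(), 'my_video.mp4.npz')
save_detectron_style_npz(demo_path, demo_xy, width=640, height=400)

data = np.load(demo_path, allow_pickle=True)
print(data['metadata'].item())        # {'w': 640, 'h': 400}
print(data['keypoints'][0][1].shape)  # (1, 4, 17)
```

The score and confidence values are placeholders set to 1.0; if your pose estimator reports per-keypoint confidences, they would presumably go in their place.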

VideoPose3D then converts it to this form. Following the same procedure, output:

    {'layout_name': 'coco', 'num_joints': 17,
     'keypoints_symmetry': [[1, 3, 5, 7, 9, 11, 13, 15], [2, 4, 6, 8, 10, 12, 14, 16]],
     'video_metadata': {'view2.mp4': {'w': 640, 'h': 400},
                        'sample_video.mp4': {'w': 1920, 'h': 1080}}}
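As a rough illustration of that second form, the sketch below assembles the same dictionary from per-video dimensions. The helper name `build_dataset_metadata` is hypothetical and this is not VideoPose3D's own conversion code; it only reproduces the structure printed above.

```python
def build_dataset_metadata(video_sizes):
    """Combine per-video {'w': ..., 'h': ...} dicts into one metadata dict.

    video_sizes: dict mapping a video file name to its Detectron-style
    metadata. The fixed fields are copied from the output shown above.
    """
    return {
        'layout_name': 'coco',
        'num_joints': 17,
        # COCO joint order: odd indices are left-side joints, even are right.
        'keypoints_symmetry': [
            [1, 3, 5, 7, 9, 11, 13, 15],
            [2, 4, 6, 8, 10, 12, 14, 16],
        ],
        'video_metadata': {
            name: {'w': size['w'], 'h': size['h']}
            for name, size in video_sizes.items()
        },
    }


# Usage with the two videos from the output above.
dataset_metadata = build_dataset_metadata({
    'view2.mp4': {'w': 640, 'h': 400},
    'sample_video.mp4': {'w': 1920, 'h': 1080},
})
print(dataset_metadata['video_metadata']['view2.mp4'])  # {'w': 640, 'h': 400}
```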

Hope this helps.