Dataset npz interpretation

mkocabas / SPEC

Code for ICCV2021 paper SPEC: Seeing People in the Wild with an Estimated Camera

Dataset npz interpretation #11

Open isarandi opened 2 years ago

isarandi commented 2 years ago

Dear Muhammed,

Thank you for this work and the datasets! I'm trying to interpret the data in the SPEC-SYN npz data files, but I'm not sure what each key means. Is there documentation? They are the following: ['imgname', 'center', 'scale', 'pose', 'shape', 'part', 'mmpose_keypoints', 'openpose', 'openpose_gt', 'S', 'focal_length', 'cam_rotmat', 'cam_trans', 'cam_center', 'cam_pitch', 'cam_roll', 'cam_hfov', 'cam_int', 'camcalib_pitch', 'camcalib_roll', 'camcalib_vfov', 'camcalib_f_pix']

At this point, I'd just like to plot the 24 SMPL joints on the image. Based on the array shape, I assume 'S' contains the joints. Is 'cam_rotmat' the rotation from world space to camera space? Is 'cam_trans' the position of the camera, or the top-right part (the translation column) of the extrinsic matrix? I assume that for this plotting exercise I can ignore everything except 'S', 'cam_rotmat', 'cam_trans' and 'cam_int'. Still, for some reason the points end up in very wrong places. Maybe 'S' is something else? Or am I using the camera params wrong?

Thanks! Istvan
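For reference, the projection being attempted here can be sketched as follows. This is only an illustration under one assumed convention, namely that 'cam_rotmat' and 'cam_trans' map world points to camera space as `x_cam = R @ x_world + t`; the thread does not confirm that the dataset follows it. The function name `project_points` and the toy values are made up for the example.

```python
import numpy as np

def project_points(points_world, cam_int, cam_rotmat, cam_trans):
    """Project (N, 3) world points to (N, 2) pixel coordinates.

    ASSUMPTION: extrinsics are world-to-camera, x_cam = R @ x_world + t.
    If cam_trans were instead the camera position c in world space,
    the translation to pass in would be t = -R @ c.
    """
    points_cam = points_world @ cam_rotmat.T + cam_trans  # (N, 3) in camera space
    uvw = points_cam @ cam_int.T                          # homogeneous pixel coords
    return uvw[:, :2] / uvw[:, 2:3]                       # perspective divide

# Toy values, not taken from the dataset.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
pts = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0]])
uv = project_points(pts, K, R, t)
```

If the projected points land far from the person under both conventions, the translation is likely not person-specific, which matches the observation in the next comment.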

isarandi commented 2 years ago

Ignoring 'S', using 'pose', 'shape' (with the gender-neutral SMPL model, is that correct?) and 'cam_rotmat' and manually fiddling with the translation vector I managed to get an approximate overlap, but I can't seem to find how to determine the person-specific translation vector. 'cam_trans' is the same for each person in the same image.

isarandi commented 2 years ago

Sorry for bothering again. I've tried out the SPEC-MTP dataset's npz file as well, but it has fewer keys; for example, 'cam_int' is not there. How can I build it from 'focal_length'? The 'focal_length' field contains smaller numbers than I would expect, while the 'camcalib_f_pix' value is larger than I would expect for the focal length used in the intrinsic matrix. I also could not find 'cam_trans' in this file.

Would it be possible to extend the annotation file with these values?
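As a rough sketch of what building an intrinsic matrix from a focal length usually involves: a pinhole matrix needs the focal length in pixels and a principal point. The code below assumes square pixels, a principal point at the image center, and a vertical field of view given in radians. None of these are confirmed for SPEC-MTP, and the thread does not establish which of 'focal_length', 'camcalib_f_pix' or 'camcalib_vfov' is the right input. The helper names are hypothetical.

```python
import numpy as np

def focal_from_vfov(vfov_rad, image_height):
    """Focal length in pixels from a vertical field of view (ASSUMED radians)."""
    return 0.5 * image_height / np.tan(0.5 * vfov_rad)

def build_intrinsics(f_pix, image_width, image_height):
    """Pinhole intrinsic matrix.

    ASSUMPTIONS: square pixels, zero skew, principal point at the image center.
    """
    return np.array([[f_pix, 0.0, image_width / 2.0],
                     [0.0, f_pix, image_height / 2.0],
                     [0.0, 0.0, 1.0]])

# Toy values: a 1280x720 image with a 90 degree vertical field of view.
f_demo = focal_from_vfov(np.deg2rad(90.0), 720)
K_demo = build_intrinsics(f_demo, 1280, 720)
```

Comparing a focal length derived this way against the stored values is one way to check which field, if any, is expressed in pixels of the actual image.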

rootpine commented 1 year ago

@isarandi Hello, I am facing the same issue. How did you solve this? Thanks.

isarandi commented 1 year ago

For SPEC-SYN, I ended up computing the extrinsic camera matrix from 2D-3D joint correspondences.

This is the Perspective-n-Point (PnP) problem, which can be solved with cv2.solvePnP() or cv2.calibrateCamera() in OpenCV.

You can get 2D-3D correspondences by pairing up the joints common to 'openpose_gt' (the 2D points) and the SMPL skeleton (the 3D points). To get the 3D SMPL joints, run forward kinematics on the SMPL body model using 'pose' and 'shape'. Solving PnP also requires the camera intrinsic matrix, which is given in 'cam_int'.

The shared joints between the openpose skeleton and the SMPL skeleton are the neck, shoulders, elbows, wrists, pelvis, hips, knees and ankles. On these joints the two skeletons line up perfectly (I assume 'openpose_gt' was created by projecting these SMPL joints to 2D, but the extrinsic matrix got lost in the process).

For SPEC-MTP, I haven't yet managed to get the 3D poses aligned with the image.

rootpine commented 1 year ago

@isarandi I could solve the problem, thank you for your kind reply ... !!!