FORTH-ModelBasedTracker / MonocularRGB_3D_Handpose_WACV18

Using a single RGB frame for real time 3D hand pose estimation in the wild

Some questions about the functions in the PyCeresIK #15

Closed Vincent-2017 closed 5 years ago

Vincent-2017 commented 5 years ago

Hello, I have some questions about the functions in PyCeresIK.

# compute kp using model initial pose
points2d = pose_estimator.ba.decodeAndProject(pose_estimator.model.init_pose, clb)

In this function, the joint angles are fed through the forward kinematics model and the resulting 3D positions are projected onto the image. I would like to know whether the forward kinematics model is included in PyCeresIK, and how to use the forward/inverse kinematics model.

rgbKp = IK.Observations(IK.ObservationType.COLOR, clb, keypoints)
obsVec = IK.ObservationsVector([rgbKp, ])

Could you give a description of these two functions?

In your paper, the camera parameters are known and used. When you test your method on the YouTube video, generic calibration parameters are mentioned. What do the generic calibration parameters mean? Thank you!

padeler commented 5 years ago

Hello. The hand kinematics are part of the hand model. The model files are in "models/hand_skinned.*" and they are produced from the hand_skinned.blend file. We do not provide the code for creating models in our format from blend files either. This codebase is part of a bigger library that is developed internally and is not open source (at least for now).

You can use PyCeresIK.decode to convert the 27 hand parameters to 3D points, or PyCeresIK.decodeAndProject, which will also project the 3D points onto the camera plane described by the calib parameter.
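For illustration, a minimal sketch of how these calls might be used, reusing the pose_estimator and clb names from the snippets above. The argument list of decode is an assumption on my part, not something confirmed in this thread:

# Sketch only: pose_estimator and clb are assumed to be set up as in the
# snippets above (the repository's example script).
init_pose = pose_estimator.model.init_pose   # the 27 hand pose parameters

# Forward kinematics + projection: 2D keypoints on the image plane of the
# camera described by clb.
points2d = pose_estimator.ba.decodeAndProject(init_pose, clb)

# Forward kinematics only: 3D joint positions.
# NOTE: the exact signature of decode is assumed here; check the bindings.
points3d = pose_estimator.ba.decode(init_pose, clb)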

The calibration is stored in the JSON file located in the res folder. You can adapt it to your needs.
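As a quick illustration, the file is plain JSON with the structure shown further down, so it can be adapted with standard tooling. The path "res/calib.json" and the intrinsics values below are placeholders; use your own calibration file and camera parameters:

import json

# Placeholder path: point this at the calibration file in the res folder.
with open("res/calib.json") as f:
    calib = json.load(f)

# Adapt the intrinsics and resolution to your own camera before saving it back.
calib["CameraMatrix"][0][0] = 615.0   # fx
calib["CameraMatrix"][1][1] = 615.0   # fy
calib["CameraMatrix"][0][2] = 320.0   # cx
calib["CameraMatrix"][1][2] = 240.0   # cy
calib["Dims"] = [640, 480]

with open("res/calib.json", "w") as f:
    json.dump(calib, f, indent=4)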

Regarding IK.Observations: it creates a data structure for the 2D joint locations detected by a given camera (hence the clb argument). This data structure is then used by the IK solver. The IK.ObservationType in this work is always COLOR, since we are only concerned with RGB input and not depth cameras.

Finally, IK.ObservationsVector takes a list of IK.Observations. Different observations might come from different cameras. If you provide multiple cameras with appropriate calibrations (intrinsics, extrinsics, distortion), the solver will optimize over all of them. In the WACV18 work we only use a single RGB input, so this vector contains only one set of observations.
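Putting that together, here is a sketch of the single-camera case used in this repository, plus the multi-camera variant it generalizes to. clb2 and keypoints2 are hypothetical placeholders for a second calibrated view:

# Single RGB camera, as in the WACV18 pipeline:
rgbKp = IK.Observations(IK.ObservationType.COLOR, clb, keypoints)
obsVec = IK.ObservationsVector([rgbKp, ])

# Hypothetical multi-camera setup: one Observations object per calibrated view.
# clb2 / keypoints2 stand in for a second camera's calibration and 2D detections.
obsVecMulti = IK.ObservationsVector([
    IK.Observations(IK.ObservationType.COLOR, clb, keypoints),
    IK.Observations(IK.ObservationType.COLOR, clb2, keypoints2),
])
# The resulting vector is what gets handed to the IK solver.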

Regarding the "generic" calibration used in the qualitative video. I just set it to some plausible values that fit the video resolutio. To be specific the json file used for the PMJ video is:

{
    "CalibReprError": 0.0,
    "CameraMatrix": [
        [675.91, 0.0, 609.15],
        [0.0, 678.5, 358.2],
        [0.0, 0.0, 1.0]
    ],
    "CameraID": 0,
    "Distortion": [0.0, 0.0, 0.0, 0.0, 0.0],
    "Dims": [1280, 720],
    "Translation": [0.0, 0.0, 0.0],
    "Rotation": [0.0, 0.0, 0.0]
}

Note: these are not the intrinsics of the camera that was used in the PMJ video (we did not have access to that camera). The focal length and camera center values come from a calibration of a different camera shooting at the same resolution. Best results will be obtained with a proper camera calibration.
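If you need to fabricate a similar generic calibration for another resolution, one rough approach (my own suggestion, not something from this codebase) is to put the principal point at the image centre and derive the focal length from an assumed horizontal field of view:

import math

def generic_camera_matrix(width, height, hfov_deg=60.0):
    """Plausible pinhole intrinsics: principal point at the image centre,
    focal length derived from an assumed horizontal field of view."""
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = fx  # square pixels assumed
    cx, cy = width / 2.0, height / 2.0
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]

# Example: 1280x720 with an assumed ~87 degree horizontal FOV gives a focal
# length in the same ballpark as the values above (~675 px).
print(generic_camera_matrix(1280, 720, hfov_deg=87.0))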

Vincent-2017 commented 5 years ago

Thank you very much!

Amebradi commented 4 years ago

Hello Dr. Panteleris, thanks for the instructions. I have a question regarding IK.Observations: if I already have a 3D estimation of the joints, what should I pass for IK.ObservationType to compute the inverse kinematics? (Currently, for a single RGB image, IK.ObservationType.COLOR is used.) Your help would be greatly appreciated. Thanks.

padeler commented 4 years ago

Hello. I am sorry for the late reply.

There was experimental support for 3D points; however, I cannot guarantee that it will work. This codebase is old and no longer maintained.

If you supply observations with ObservationType.DEPTH you can utilize the 3D information. Each observation of type DEPTH is expected to be a Vec4: [x, y, d, c], where x, y are the pixel coordinates of the joint on the image, d is the depth value (as it would be measured by a depth sensor; in your case the Z value of your 3D point), and c is the confidence.
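A sketch of how such observations could be assembled from camera-space 3D joints. The projection follows the standard pinhole model, and the way the resulting list is wrapped into IK.Observations is an assumption (mirroring the COLOR case above), not something I can verify from this thread:

import numpy as np

def points3d_to_depth_obs(points3d, camera_matrix, confidence=1.0):
    """Convert Nx3 camera-space joint positions into the [x, y, d, c] format
    described above: pixel coordinates, depth (Z) and a confidence value."""
    K = np.asarray(camera_matrix)
    obs = []
    for X, Y, Z in points3d:
        x = K[0, 0] * X / Z + K[0, 2]   # pinhole projection to pixel coords
        y = K[1, 1] * Y / Z + K[1, 2]
        obs.append([x, y, Z, confidence])
    return obs

# keypoints4d = points3d_to_depth_obs(my_points3d, calib_camera_matrix)
# depthKp = IK.Observations(IK.ObservationType.DEPTH, clb, keypoints4d)  # assumed to mirror the COLOR case
# obsVec = IK.ObservationsVector([depthKp, ])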