geopavlakos / c2f-vol-demo

Demo code for "Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose", CVPR 2017

How do you get the camera intrinsics? #6

Open. sta105 opened this issue 5 years ago

sta105 commented 5 years ago

Hi,

I found that K is saved in valid.mat, but I couldn't find it in the Human3.6M dataset. Could you tell me how I can get the camera parameters for each video sequence? Thank you :)

Sicong

nulledge commented 5 years ago

Human3.6M provides some useful MATLAB scripts, and the H36MCamera class is one of them. With it, you can get every camera intrinsic (f, c, k, and p) by simply calling camera.f, camera.c, camera.p, and camera.k in MATLAB. The H36MCamera class is in {scripts root}/H36M/H36MCamera.m.

axhiao commented 3 years ago

Hi @nulledge, do you know how to get the intrinsic and extrinsic parameters of the TOF depth camera? Thank you!

hanabi7 commented 3 years ago

@axhiao Did you find the camera parameters of the TOF data? I understand that the extrinsic parameters of the TOF data are the same as the RGB camera's, meaning they share the same rotation and translation matrices. But I can't obtain the intrinsic parameters of the TOF data.

axhiao commented 3 years ago

@hanabi7 Yes, the TOF camera is physically very close to one of the RGB cameras. I couldn't get the intrinsic parameters either. If you obtained the dataset officially, you could email the authors about the depth camera parameters. And if you get them somehow, please share a copy with me. I'd appreciate it!

hanabi7 commented 3 years ago

@axhiao It seems the focal length of the TOF camera can be roughly estimated from the fact that the RGB camera and the TOF camera cover the same field of view, if you get what I mean. Have you ever tried that?
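
For reference, a minimal sketch of this same-field-of-view argument (the RGB focal length below is the commonly cited approximate value for Human3.6M; the TOF image width is an assumption, since no official TOF spec is published):

def scale_focal(f_rgb, w_rgb, w_tof):
    # Equal horizontal field of view, 2*atan(w / (2*f)), for both cameras
    # implies f_tof = f_rgb * w_tof / w_rgb.
    return f_rgb * w_tof / w_rgb

print(scale_focal(f_rgb=1145.0, w_rgb=1000, w_tof=208))  # ~238 px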

axhiao commented 3 years ago

@hanabi7 I tried cx, cy = width/2, height/2 and fx = fy = 110. It seems to make sense visually. What values did you choose?

hanabi7 commented 3 years ago

@axhiao The results seem to make sense, but did you find a way to align the TOF point cloud with the 3D poses or the SMPL model?

hanabi7 commented 3 years ago

@axhiao I think we could use the 2D joint positions to estimate the focal length accurately?
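
A minimal sketch of that idea, assuming one had 2D joint positions on the TOF image paired with 3D joints in camera coordinates; a single focal length (fx = fy assumed) then has a closed-form least-squares solution:

import numpy as np

def fit_focal(joints_cam, joints_2d, cx, cy):
    # joints_cam: (N, 3) joints in camera coordinates (X, Y, Z), Z > 0
    # joints_2d:  (N, 2) observed pixels (u, v) on the TOF image
    # Solve u - cx = f * X/Z and v - cy = f * Y/Z for one f in least squares.
    X, Y, Z = joints_cam.T
    a = np.concatenate([X / Z, Y / Z])
    b = np.concatenate([joints_2d[:, 0] - cx, joints_2d[:, 1] - cy])
    return float(a @ b / (a @ a))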

axhiao commented 3 years ago

@hanabi7 No. The point cloud is in camera coordinates, but the 3D keypoints are in world coordinates. I think you need a transformation, including rotation and translation, from camera coordinates to world coordinates.

They don't seem to provide 2D joint positions on the depth images.

hanabi7 commented 3 years ago

@axhiao The transformation matrix is the same as for camera 55011271. It's mentioned in metadata.xml, I think? Let me know if you have any updates.

hanabi7 commented 3 years ago

@axhiao I found a way to transform the Human3.6M 2D joint positions on the RGB pictures into 2D joint positions on the TOF pictures. I think that would help with the joints. Try focal_length = 248, fx = 110, fy = 80.
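
If the TOF sensor really shares the pose of camera 55011271, this 2D-to-2D mapping is depth-independent, since both cameras see each point along the same viewing ray: u_tof ~ K_tof @ inv(K_rgb) @ u_rgb. A minimal sketch under that co-location assumption:

import numpy as np

def rgb_to_tof_pixels(uv_rgb, K_rgb, K_tof):
    # Map (N, 2) RGB pixels to TOF pixels for two cameras with identical
    # pose; the shared viewing ray makes the depth cancel out.
    uv1 = np.column_stack([uv_rgb, np.ones(len(uv_rgb))])  # homogeneous pixels
    rays = uv1 @ np.linalg.inv(K_rgb).T                    # back-project to rays
    uv_tof = rays @ K_tof.T                                # re-project
    return uv_tof[:, :2] / uv_tof[:, 2:3]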

axhiao commented 3 years ago

@hanabi7 That's awesome! I think you mean focal_length = 248, cx = 110, cy = 80? And did you also use the extrinsic parameters of camera 55011271 to transform the 3D coordinates to 2D coordinates on the depth image?

hanabi7 commented 3 years ago

@axhiao Yes, but the focal length and cx, cy I'm getting come purely from calculation; they might be wrong.

axhiao commented 3 years ago

@hanabi7 Thank you! Judging from the picture you posted above, I think that's very close! I'll try those parameters!

axhiao commented 3 years ago

@hanabi7 Could you share the extrinsic parameters with me here? The result I'm getting doesn't seem right.

hanabi7 commented 3 years ago

@axhiao This is the extrinsic parameter set I am using; it belongs to sequence 11 and camera 55011271.

Rotation:
[[ 0.9216646531492915,   0.3879848687925067,  -0.0014172943441045224],
 [ 0.07721054863099915, -0.18699239961454955, -0.979322405373477   ],
 [-0.38022729822475476,  0.9024974149959955,  -0.20230080971229314 ]]

Translation:
[[-11.93434847209049], [449.41658936445646], [5541.113551868936]]
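
For anyone plugging these numbers in: a minimal sketch of the world-to-camera step, assuming the convention X_cam = R @ X_world + t (the ~5.5 m magnitude of the last translation component is consistent with a camera-frame translation, but verify the convention against Human3.6M's own MATLAB code):

import numpy as np

R = np.array([[ 0.9216646531492915,   0.3879848687925067,  -0.0014172943441045224],
              [ 0.07721054863099915, -0.18699239961454955, -0.979322405373477   ],
              [-0.38022729822475476,  0.9024974149959955,  -0.20230080971229314 ]])
t = np.array([-11.93434847209049, 449.41658936445646, 5541.113551868936])  # millimeters

def world_to_cam(points_world):
    # (N, 3) world-coordinate points (mm) -> camera coordinates (mm)
    return points_world @ R.T + t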

hanabi7 commented 3 years ago

@axhiao This repository may be helpful: https://github.com/karfly/human36m-camera-parameters

axhiao commented 3 years ago

@hanabi7 thank you so much! I'll try again.

axhiao commented 3 years ago

@hanabi7 I have tried the parameters you provided. I think there are small differences in the intrinsic parameters between subjects, because some subjects' joints do not fit well on the depth image after projecting from 3D to 2D.

(screenshot: joints projected onto a depth frame)

hanabi7 commented 3 years ago

@axhiao Can you name the subjects that are working and the subjects that are not? The picture I posted above was the Discussion scene from subject 11.

axhiao commented 3 years ago

@hanabi7 For example, even for subject 11, the frame S11_SittingDown 1_769 does not fit well with your parameters either.

(screenshot: misaligned projection on S11_SittingDown 1, frame 769)

I manually adjusted the intrinsic parameters for all subjects. They work well for most cases, but not for all. You could use them as a reference.

import numpy as np

# Hand-tuned TOF intrinsics: fx = fy = 248 throughout; the principal point varies per subject.
S1 = np.array([
        [248, 0,  100.0],
        [0, 248, 75],
        [0, 0, 1.0]
    ])

S5 = np.array([
        [248, 0, 102],
        [0, 248, 75],
        [0, 0, 1.0]
    ])

S6 = np.array([
        [248, 0, 103.5],
        [0, 248, 76],
        [0, 0, 1.0]
    ])

S7 = np.array([
        [248, 0,  102.5],
        [0, 248, 75],
        [0, 0, 1.0]
    ])

S8 = np.array([
        [248, 0,  105.8],
        [0, 248, 79],
        [0, 0, 1.0]
    ])

S9 = np.array([
        [248, 0,  107.5],
        [0, 248, 78.5],
        [0, 0, 1.0]
    ])

S11 = np.array([
        [248, 0,  108],
        [0, 248, 80],
        [0, 0, 1.0]
    ])
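
These matrices plug into an ordinary pinhole projection. A minimal sketch, reusing the world_to_cam step from the extrinsics above:

import numpy as np

def project(points_cam, K):
    # Perspective projection: (N, 3) camera-frame points -> (N, 2) pixel coordinates
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

# e.g. for subject 11: uv = project(world_to_cam(joints_world), S11)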

hanabi7 commented 3 years ago

@axhiao That's very helpful

hanabi7 commented 3 years ago

@axhiao I wonder how you transformed the camera-coordinate point cloud into a world-coordinate point cloud? I tried world_coordinates = np.dot(cam_coordinates - extrinsic_translation, np.linalg.inv(extrinsic_rotation)) but it doesn't seem to work.
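
For reference: under the X_cam = R @ X_world + t convention assumed earlier, the inverse is X_world = R.T @ (X_cam - t); the formula quoted above applies the rotation the wrong way round for row-vector points. A minimal sketch:

import numpy as np

def cam_to_world(points_cam, R, t):
    # Invert X_cam = R @ X_world + t for (N, 3) row-vector points:
    # (R.T @ (x - t)) written in row form is (x - t) @ R.
    return (points_cam - t) @ R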

axhiao commented 3 years ago

@hanabi7 Here image is the depth map:

import numpy as np

def depthmap2points(image, fx, fy):
    # Back-project an (H, W) depth map into an (H, W, 3) point cloud.
    h, w = image.shape
    x, y = np.meshgrid(np.arange(w) + 1, np.arange(h) + 1)  # 1-based pixel grid
    points = np.zeros((h, w, 3), dtype=np.float32)
    points[:,:,0], points[:,:,1], points[:,:,2] = pixel2world(x, y, image, w, h, fx, fy)
    return points

def pixel2world(x, y, z, img_width, img_height, fx, fy):
    # The commented lines use the image center as the principal point;
    # the values below are hand-tuned instead.
    # w_x = (x - img_width / 2) * z / fx
    # w_y = (img_height / 2 - y) * z / fy
    w_x = (x - 104.1857) * z / fx
    w_y = (76.9285 - y) * z / fy  # note: this flips the y axis
    w_z = z
    return w_x, w_y, w_z
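
A usage sketch tying this to the transforms above (the focal length is the hand-tuned value from this thread, not an official one; depthmap2points, cam_to_world, R, and t are defined earlier):

# depth: (H, W) array of TOF depth values in mm
points_cam = depthmap2points(depth, fx=248.0, fy=248.0).reshape(-1, 3)
# Caution: pixel2world flips the y axis, so undo that flip first if the
# extrinsics assume a y-down camera frame.
points_world = cam_to_world(points_cam, R, t)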

hanabi7 commented 3 years ago

@axhiao Do you know how to align the given 3D SMPL model with the generated point cloud? Even after I transform the coordinates into the world coordinate system, the point cloud and the SMPL model still have a giant rotation gap. (screenshot: snapshot00; the generated point cloud is meshed for better visualization)

axhiao commented 3 years ago

@hanabi7 Sorry, I know little about SMPL.

patriciamdr commented 2 years ago

@hanabi7 @axhiao May I ask how you obtained the c_x and c_y values? Did you use a formula, or did you manually test different values?

axhiao commented 2 years ago

Hi @patriciamdr, if you have access to the Human3.6M dataset, you could ask the authors whether they provide camera parameters for the TOF data. But as far as I know, they probably do not.

Dipankar1997161 commented 1 year ago

@axhiao @hanabi7 How do you get the Human3.6M extrinsic camera parameters? I checked the repo mentioned above, but it doesn't look quite accurate to me.

Did anyone else validate it? Or does anyone know how to get the intrinsic and extrinsic camera parameters of the H36M dataset?

It would be really useful for my research. Thank you so much.

nulledge commented 1 year ago

@Dipankar1997161 I think that repository provides accurate parameters. But if you want to extract the parameters yourself, see H36M/H36MCamera.m. All parameters are stored in metadata.xml, and H36M/H36MCamera.m extracts the camera parameters (f, c, k, p, R, T) from it.

If you are not familiar with MATLAB, my Python script (link) may be helpful. Install the MATLAB engine library and just call the get_intrinsics function to get f, c, k, and p. You can also edit it to get R and T.

Note that Human3.6M provides only the parameters of the RGB cameras, not the TOF camera. This suggests using the RGB parameters as approximations for the TOF camera; of course, that's not accurate.