zhan-xu closed this issue 2 years ago
Hello,
There are multiple ways to get intrinsic and extrinsic camera parameters. Here is my recommendation. You might want to use pred_cam and bboxes instead (see the output format of VIBE).
Let's say you want to access the 1st tracked human.
pred_cam = vibe_result['pred_cam'][0]
bbox = vibe_result['bboxes'][0]
Use the function below to get the camera parameters.
import numpy as np

def get_camera_parameters(pred_cam, bbox):
    FOCAL_LENGTH = 5000.
    CROP_SIZE = 224

    # bbox is (center_x, center_y, width, height) and must be square
    bbox_cx, bbox_cy, bbox_w, bbox_h = bbox
    assert bbox_w == bbox_h

    # top-left corner of the bbox in full-image pixels
    bbox_size = bbox_w
    bbox_x = bbox_cx - bbox_w / 2.
    bbox_y = bbox_cy - bbox_h / 2.

    # rescale the focal length from the crop to the bbox size
    scale = bbox_size / CROP_SIZE

    cam_intrinsics = np.eye(3)
    cam_intrinsics[0, 0] = FOCAL_LENGTH * scale
    cam_intrinsics[1, 1] = FOCAL_LENGTH * scale
    cam_intrinsics[0, 2] = bbox_size / 2. + bbox_x
    cam_intrinsics[1, 2] = bbox_size / 2. + bbox_y

    # convert the weak-perspective camera (s, tx, ty) to a 3D translation
    cam_s, cam_tx, cam_ty = pred_cam
    trans = [cam_tx, cam_ty, 2*FOCAL_LENGTH/(CROP_SIZE*cam_s + 1e-9)]

    cam_extrinsics = np.eye(4)
    cam_extrinsics[:3, 3] = trans

    return cam_intrinsics, cam_extrinsics
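As a quick sanity check of the matrices this function returns, here is a minimal sketch that projects 3D points into the full image. Note that `project_points` is a hypothetical helper (not part of VIBE), the `pred_cam` and `bbox` values are made up for illustration, and the points are assumed to be in the same body-centred frame that `pred_cam` refers to. The function is repeated in compact form only so the snippet runs on its own.

```python
import numpy as np

def get_camera_parameters(pred_cam, bbox, focal_length=5000.0, crop_size=224):
    # Same computation as the function in this thread, with the two
    # constants exposed as keyword arguments (an illustrative change).
    bbox_cx, bbox_cy, bbox_w, bbox_h = bbox
    assert bbox_w == bbox_h
    scale = bbox_w / crop_size
    cam_intrinsics = np.eye(3)
    cam_intrinsics[0, 0] = cam_intrinsics[1, 1] = focal_length * scale
    # bbox_size / 2 + top-left corner is simply the bbox center
    cam_intrinsics[0, 2] = bbox_cx
    cam_intrinsics[1, 2] = bbox_cy
    cam_s, cam_tx, cam_ty = pred_cam
    cam_extrinsics = np.eye(4)
    cam_extrinsics[:3, 3] = [cam_tx, cam_ty,
                             2 * focal_length / (crop_size * cam_s + 1e-9)]
    return cam_intrinsics, cam_extrinsics

def project_points(points_3d, cam_intrinsics, cam_extrinsics):
    # Hypothetical helper: standard pinhole projection of (N, 3) points.
    points_3d = np.asarray(points_3d, dtype=float)
    ones = np.ones((points_3d.shape[0], 1))
    points_h = np.hstack([points_3d, ones])              # (N, 4) homogeneous
    points_cam = (cam_extrinsics @ points_h.T).T[:, :3]  # camera coordinates
    pixels_h = (cam_intrinsics @ points_cam.T).T         # (N, 3)
    return pixels_h[:, :2] / pixels_h[:, 2:3]            # divide by depth

# Made-up inputs: unit scale, no offset, 200 px square bbox centred at (320, 240).
pred_cam = (1.0, 0.0, 0.0)
bbox = (320.0, 240.0, 200.0, 200.0)
K, E = get_camera_parameters(pred_cam, bbox)

points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
pixels = project_points(points, K, E)
print(pixels)
```

With these inputs the origin should land on the bbox center and the point at x = 1 on the right edge of the bbox, which is what you would expect from a weak-perspective camera with scale 1.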
I hope this helps. Let me know if you still have any questions.
Thanks so much. This code works perfectly! Really appreciate this.
So how about ROMP? The ROMP output format doesn't have bboxes.
Hello, your code helps a lot. But I have a few questions.
Hi @Andyen512, have you worked out how to use ROMP to get cam_intrinsics and cam_extrinsics yet?
Hi everyone,
I'm wondering why FOCAL_LENGTH = 5000 and CROP_SIZE = 224 in this function. Are these two variables fixed for all in-the-wild videos? Besides, should all the frames of a video share the same camera intrinsics and extrinsics? Thanks.
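For what it's worth, here is a small sketch of what the function in this thread does when called once per frame. It does not explain where the two constants come from; it only shows that the resulting matrices depend on each frame's `bbox` and `pred_cam`, so they are not shared across frames. The arrays below are synthetic stand-ins, and the per-frame shapes `(n_frames, 3)` and `(n_frames, 4)` are an assumption about the VIBE output, so please check them against your own results.

```python
import numpy as np

def get_camera_parameters(pred_cam, bbox, focal_length=5000.0, crop_size=224):
    # Compact copy of the function from this thread so the snippet is
    # self-contained; the keyword arguments are an illustrative change.
    bbox_cx, bbox_cy, bbox_w, bbox_h = bbox
    assert bbox_w == bbox_h
    scale = bbox_w / crop_size
    cam_intrinsics = np.eye(3)
    cam_intrinsics[0, 0] = cam_intrinsics[1, 1] = focal_length * scale
    cam_intrinsics[0, 2] = bbox_cx   # bbox_size / 2 + bbox_x
    cam_intrinsics[1, 2] = bbox_cy   # bbox_size / 2 + bbox_y
    cam_s, cam_tx, cam_ty = pred_cam
    cam_extrinsics = np.eye(4)
    cam_extrinsics[:3, 3] = [cam_tx, cam_ty,
                             2 * focal_length / (crop_size * cam_s + 1e-9)]
    return cam_intrinsics, cam_extrinsics

# Synthetic two-frame example (made-up values, not real VIBE output).
pred_cams = np.array([[1.0, 0.0, 0.0],
                      [0.8, 0.1, -0.05]])        # assumed shape (n_frames, 3)
bboxes = np.array([[320.0, 240.0, 224.0, 224.0],
                   [330.0, 236.0, 448.0, 448.0]])  # assumed shape (n_frames, 4)

per_frame = [get_camera_parameters(cam, box)
             for cam, box in zip(pred_cams, bboxes)]

# Effective focal length in full-image pixels and camera depth, per frame.
focal_px = [K[0, 0] for K, _ in per_frame]
depth = [E[2, 3] for _, E in per_frame]
print(focal_px, depth)
```

In this sketch the effective focal length in pixels is FOCAL_LENGTH scaled by bbox_size / CROP_SIZE, so it doubles when the bbox doubles in size, and the depth in the extrinsics changes with cam_s.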
Hi, did you find a way to solve this?
So how about ROMP? The ROMP output format doesn't have bboxes.
Did you solve this problem? I would like to know how FOCAL_LENGTH should be set.
Hello, thanks for the great work. As I am new to the area, I have a (maybe) simple question: I am trying to get camera poses from VIBE. The output from their code seems to be a 4D array. As described in their code repo:
Can I ask how to get the intrinsic and extrinsic matrices?
Or is there an example of how to get these camera parameters in code?