The camera intrinsics are required for that step of the process, so you will need to make a rough guess of what the intrinsics could be. Otherwise you are stuck with the predicted coordinates in normalised space.
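For instance (a minimal sketch assuming a standard pinhole model; none of these names come from the repository), a common rough guess is a focal length of about one image width in pixels and a principal point at the image centre:

```python
import torch

def make_rough_intrinsics(height, width):
    # Heuristic guess for an unknown camera: focal length of roughly one
    # image width (in pixels), principal point at the image centre.
    f = float(width)
    return torch.tensor([
        [f,   0.0, width / 2.0],
        [0.0, f,   height / 2.0],
        [0.0, 0.0, 1.0],
    ], dtype=torch.float64)

def project(points_3d, K):
    """Project (N, 3) camera-space points to (N, 2) pixel coordinates."""
    uvw = points_3d @ K.t()
    return uvw[:, :2] / uvw[:, 2:3]
```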
```python
from scipy import optimize

def infer_depth(self, norm_skel, eval_scale, intrinsics, height, width, z_upper=20000):
    """Infer the depth of the root joint.

    Args:
        norm_skel (torch.DoubleTensor): The normalised skeleton.
        eval_scale (function): A function which evaluates the scale of a denormalised skeleton.
        intrinsics (CameraIntrinsics): The camera which projects 3D points onto the 2D image.
        height (float): The image height.
        width (float): The image width.
        z_upper (float): Upper bound for depth.

    Returns:
        float: `z_ref`, the depth of the root joint.
    """
    def f(z_ref):
        # Denormalise the skeleton assuming the root joint is at depth z_ref,
        # then measure how far its scale deviates from the expected value (k == 1).
        z_ref = float(z_ref)
        skel = self.denormalise_skeleton(norm_skel, z_ref, intrinsics, height, width)
        k = eval_scale(skel)
        return (k - 1.0) ** 2

    # Bounded 1D minimisation: find the depth at which the denormalised
    # skeleton has the expected scale. The focal length serves as a sensible
    # lower bound for the root depth.
    z_lower = max(intrinsics.alpha_x, intrinsics.alpha_y)
    z_ref = float(optimize.fminbound(f, z_lower, z_upper, maxfun=200, disp=0))
    return z_ref
```
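As a hedged usage sketch (`pose_model`, `norm_skel`, and `eval_scale` are placeholders for whatever your pipeline produces, and the 256x256 image size is just an example):

```python
# Hypothetical usage: recover the root depth, then denormalise the whole
# skeleton into camera space at that depth.
z_ref = pose_model.infer_depth(norm_skel, eval_scale, intrinsics, height=256, width=256)
skel_camera = pose_model.denormalise_skeleton(norm_skel, z_ref, intrinsics, 256, 256)
```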
In other words, I can assume suitable camera intrinsics, combine them with infer_depth, and then convert the prediction to camera space, right?
Yep. You can use one of these functions to get the `eval_scale` function parameter:
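For illustration only (this is a hypothetical sketch, not one of the repository's actual functions; the joint indices and expected height are made up), an `eval_scale` callback could compare a denormalised skeleton against a known anatomical measurement, returning 1.0 when the scale is correct:

```python
import torch

HEAD, ANKLE = 0, 15          # hypothetical joint indices; adjust for your skeleton layout
EXPECTED_HEIGHT_MM = 1700.0  # rough guess at the subject's height in mm

def eval_scale(skel):
    """Return the ratio of measured to expected skeleton height.

    infer_depth minimises (k - 1.0) ** 2, so the optimum is reached when
    the denormalised skeleton has the expected height.
    """
    measured = torch.norm(skel[HEAD] - skel[ANKLE])
    return float(measured / EXPECTED_HEIGHT_MM)
```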
Got it, thanks very much!
I tried it, and it works well.
Is the depth predicted here in infer_single.py (the step in utils; is it per joint?) in 2D space, rather than in 3D camera space?
I believe that the breakpoint you are looking at in your debugger corresponds to this line in infer_single.py:
At this point the joints (`norm_skel3d`) exist in normalised 3D space. Further up in this issue you can see the discussion around denormalising (recovering metric units).
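Roughly, denormalisation maps the normalised (x, y) coordinates back to pixels and back-projects them through the intrinsics at the recovered depth. The sketch below only illustrates that idea under assumed normalisation conventions (coordinates in [-1, 1], depth stored as an offset relative to the root); it is not the repository's actual denormalise code:

```python
import torch

def denormalise_sketch(norm_skel, z_ref, K, height, width):
    # ASSUMPTION: norm_skel is (J, 3) with x, y in [-1, 1] and z a relative
    # depth offset; the actual convention in the repository may differ.
    z = z_ref + norm_skel[:, 2] * z_ref              # assumed offset scaling
    u = (norm_skel[:, 0] + 1) * 0.5 * width          # normalised -> pixel x
    v = (norm_skel[:, 1] + 1) * 0.5 * height         # normalised -> pixel y
    x = (u - K[0, 2]) * z / K[0, 0]                  # back-project through
    y = (v - K[1, 2]) * z / K[1, 1]                  # the pinhole model
    return torch.stack([x, y, z], dim=1)             # camera-space (J, 3)
```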
Yes... the key is how to denormalise it. Should I use PoseDataset.denormalise in `data/__init__.py`? But I don't have the value of untransform in eval_scale...
Now, I have a video with no camera intrinsics, so how can I convert the prediction to camera coordinates in the inference phase?