graspnet / graspnet-baseline

Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)
https://graspnet.net/

What is the unit of the point coordinates? #110

Open LeVHoangduc opened 4 months ago

LeVHoangduc commented 4 months ago

Hi! I have some questions about how you create the point cloud.

In your code, the point cloud is created by the create_point_cloud_from_depth_image() function, shown below:

def create_point_cloud_from_depth_image(depth, camera, organized=True):
    """ Generate point cloud using depth image only.

        Input:
            depth: [numpy.ndarray, (H,W), numpy.float32]
                depth image
            camera: [CameraInfo]
                camera intrinsics
            organized: bool
                whether to keep the cloud in image shape (H,W,3)

        Output:
            cloud: [numpy.ndarray, (H,W,3)/(H*W,3), numpy.float32]
                generated cloud, (H,W,3) for organized=True, (H*W,3) for organized=False
    """
    assert (depth.shape[0] == camera.height and depth.shape[1] == camera.width)
    xmap = np.arange(camera.width)
    ymap = np.arange(camera.height)
    xmap, ymap = np.meshgrid(xmap, ymap)
    points_z = depth / camera.scale
    points_x = (xmap - camera.cx) * points_z / camera.fx
    points_y = (ymap - camera.cy) * points_z / camera.fy
    cloud = np.stack([points_x, points_y, points_z], axis=-1)
    if not organized:
        cloud = cloud.reshape([-1, 3])

    return cloud
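
A small self-contained check of what this arithmetic does on a toy input may help frame the questions. This is only a sketch: `ToyCamera` is a hypothetical stand-in for the repository's `CameraInfo` (same field names), the intrinsics are made up, and the `scale` of 1000 is an assumption (raw depth in millimeters, output in meters). It back-projects a constant depth image and then re-projects the points with the usual pinhole formula to see whether the original pixel indices come back.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ToyCamera:
    """Hypothetical stand-in for CameraInfo; all values are illustrative."""
    width: int
    height: int
    fx: float
    fy: float
    cx: float
    cy: float
    scale: float  # raw depth units per output unit (assumed 1000: mm -> m)


def back_project(depth, camera):
    """Same arithmetic as create_point_cloud_from_depth_image (organized output)."""
    # u holds column indices, v holds row indices, both of shape (H, W).
    u, v = np.meshgrid(np.arange(camera.width), np.arange(camera.height))
    z = depth / camera.scale
    x = (u - camera.cx) * z / camera.fx
    y = (v - camera.cy) * z / camera.fy
    return np.stack([x, y, z], axis=-1)


def project(cloud, camera):
    """Pinhole projection of 3D points back to pixel indices (u, v)."""
    x, y, z = cloud[..., 0], cloud[..., 1], cloud[..., 2]
    u = camera.fx * x / z + camera.cx
    v = camera.fy * y / z + camera.cy
    return u, v


camera = ToyCamera(width=4, height=3, fx=2.0, fy=2.0, cx=1.5, cy=1.0, scale=1000.0)
depth = np.full((3, 4), 500.0, dtype=np.float32)  # 500 raw units at every pixel

cloud = back_project(depth, camera)
u, v = project(cloud, camera)

print(cloud.shape)   # organized cloud, one 3D point per pixel
print(cloud[0, 0])   # point for the top-left pixel
print(u[0], v[:, 0])  # re-projected column and row indices
```

If the re-projected `u` and `v` match the column and row indices, then `xmap` and `ymap` are playing the role of the pixel coordinates, and the output unit is simply the raw depth unit divided by `camera.scale`.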

So, 1) my first question is about how the point's x, y, z values are calculated. What do these coordinates stand for: camera coordinates or real-world coordinates? This is what I know about the camera: [image]

And this is the equation that transforms Xc to the u, v coordinates (pixel coordinates): [image]

But in the code you provided, xmap and ymap are index ranges over the image width and height (they do not look like the pixel values in the equation above), so I am confused about which coordinate system the resulting x, y, z belong to.

2) My second question: why do you use a depth image to create the point cloud, and not the RGB image?
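
One possible reading, assuming the standard distortion-free pinhole model (the symbols below are illustrative notation, not taken from the repository: $d(u,v)$ is the raw depth value at pixel $(u,v)$ and $s$ corresponds to `camera.scale`), is that the function simply inverts the projection equation, with `xmap` and `ymap` acting as $u$ and $v$:

```latex
u = f_x \frac{X_c}{Z_c} + c_x, \qquad v = f_y \frac{Y_c}{Z_c} + c_y
\quad\Longrightarrow\quad
X_c = \frac{(u - c_x)\, Z_c}{f_x}, \qquad
Y_c = \frac{(v - c_y)\, Z_c}{f_y}, \qquad
Z_c = \frac{d(u,v)}{s}
```

If that reading is right, the resulting points would be expressed in the camera frame rather than a world frame, and their unit would be the raw depth unit divided by $s$ (for example meters, if the depth is stored in millimeters and $s = 1000$, which is an assumption here).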

I look forward to your answers; they would help me understand your code. Thanks a lot!