kazuto1011 / dusty-gan-v2

Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data (WACV 2023)
https://kazuto1011.github.io/dusty-gan-v2
MIT License

Code about load_pts_as_img function and depth_to_point_map function #2

Closed Huang-yihao closed 11 months ago

Huang-yihao commented 11 months ago

I am trying to apply these functions to the nuScenes dataset.

  1. Can I apply the load_pts_as_img function directly to the nuScenes point cloud files, and do I need to change the parameters H and W?

  2. The depth_to_point_map function takes a parameter angle. What does it mean, and which nuScenes setting should I use for it? Thank you!

kazuto1011 commented 11 months ago

1. load_pts_as_img for nuScenes

The nuScenes data contains serialized (x, y, z, intensity, ring id) points, while the KITTI data contains (x, y, z, intensity) points which can be loaded as follows:

https://github.com/kazuto1011/dusty-gan-v2/blob/ffb1655a304052267214db36273b5a3fd3c9e92c/gans/datasets/kitti.py#L319

For nuScenes, you may want to modify that line as follows:

points = np.fromfile(point_path, dtype=np.float32).reshape((-1, 5))
points = points[:, :4]

If you want to use a scan-unfolding projection, grid_h can also be created directly from the ring id information.
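As a rough sketch of that last remark, the ring id column could be turned into grid_h as shown here. The helper name grid_h_from_ring and the ring0_is_bottom flag are hypothetical, and the assumption that ring 0 is the lowest beam should be checked against the data. The helper needs the five-column array, so it has to run before the points[:, :4] slice drops the ring id.

```python
import numpy as np


def grid_h_from_ring(points_xyzir, H=32, ring0_is_bottom=True):
    """Map the ring id column (index 4) to a vertical grid index of shape (N, 1).

    Assumption: ring ids are integers in [0, H - 1]. If ring 0 is the lowest
    beam, the index is flipped so that row 0 is the top of the image.
    """
    ring = points_xyzir[:, [4]].astype(np.int32)  # keep the (N, 1) shape
    ring = ring.clip(0, H - 1)
    return (H - 1 - ring) if ring0_is_bottom else ring


# Tiny synthetic example: three points on rings 0, 15, and 31.
pts = np.zeros((3, 5), dtype=np.float32)
pts[:, 4] = [0, 15, 31]
grid_h = grid_h_from_ring(pts, H=32)
```

Because the row index comes from the sensor's own ring id, this variant does not depend on the order in which the points are stored.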

2. The angle in depth_to_point_map

The angle is a (B=1, C=2, H, W)-shaped tensor that contains the per-pixel elevation and azimuth angles of the lasers, in radians. For the spherical projection of KITTI, you can generate it as follows:

import torch

# Velodyne HDL-64E
H, W = 64, 2048
h_up, h_down = 3, -25
w_left, w_right = 180, -180
device = "cpu"

elevation = 1 - torch.arange(H, device=device) / H  # (0, 1]
elevation = elevation * (h_up - h_down) + h_down  # (-25, 3]
azimuth = 1 - torch.arange(W, device=device) / W  # (0, 1]
azimuth = azimuth * (w_left - w_right) + w_right  # (-180, 180]
[elevation, azimuth] = torch.meshgrid([elevation, azimuth], indexing="ij")
angles = torch.stack([elevation, azimuth])[None].deg2rad()

Please modify the above block accordingly for the Velodyne HDL-32E used in nuScenes (I'm not sure about its exact parameters).
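A minimal sketch of such an adaptation, written with NumPy and with the field of view exposed as parameters. The function name make_angles is hypothetical. The values passed for the HDL-32E (32 rows, a nominal vertical field of view of about +10.67 to -30.67 degrees) are assumptions taken from the sensor's nominal specification rather than from this thread, and W=1024 is an arbitrary choice.

```python
import numpy as np


def make_angles(H, W, h_up_deg, h_down_deg):
    """Per-pixel (elevation, azimuth) angles in radians, shaped (1, 2, H, W).

    Row 0 corresponds to the top beam and column 0 to an azimuth of +180 deg,
    mirroring the layout of the KITTI block above.
    """
    elevation = 1 - np.arange(H) / H
    elevation = elevation * (h_up_deg - h_down_deg) + h_down_deg
    azimuth = 1 - np.arange(W) / W
    azimuth = azimuth * 360.0 - 180.0
    elevation, azimuth = np.meshgrid(elevation, azimuth, indexing="ij")
    angles = np.stack([elevation, azimuth])[None]
    return np.deg2rad(angles).astype(np.float32)


# Assumed HDL-32E settings; verify them against the sensor documentation.
angles = make_angles(H=32, W=1024, h_up_deg=10.67, h_down_deg=-30.67)
```

The result can be converted with torch.from_numpy(angles) if a tensor is required.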

Huang-yihao commented 11 months ago

Thank you for the response, but I have another question. I revised load_pts_as_img as follows, based on your suggestion, and set max_depth to 250 to match the nuScenes LiDAR.

def load_pts_as_img(point_path, scan_unfolding=True, H=64, W=2048):
        # load xyz & intensity and add depth & mask
        min_depth,max_depth = 1.45, 250.0
        points = np.fromfile(point_path, dtype=np.float32).reshape((-1, 5))
        points = points[:,:4]
        xyz = points[:, :3]  # xyz
        x = xyz[:, [0]]
        y = xyz[:, [1]]
        z = xyz[:, [2]]
        depth = np.linalg.norm(xyz, ord=2, axis=1, keepdims=True)
        # mask = (depth > 0).astype(np.float32)
        mask = (depth >= min_depth) & (depth <= max_depth)
        points = np.concatenate([points, depth, mask], axis=1)

        if scan_unfolding:
            # the i-th quadrant
            # suppose the points are ordered counterclockwise
            quads = np.zeros_like(x, dtype=np.int32)
            quads[(x >= 0) & (y >= 0)] = 0  # 1st
            quads[(x < 0) & (y >= 0)] = 1  # 2nd
            quads[(x < 0) & (y < 0)] = 2  # 3rd
            quads[(x >= 0) & (y < 0)] = 3  # 4th

            # split between the 3rd and 1st quadrants
            diff = np.roll(quads, shift=1, axis=0) - quads
            delim_inds, _ = np.where(diff == 3)  # number of lines
            inds = list(delim_inds) + [len(points)]  # add the last index

            # vertical grid
            grid_h = np.zeros_like(x, dtype=np.int32)
            cur_ring_idx = H - 1  # ...0
            for i in reversed(range(len(delim_inds))):
                grid_h[inds[i] : inds[i + 1]] = cur_ring_idx
                if cur_ring_idx >= 0:
                    cur_ring_idx -= 1
                else:
                    break
        else:
            fup, fdown = np.deg2rad(3), np.deg2rad(-25)
            pitch = np.arcsin(z / depth) + abs(fdown)
            grid_h = 1 - pitch / (fup - fdown)
            grid_h = np.floor(grid_h * H).clip(0, H - 1).astype(np.int32)

        # horizontal grid
        yaw = -np.arctan2(y, x)  # [-pi,pi]
        grid_w = (yaw / np.pi + 1) / 2 % 1  # [0,1]
        grid_w = np.floor(grid_w * W).clip(0, W - 1).astype(np.int32)

        grid = np.concatenate((grid_h, grid_w), axis=1)

        # projection
        order = np.argsort(-depth.squeeze(1))
        proj_points = np.zeros((H, W, 4 + 2), dtype=points.dtype)
        proj_points = scatter(proj_points, grid[order], points[order])

        return proj_points

Then I use the following code to calculate the range map.

import os.path as osp

import matplotlib.pyplot as plt
from nuscenes.nuscenes import NuScenes

# Load up the nuScenes mini split.
nusc = NuScenes(version='v1.0-mini', dataroot='/mnt/share/yihao/All_dataset/test', verbose=True)
sd_record = nusc.get('sample', 'ca9a282c9e77460f8360f564131a8af5')
ref_sd_token = sd_record['data']['LIDAR_TOP']
ref_sd_record = nusc.get('sample_data', ref_sd_token)
nbr_rows = 1024
nbr_cols = 2048
dmin, dmax = 1.45, 250.0

# Load pointcloud.
pcl_path = osp.join(nusc.dataroot, ref_sd_record['filename'])
range_image = load_pts_as_img(pcl_path, scan_unfolding=True, H=1024, W=2048)
fig = plt.figure(constrained_layout=True)
plt.imshow(range_image[:,:,:3], cmap="turbo", vmin=-1, vmax=1, interpolation="none")
plt.axis("off")
plt.show()

But the map I obtain looks like this: (image)

Could you please help me figure out what is wrong? I am very confused. Thank you very much!
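Two details of the snippets above may be worth checking, independently of the projection itself. First, load_pts_as_img returns six channels in the order x, y, z, intensity, depth, mask, so range_image[:,:,:3] selects the xyz coordinates rather than the depth. Second, matplotlib's imshow treats an (H, W, 3) array as RGB and ignores cmap, vmin, and vmax for it. A minimal sketch of extracting a displayable depth image follows. The helper name depth_for_display is hypothetical, and this is not presented as the fix for the issue.

```python
import numpy as np


def depth_for_display(proj_points, max_depth):
    """Return an (H, W) depth image scaled to [0, 1]; invalid pixels are NaN.

    Assumes the channel order x, y, z, intensity, depth, mask that
    load_pts_as_img produces.
    """
    depth = proj_points[..., 4]
    mask = proj_points[..., 5] > 0
    return np.where(mask, depth / max_depth, np.nan)


# Synthetic 2x3 image with a single valid pixel at half the maximum depth.
proj = np.zeros((2, 3, 6), dtype=np.float32)
proj[0, 0, 4] = 125.0
proj[0, 0, 5] = 1.0
image = depth_for_display(proj, max_depth=250.0)

# Possible usage, assuming matplotlib is imported as plt:
# plt.imshow(image, cmap="turbo", vmin=0, vmax=1, interpolation="none")
```

It may also be worth checking that H matches the number of rings of the sensor, which is 32 for the HDL-32E, rather than 1024.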