
Motion Dataset: Discrepancy between Point Cloud and Agent Box #816

Open martinchen0776 opened 2 months ago

martinchen0776 commented 2 months ago

Hi!

I have successfully decoded 11 frames of point cloud data and transformed them into the global frame. However, when analyzing the trajectory data (which is also in the global frame), I noticed a mismatch between the agent boxes in the trajectory and the corresponding point cloud.

Steps to Reproduce:

The blue bounding boxes represent agents at time index 10, while the green ones correspond to agents at time index 6. Upon comparison, it is evident that the green boxes are well aligned with the point cloud, while the blue boxes are offset from it. Could there be a problem in my point cloud processing pipeline? Below is the code I am using:

```python
import os

import numpy as np
import tensorflow as tf

from waymo_open_dataset import dataset_pb2
from waymo_open_dataset.protos import scenario_pb2
from waymo_open_dataset.utils import womd_lidar_utils


def _get_laser_calib(frame_lasers, laser_name):
    for laser_calib in frame_lasers.laser_calibrations:
        if laser_calib.name == laser_name:
            return laser_calib
    return None


path = os.path.join(lidar_path, f'{scenario_id}.tfrecord')
dataset = tf.data.TFRecordDataset(path, compression_type='')
for cnt, data in enumerate(dataset):
    frame_points_xyz = {}  # map from frame index to point cloud (vehicle frame)
    frame_points_feature = {}
    frame_points_xyz_world = {}  # map from frame index to point cloud (global frame)
    frame_pose_list = []
    scenario = scenario_pb2.Scenario()
    scenario.ParseFromString(bytearray(data.numpy()))
    lidar_data = scenario.compressed_frame_laser_data
    for frame_i, frame_lasers in enumerate(lidar_data):
        points_xyz_list = []
        points_feature_list = []
        points_xyz_list_world = []
        # frame_pose is the vehicle-to-world transform at which the Cartesian
        # points are computed.
        frame_pose = np.reshape(np.array(frame_lasers.pose.transform), (4, 4))
        vehicle_to_world_rotation = frame_pose[0:3, 0:3]
        vehicle_to_world_translation = frame_pose[0:3, 3]

        for laser in frame_lasers.lasers:
            c = _get_laser_calib(frame_lasers, laser.name)
            if laser.name == dataset_pb2.LaserName.TOP:
                (points_xyz, points_feature,
                 points_xyz_return2,
                 points_feature_return2) = womd_lidar_utils.extract_top_lidar_points(
                     laser, frame_pose, c)
            else:
                (points_xyz, points_feature,
                 points_xyz_return2,
                 points_feature_return2) = womd_lidar_utils.extract_side_lidar_points(
                     laser, c)
            # Rotate and translate both returns from the vehicle frame into the
            # global (world) frame.
            points_xyz_world = tf.transpose(
                tf.matmul(vehicle_to_world_rotation, tf.transpose(points_xyz))
            ) + vehicle_to_world_translation
            points_xyz_return2_world = tf.transpose(
                tf.matmul(vehicle_to_world_rotation, tf.transpose(points_xyz_return2))
            ) + vehicle_to_world_translation

            points_xyz_list.append(points_xyz.numpy())
            points_xyz_list.append(points_xyz_return2.numpy())
            points_feature_list.append(points_feature.numpy())
            points_feature_list.append(points_feature_return2.numpy())
            points_xyz_list_world.append(points_xyz_world.numpy())
            points_xyz_list_world.append(points_xyz_return2_world.numpy())

        frame_points_xyz[frame_i] = np.concatenate(points_xyz_list, axis=0)
        frame_points_feature[frame_i] = np.concatenate(points_feature_list, axis=0)
        frame_points_xyz_world[frame_i] = np.concatenate(points_xyz_list_world, axis=0)
```
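
For reference, here is how I read the agent boxes I am comparing against (a minimal sketch; `scenario_with_tracks` is the Scenario parsed from the scenario folder, i.e. the one containing tracks, and `time_index` is the trajectory step being visualized, e.g. 6 or 10):

```python
def get_agent_boxes(scenario_with_tracks, time_index):
    """Collect valid agent boxes at a trajectory step, in the global frame."""
    boxes = []
    for track in scenario_with_tracks.tracks:
        state = track.states[time_index]
        if not state.valid:
            continue
        boxes.append((state.center_x, state.center_y, state.center_z,
                      state.length, state.width, state.height, state.heading))
    return np.array(boxes)
```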
nicomon24 commented 2 months ago

Hi @martinchen0776, could you please confirm the version of the dataset you're using? We introduced some changes to fix this type of issue in the latest version (1.2.1), so I just want to make sure this is still an issue.

martinchen0776 commented 2 months ago

I am using the latest version (1.2.1). The point cloud data is extracted from the lidar_and_camera folder, while the trajectory data is obtained from the scenario folder.
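
For completeness, this is roughly how I pair the two sources per scenario ID. A sketch, assuming the `augment_womd_scenario_with_lidar_points` helper in `womd_lidar_utils` (the merge utility shipped with 1.2.1, if I am reading the tutorial correctly); `scenario_path` and `lidar_path_for_scenario` are hypothetical stand-ins for the two per-scenario file paths:

```python
import tensorflow as tf

from waymo_open_dataset.protos import scenario_pb2
from waymo_open_dataset.utils import womd_lidar_utils


def _load_scenario(tfrecord_path):
    # Each record in these files is a serialized scenario_pb2.Scenario.
    record = next(iter(tf.data.TFRecordDataset(tfrecord_path, compression_type='')))
    scenario = scenario_pb2.Scenario()
    scenario.ParseFromString(record.numpy())
    return scenario


# Trajectories come from the scenario folder; the matching LiDAR proto (same
# scenario_id) comes from the lidar_and_camera folder.
womd_scenario = _load_scenario(scenario_path)
lidar_scenario = _load_scenario(lidar_path_for_scenario)
scenario_augmented = womd_lidar_utils.augment_womd_scenario_with_lidar_points(
    womd_scenario, lidar_scenario)
```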

martinchen0776 commented 2 months ago

> Hi @martinchen0776, could you please confirm the version of the dataset you're using? We introduced some changes to fix this type of issue in the latest version (1.2.1), so I just want to make sure this is still an issue.

Hi @nicomon24, I wanted to ask whether there has been any progress on this issue.

nicomon24 commented 2 months ago

Hi @martinchen0776, thanks for confirming the version; let me check internally and get back to you ASAP.

martinchen0776 commented 1 month ago

> Hi @martinchen0776, thanks for confirming the version; let me check internally and get back to you ASAP.

Hi, is there any update regarding this issue? Thank you!

nicomon24 commented 1 month ago

Hi @martinchen0776, sorry about the delay but the people responsible for the lidar part of the dataset have been out of office for the week. I'll let you know ASAP.

martinchen0776 commented 1 month ago

> Hi @martinchen0776, sorry about the delay but the people responsible for the lidar part of the dataset have been out of office for the week. I'll let you know ASAP.

Hi, I was wondering if there have been any recent developments or updates regarding this issue.

wind09 commented 1 month ago

Hi @martinchen0776, thanks for your interest in the WOMD challenge. The mismatch in box locations you identified is a known issue with the data, so your processing pipeline is likely correct in this regard.

There is inherent temporal misalignment between the LiDAR frames and original WOMD data. When preparing the LiDAR data, we minimize this skew such that the frequency of shifts greater than 5 frames is only 0.85% in the training split and 0.94% in the validation split. The test split has a similar frequency. It’s our expectation that this small degree of misalignment will not affect research in any substantial way.