waymo-research / waymo-open-dataset

Waymo Open Dataset
https://www.waymo.com/open

Issue with convert_range_image_to_point_cloud #863

Open mgmike opened 1 month ago

mgmike commented 1 month ago

Hi, I am trying to get the points from lidar data and am having issues with v2.perception.utils.lidar_utils.convert_range_image_to_point_cloud.

I am using Python 3.10 and TensorFlow 2.12 in a Docker container on an Ubuntu 20.04 host. TensorFlow can see the host GPU, which I verified by running an operation on two tensors on the GPU.
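For what it's worth, the GPU check I ran was along these lines (a minimal sketch, not the exact cell):

import tensorflow as tf

# Sanity check that TensorFlow sees the GPU: list the visible devices, then
# pin one matmul to the first GPU and print where it actually ran.
print(tf.config.list_physical_devices('GPU'))
with tf.device('/GPU:0'):
  a = tf.random.uniform((1024, 1024))
  b = tf.random.uniform((1024, 1024))
  print(tf.matmul(a, b).device)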

It looks like even after converting lidar_calibration from a DataFrame to a LiDARCalibrationComponent, the extrinsic transform is still a Dask Series. The error below says that it failed to convert a NumPy array to a Tensor. Also, as shown below, the type of lidar_calibration.extrinsic is a Transform, which looks correct according to context.py, but lidar_calibration.extrinsic.transform is a Series. I don't have any experience with Dask DataFrames, but could this be the issue: TensorFlow expects a NumPy array but is getting a Dask Series?
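To illustrate what I mean, here is a toy sketch (my own minimal reproduction, not code from the repo): an object-dtype Dask Series holding per-row arrays can't be handed to tf.convert_to_tensor directly, but the materialized inner array can.

import dask.dataframe as dd
import numpy as np
import pandas as pd
import tensorflow as tf

# Toy object-dtype column holding one 16-element array per row, mimicking
# the [LiDARCalibrationComponent].extrinsic.transform column shown below.
pds = pd.Series([np.arange(16.0)], dtype=object)
dds = dd.from_pandas(pds, npartitions=1)

# tf.convert_to_tensor(dds) fails: the lazy Series is not array-like.
# Materializing and unwrapping the inner array works, reshaped the same way
# lidar_utils.py reshapes the 4x4 extrinsic matrix.
inner = dds.compute().iloc[0]
extrinsic = tf.reshape(tf.convert_to_tensor(inner), (4, 4))
print(extrinsic.shape)  # (4, 4)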

My code to get the LiDARCalibrationComponent is as follows:

# Lazily read DataFrames for all components.
association_df = read('camera_to_lidar_box_association')
cam_box_df = read('camera_box')
cam_img_df = read('camera_image')
lidar_box_df = read('lidar_box')
lidar_df = read('lidar')
lidar_calibration_df = read('lidar_calibration')

association_df = association_df[association_df['key.camera_name'] == 1]
cam_box_df = cam_box_df[cam_box_df['key.camera_name'] == 1]
cam_img_df = cam_img_df[cam_img_df['key.camera_name'] == 1]
lidar_df = lidar_df[lidar_df['key.laser_name'] == 1]
lidar_calibration_df = lidar_calibration_df[lidar_calibration_df['key.laser_name'] == 1]

# Join all DataFrames using matching columns
cam_image_w_box_df = v2.merge(cam_box_df, cam_img_df)
cam_obj_df = v2.merge(association_df, cam_image_w_box_df)
# In this example camera box labels are optional, so we set left_nullable=True.
obj_df = v2.merge(cam_obj_df, lidar_box_df, left_nullable=True)
# Group lidar sensors (left), group labels and camera images (right) and join.
df = v2.merge(lidar_df, obj_df, left_group=True, right_group=True)

# Read a single row, which contains all the data for a single frame.
_, row = next(iter(df.iterrows()))
# Create all component objects
camera_image = v2.CameraImageComponent.from_dict(row)
lidar = v2.LiDARComponent.from_dict(row)
camera_box = v2.CameraBoxComponent.from_dict(row)
lidar_box = v2.LiDARBoxComponent.from_dict(row)
# Note: unlike the components above, this one is built from the (lazy) Dask
# DataFrame rather than from a computed row.
lidar_calibration = v2.LiDARCalibrationComponent.from_dict(lidar_calibration_df)

print(
    f'Found {len(lidar_box.key.laser_object_id)} objects on'
    f' {lidar.key.segment_context_name=} {lidar.key.frame_timestamp_micros=}'
)
for laser_object_id, camera_object_id, camera_name in zip(
    lidar_box.key.laser_object_id,
    camera_box.key.camera_object_id,
    camera_image.key.camera_name,
):
  print(f'\t{laser_object_id=} {camera_object_id=} {camera_name=}')
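(Here read is the lazy-loading helper from the v2 tutorial notebook; roughly the following, where dataset_dir and context_name are placeholders for my local paths:)

import dask.dataframe as dd

# Paraphrased tutorial helper: lazily load one component's parquet file for
# a given segment into a Dask DataFrame.
def read(tag: str) -> dd.DataFrame:
  return dd.read_parquet(f'{dataset_dir}/{tag}/{context_name}.parquet')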

The issue appears in another cell:

# Get the range image of the top lidar
range_image = lidar.range_image_return1

# Note: this binds the RangeImage class itself (not an instance) and then
# sets attributes on the class; the printed type below confirms it.
range_image = v2.perception.lidar.RangeImage
range_image.values = lidar.range_image_return1.values
range_image.shape = lidar.range_image_return1.shape[0]
print(range_image)
print(type(range_image.values))
print(range_image.shape)

print(lidar_calibration.extrinsic.transform[0])
print(type(lidar_calibration.extrinsic))
print(type(lidar_calibration.extrinsic.transform))
print(type(lidar_calibration_df))
print(type(lidar_calibration))
# extrinsic = tf.convert_to_tensor(lidar_calibration.extrinsic.transform)

# TODO: Eventually should add pixel_pose and frame_pose when multiple cameras are used
points = v2.perception.utils.lidar_utils.convert_range_image_to_point_cloud(range_image, lidar_calibration, keep_polar_features=True)

And the output is:

<class 'waymo_open_dataset.v2.perception.lidar.RangeImage'>
<class 'list'>
[  64 2650    4]
Dask Series Structure:
npartitions=1
    object
       ...
Name: [LiDARCalibrationComponent].extrinsic.transform, dtype: object
Dask Name: try_loc, 6 graph layers
<class 'waymo_open_dataset.v2.column_types.Transform'>
<class 'dask.dataframe.core.Series'>
<class 'dask.dataframe.core.DataFrame'>
<class 'waymo_open_dataset.v2.perception.context.LiDARCalibrationComponent'>
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[13], line 19
     15 print(type(lidar_calibration))
     16 # extrinsic = tf.convert_to_tensor(lidar_calibration.extrinsic.transform)
     17 
     18 # TODO: Eventually should add pixel_pose and frame_pose when multiple cameras are used
---> 19 points = v2.perception.utils.lidar_utils.convert_range_image_to_point_cloud(range_image, lidar_calibration, keep_polar_features=True)

File ~/.local/lib/python3.10/site-packages/waymo_open_dataset/v2/perception/utils/lidar_utils.py:165, in convert_range_image_to_point_cloud(range_image, calibration, pixel_pose, frame_pose, keep_polar_features)
    142 def convert_range_image_to_point_cloud(
    143     range_image: _v2_lidar.RangeImage,
    144     calibration: _v2_context.LiDARCalibrationComponent,
   (...)
    147     keep_polar_features=False,
    148 ) -> tf.Tensor:
    149   """Converts one range image from polar coordinates to point cloud.
    150 
    151   Args:
   (...)
    163       elongation, x, y, z).
    164   """
--> 165   range_image_cartesian = convert_range_image_to_cartesian(
    166       range_image=range_image,
    167       calibration=calibration,
    168       pixel_pose=pixel_pose,
    169       frame_pose=frame_pose,
    170       keep_polar_features=keep_polar_features,
    171   )
    172   range_image_tensor = range_image.tensor
    173   range_image_mask = range_image_tensor[..., 0] > 0

File ~/.local/lib/python3.10/site-packages/waymo_open_dataset/v2/perception/utils/lidar_utils.py:87, in convert_range_image_to_cartesian(range_image, calibration, pixel_pose, frame_pose, keep_polar_features)
     84   assert frame_pose is not None
     85 range_image_tensor = range_image.tensor
     86 extrinsic = tf.reshape(
---> 87     tf.convert_to_tensor(calibration.extrinsic.transform),
     88     (4, 4)
     89 )
     91 # Compute inclinations mapping range image rows to circles in the 3D worlds.
     92 if calibration.beam_inclination.values is not None:

File ~/.local/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    151 except Exception as e:
    152   filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153   raise e.with_traceback(filtered_tb) from None
    154 finally:
    155   del filtered_tb

File ~/.local/lib/python3.10/site-packages/tensorflow/python/framework/constant_op.py:103, in convert_to_eager_tensor(value, ctx, dtype)
    101     dtype = dtypes.as_dtype(dtype).as_datatype_enum
    102 ctx.ensure_initialized()
--> 103 return ops.EagerTensor(value, ctx.device_name, dtype)

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
mgmike commented 1 month ago

I found a workaround. I added a .compute() to the lidar_calibration_df filter,

lidar_calibration_df = lidar_calibration_df[lidar_calibration_df['key.laser_name'] == 1].compute()

and rebuilt the LiDARCalibrationComponent, adding .tolist() to the extrinsic transform and beam inclination values:

# Build fresh component pieces from the computed (pandas) values instead of
# the lazy Dask columns.
temp_tfm = v2.column_types.Transform(
    transform=lidar_calibration.extrinsic.transform.tolist()[0]
)
temp_bic = v2.perception.context.BeamInclination(
    min=lidar_calibration.beam_inclination.min,
    max=lidar_calibration.beam_inclination.max,
    values=lidar_calibration.beam_inclination.values.tolist()[0],
)

lc2 = v2.perception.context.LiDARCalibrationComponent(
    lidar_calibration.key, temp_tfm, temp_bic
)

This doesn't seem like the correct solution though. I must be missing something; there's gotta be a better way to do this, right?
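For comparison, the pattern I would have expected to work, mirroring how the tutorial builds the other components from a single row (an untested sketch):

# Untested sketch: pull one calibration row and build the component from the
# row dict, the same way the other components are built above, so each field
# is a concrete value instead of a lazy Dask Series.
_, cal_row = next(iter(lidar_calibration_df.iterrows()))
lidar_calibration = v2.LiDARCalibrationComponent.from_dict(cal_row)

points = v2.perception.utils.lidar_utils.convert_range_image_to_point_cloud(
    lidar.range_image_return1, lidar_calibration, keep_polar_features=True
)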