
How to properly correct for Azimuth? #307

Open atyshka opened 3 years ago

atyshka commented 3 years ago

My use case: I'm using a range-view based detector that requires the middle of the image to be the forward direction of the vehicle. I noticed in the range image code that a correction based on the lidar extrinsic calibration is applied to the azimuth of the range image. I'm using the following code to compute and plot the range image:

import math

import tensorflow as tf
from waymo_open_dataset import dataset_pb2 as open_dataset
from waymo_open_dataset.utils import box_utils, frame_utils, range_image_utils

file_ds = tf.data.Dataset.list_files('gs://waymo_open_dataset_v_1_2_0_individual_files/training/segment-10082223140073588526_6140_000_6160_000_with_camera_labels.tfrecord')
record_ds = tf.data.TFRecordDataset(file_ds, num_parallel_reads=tf.data.AUTOTUNE)

def parse_frame(data):
    frame = open_dataset.Frame()
    frame.ParseFromString(data.numpy())
    # First return of the TOP lidar's range image.
    top_image = frame_utils.parse_range_image_and_camera_projection(frame)[0][open_dataset.LaserName.TOP][0]
    range_image_tensor = tf.reshape(tf.convert_to_tensor(top_image.data), top_image.shape.dims)
    # Sort calibrations by laser name so the TOP lidar can be indexed directly.
    frame.context.laser_calibrations.sort(key=lambda laser: laser.name)
    c = frame.context.laser_calibrations[open_dataset.LaserName.TOP - 1]
    extrinsic = tf.reshape(tf.convert_to_tensor(c.extrinsic.transform), [4, 4])
    beam_inclinations = tf.convert_to_tensor(c.beam_inclinations)
    # Collect vehicle boxes as (center_x, center_y, length, width, heading).
    vehicle_labels = []
    for label in frame.laser_labels:
        if label.type == label.TYPE_VEHICLE:
            vehicle_labels.append([label.box.center_x, label.box.center_y, label.box.length, label.box.width, label.box.heading])
    num_labels = len(vehicle_labels)
    vehicle_labels = tf.convert_to_tensor(vehicle_labels)
    return range_image_tensor, extrinsic, beam_inclinations, vehicle_labels, [num_labels, 5]

def shape(r, e, i, labels, num_labels):
    # Restore the static shapes lost through tf.py_function; trim the range
    # image width from 2650 to 2648 columns.
    return tf.reshape(r, [64, 2650, 4])[:, 0:2648, :], tf.reshape(e, [4, 4]), tf.reshape(i, [64]), tf.reshape(labels, [-1, 5])

def transform(r, e, i, labels):
    # Polar range image: channels are (azimuth, inclination, range).
    polar_image = range_image_utils.compute_range_image_polar(tf.expand_dims(r[..., 0], 0), tf.expand_dims(e, 0), tf.expand_dims(i, 0))
    cloud = range_image_utils.compute_range_image_cartesian(polar_image, tf.expand_dims(e, 0))
    # Flatten (x, y) to [num_points, 2] for the 2D box test.
    flattened_cloud = tf.reshape(cloud[..., 0:2], [cloud.get_shape().num_elements() // 3, 2])
    bool_match = box_utils.is_within_box_2d(flattened_cloud, labels)
    # Prepend a background column so argmax yields 0 for points outside every box.
    bool_match = tf.pad(bool_match, [[0, 0], [1, 0]], constant_values=False)
    indices = tf.reshape(tf.argmax(bool_match, axis=-1), r.get_shape()[:-1])

    # Wrap the azimuth into [-pi, pi].
    azimuth = polar_image[0, ..., 0]
    azimuth = tf.math.atan2(tf.math.sin(azimuth), tf.math.cos(azimuth))
    height = tf.math.sin(polar_image[0, ..., 1]) * tf.math.maximum(polar_image[0, ..., 2], 0)
    range_img = polar_image[0, ..., 2]  # renamed from `range` to avoid shadowing the builtin
    intensity = r[..., 1]
    mask = tf.greater_equal(r[..., 0], 0)  # range < 0 marks missing returns
    mask = tf.where(mask, 1.0, -1.0)
    # Prepend a dummy label row so that index 0 means "no box".
    return {'input_laser': tf.stack([azimuth, height, range_img, intensity, mask], -1), 'input_xyz': tf.squeeze(cloud[..., 0:], axis=0)}, indices, tf.concat([[[0, 0, 0, 0, 0]], labels], 0)

tensor_ds = record_ds.map(lambda x: tf.py_function(func=parse_frame, inp=[x], Tout=(tf.float32, tf.float32, tf.float32, tf.float32, tf.int32)), num_parallel_calls=tf.data.AUTOTUNE)
tensor_ds = tensor_ds.map(shape)
tensor_ds = tensor_ds.map(transform)

import matplotlib.pyplot as plt

data = next(iter(tensor_ds))
azimuth = data[0]['input_laser'][..., 0].numpy()
intensity = data[0]['input_laser'][..., 3].numpy()
plt.figure(figsize=(64, 20))
plt.imshow(azimuth)
plt.grid(False)
plt.axis('off')
plt.figure(figsize=(64, 20))
plt.imshow(intensity, vmin=0, vmax=1)
plt.grid(False)
plt.axis('off')

Here is the azimuth image: [image]

Here is the intensity image: [image]

Note the discontinuity in the azimuth image: it does not span -pi to +pi because the extrinsic azimuth correction has been applied.

So, I presumed, the image must be corrected so that the azimuth goes from -pi to +pi for everything to be properly centered. I modified the transform() function like so:

def transform2(r, e, i, labels):
    polar_image = range_image_utils.compute_range_image_polar(tf.expand_dims(r[..., 0], 0), tf.expand_dims(e, 0), tf.expand_dims(i, 0))
    cloud = range_image_utils.compute_range_image_cartesian(polar_image, tf.expand_dims(e, 0))
    flattened_cloud = tf.reshape(cloud[..., 0:2], [cloud.get_shape().num_elements() // 3, 2])
    bool_match = box_utils.is_within_box_2d(flattened_cloud, labels)
    bool_match = tf.pad(bool_match, [[0, 0], [1, 0]], constant_values=False)
    indices = tf.reshape(tf.argmax(bool_match, axis=-1), r.get_shape()[:-1])

    # Wrap the azimuth into [-pi, pi].
    azimuth = polar_image[0, ..., 0]
    azimuth = tf.math.atan2(tf.math.sin(azimuth), tf.math.cos(azimuth))
    height = tf.math.sin(polar_image[0, ..., 1]) * tf.math.maximum(polar_image[0, ..., 2], 0)
    range_img = polar_image[0, ..., 2]
    intensity = r[..., 1]
    mask = tf.greater_equal(r[..., 0], 0)
    mask = tf.where(mask, 1.0, 0.0)
    input_laser = tf.stack([azimuth, height, range_img, intensity, mask], -1)
    input_xy = tf.squeeze(cloud[..., 0:2], axis=0)
    # Sensor yaw from the extrinsic, converted to a column shift: yaw / (2*pi) of the width.
    correction = tf.atan2(e[..., 1, 0], e[..., 0, 0])
    shift = tf.cast((correction / math.pi) * 2650 / 2, tf.int32)
    input_laser = tf.roll(input_laser, shift, 1)
    input_xy = tf.roll(input_xy, shift, 1)
    indices = tf.roll(indices, shift, 1)
    return {'input_laser': input_laser, 'input_xy': input_xy}, indices, tf.concat([[[0, 0, 0, 0, 0]], labels], 0)

This uses a tf.roll operation to account for the extrinsic yaw and shift the range image so that its azimuth now spans -pi to +pi. Here are the resulting azimuth and intensity images:

[image: corrected azimuth] [image: corrected intensity]

Note that the azimuth now looks correct, but the intensity image does not appear centered on the road. The first, uncorrected intensity image seems more centered on the road. Can someone familiar with the data format (@peisun1115) please clarify whether or not I am correct in applying this roll transformation?

From the Waymo Website:

Each column corresponds to an azimuth. Column 0 (left of the image) corresponds to -x-axis (i.e. the opposite of forward direction). The center of the image corresponds to the +x-axis (i.e. the forward direction). Note that an azimuth correction is needed to make sure the center of the image corresponds to the +x-axis.

The phrasing of this makes it sound like I need to do a roll transformation to re-center the image. And yet, performing such a roll operation does not lead to a centered image.

peisun1115 commented 3 years ago

To correct the azimuth, you can generate the range image in the sensor frame and then roll it by the sensor yaw, which is its offset w.r.t. the forward x-axis in the vehicle frame. To get the sensor yaw from the extrinsic, you can refer to this, which is where the atan2(..., ...) in the code comes from.
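A minimal sketch of that procedure (assuming `extrinsic` is the [4, 4] sensor-to-vehicle transform and `range_image` is an [H, W, C] tensor; the names here are illustrative, not from the library):

import math
import tensorflow as tf

def roll_to_vehicle_center(range_image, extrinsic):
    # Sensor yaw: rotation about z, read off the extrinsic's rotation block.
    yaw = tf.atan2(extrinsic[1, 0], extrinsic[0, 0])
    # Convert the yaw to a column offset: yaw / (2*pi) of the full image width.
    width = tf.cast(tf.shape(range_image)[1], tf.float32)
    shift = tf.cast(yaw / (2.0 * math.pi) * width, tf.int32)
    return tf.roll(range_image, shift, axis=1)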

atyshka commented 3 years ago

@peisun1115 That procedure is exactly what my code is doing. I use range_image_utils.compute_range_image_polar, which calculates in the sensor frame and then subtracts the extrinsic yaw to transform to the vehicle frame, and then I roll the image to account for that yaw. My issue is that the resulting images do not appear properly centered.

peisun1115 commented 3 years ago

That seems to be an issue in your code? You can debug it by starting with several points to make sure your code is correct, and then visualizing the point cloud.
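One way to do that (a sketch, not from the library; it assumes `cloud` is the [H, W, 3] vehicle-frame point image and `azimuth` the matching [H, W] azimuth image from the code above):

import numpy as np

def check_azimuth(cloud, azimuth, rows=(32,), cols=(0, 662, 1324, 1987)):
    # Recompute the azimuth of a few sample points from their vehicle-frame
    # x/y and compare it with the azimuth channel at the same pixels.
    for row in rows:
        for col in cols:
            x, y, _ = cloud[row, col]
            expected = np.arctan2(y, x)
            print(f"pixel ({row}, {col}): image azimuth {azimuth[row, col]:+.3f}, "
                  f"atan2(y, x) {expected:+.3f}")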

atyshka commented 3 years ago

@peisun1115 I did some experiments with the point cloud in TensorFlow. In the first image, I colored the point cloud using the azimuth channel of polar_image, which has the correction factor applied:

polar_image = range_image_utils.compute_range_image_polar(tf.expand_dims(r[..., 0], 0), tf.expand_dims(e, 0), tf.expand_dims(tf.reverse(i, [-1]), 0))
azimuth = polar_image[0, ..., 0]
# Wrap the azimuth into [-pi, pi]
azimuth = tf.math.atan2(tf.math.sin(azimuth), tf.math.cos(azimuth))
[screenshot: point cloud colored by the corrected azimuth]

You can clearly see that the azimuth roll-over point is about 45 degrees to the left of the front of the vehicle. This is not what I would expect.

Now I ran another experiment where I removed the azimuth correction. That is, since the polar image subtracts the correction, I add it back to cancel it out. The code for that looks like this:

polar_image = range_image_utils.compute_range_image_polar(tf.expand_dims(r[..., 0], 0), tf.expand_dims(e, 0), tf.expand_dims(tf.reverse(i, [-1]), 0))
azimuth = polar_image[0, ..., 0]
# Add the extrinsic yaw back to cancel the correction applied inside compute_range_image_polar.
correction = tf.atan2(e[..., 1, 0], e[..., 0, 0])
azimuth = azimuth + correction
# Wrap the azimuth into [-pi, pi]
azimuth = tf.math.atan2(tf.math.sin(azimuth), tf.math.cos(azimuth))

And the resulting point cloud looks like this:

[screenshot: point cloud colored by the uncorrected azimuth]

You can see here that the +pi/-pi seam is directly behind the vehicle. Thus the azimuth seems more correct without the extrinsic correction applied.

If I print out the actual correction angle, I get 2.5887 radians. That seems like an incredibly odd angle to mount the top lidar at with respect to the vehicle, and my guess is that this is where the issue lies?
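For scale, that yaw works out to roughly 148 degrees (a quick check):

import math
print(math.degrees(2.5887))  # ~148.3 degrees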

I have created a minimal reproducible Colab notebook with this experiment if you would like to give it a try: https://colab.research.google.com/drive/1wc9labdFgP2E0v9qmfbKuMxorfP3QSHY?usp=sharing

atyshka commented 3 years ago

@peisun1115 This seems to relate to #20, which says that the result of compute_range_image_polar is in the sensor frame rather than the vehicle frame. However, that function calculates an azimuth and then subtracts the extrinsic yaw from it. My interpretation, then, is that the raw images from the proto ARE corrected and natively stored in the vehicle frame. compute_range_image_polar then un-corrects them, going from the vehicle frame to the sensor frame, and compute_range_image_cartesian re-corrects them to be in the vehicle frame again. This would explain my test results. Can you please clarify whether this is the correct interpretation? I only ask because it seems contrary to the documentation, which said

Note that an azimuth correction is needed to make sure the center of the image corresponds to the +x-axis.

If the raw frame data is in the vehicle frame rather than the sensor frame, then no azimuth correction ought to be necessary to center it within the frame.
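A toy round trip of that interpretation (a sketch with illustrative values, not dataset code; `yaw` stands for the extrinsic yaw):

import numpy as np

yaw = 2.5887  # the sensor yaw reported above
width = 2650

# Interpretation: the stored azimuth is already vehicle-frame, spanning +pi -> -pi
# across the image so that the center column faces the vehicle's +x axis.
az_stored = np.linspace(np.pi, -np.pi, width, endpoint=False)

# compute_range_image_polar subtracts the yaw (vehicle frame -> sensor frame) ...
az_sensor = az_stored - yaw

# ... and compute_range_image_cartesian rotates by the extrinsic (sensor -> vehicle),
# adding the yaw back and recovering the stored vehicle-frame azimuth.
az_roundtrip = az_sensor + yaw
assert np.allclose(az_roundtrip, az_stored)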