mpitropov / cadc_devkit

A devkit for the Canadian Adverse Driving Conditions (CADC) dataset.
http://cadcd.uwaterloo.ca/

Unable to align the point clouds between two frames #27

cmpute closed this issue 3 years ago

cmpute commented 3 years ago

Hi! Thanks for your amazing work on this dataset. I wanted to align two point clouds together with the given sensor intrinsics and novatel data. However, the best I can do right now is illustrated below; there is still a noticeable error between the two frames (these two are 25 frames apart).

[Screenshot 2021-08-31_15-07: two overlaid point clouds showing the residual misalignment]

Currently I parse the pose data with the following code:

import utm
from scipy.spatial.transform import Rotation

def parse_pose_from_inspvax(data: INSPVAX):  # INSPVAX: a parsed Novatel INSPVAX log message
    # Planar position from the UTM projection of lat/lon
    x, y, *_ = utm.from_latlon(data.latitude, data.longitude)
    t = [-x, -y, data.altitude + data.undulation]
    r = Rotation.from_euler("yxz", [data.roll, data.pitch, -data.azimuth], degrees=True)
    return t, r

The pose parsing is adapted from your code with some trial and error. Then the point-cloud transform between two frames can be obtained by chaining the lidar-novatel extrinsic, the pose difference, and the inverse lidar-novatel extrinsic, as sketched below.
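For what it's worth, here is a minimal sketch of that chain under my current assumptions (pose1 and pose2 are 4x4 homogeneous world poses of the novatel frame at the two timestamps, and T_NOVATEL_LIDAR maps lidar coordinates into the novatel frame; both names are mine, not from the devkit):

import numpy as np

def relative_lidar_transform(pose1, pose2, T_NOVATEL_LIDAR):
    # lidar frame 1 -> novatel 1 -> world -> novatel 2 -> lidar frame 2
    return (np.linalg.inv(T_NOVATEL_LIDAR)
            @ np.linalg.inv(pose2) @ pose1
            @ T_NOVATEL_LIDAR)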

I noticed that the demo on the dataset website has pretty good alignment, so could you please provide example code for aligning point clouds from two different frames? Thank you!

mpitropov commented 3 years ago

To submit our data to Scale AI, they wanted all data converted to a world frame. I converted each GPS message to a position in the ENU frame, used the first position in a sequence as the origin, and subtracted that origin from every subsequent frame. This is exactly what I did in my code here: https://github.com/mpitropov/cadc_devkit/blob/master/convert_novatel_to_pose.py to create the smooth GPS pose visualization: https://github.com/mpitropov/cadc_devkit/blob/master/images/2019_02_27_0027_vehicle_path.png
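In outline, that origin subtraction looks like the following minimal sketch (using the utm package as a stand-in for the ENU conversion; the actual script linked above also handles the rotations):

import utm

def positions_relative_to_first(gps_msgs):
    # gps_msgs: parsed Novatel messages with latitude/longitude/altitude/
    # undulation fields (illustrative names). Project each lat/lon to a
    # planar position, then shift the sequence so the first message is the origin.
    positions = []
    for msg in gps_msgs:
        x, y, *_ = utm.from_latlon(msg.latitude, msg.longitude)
        positions.append((x, y, msg.altitude + msg.undulation))
    ox, oy, oz = positions[0]
    return [(x - ox, y - oy, z - oz) for x, y, z in positions]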

I'm using the same steps in my conversion script for Scale AI as in the demo script linked above.

I think your issue could be that your rotation code is different from the one I use. The rotation matrix definition is taken from https://novatel.com/support/support-materials/application-notes ("Vehicle to Body Rotations Application Note | APN-037").

cmpute commented 3 years ago

Thanks for your quick response! Did you use the extrinsic between the lidar frame and novatel frame in the process?

mpitropov commented 3 years ago

Yes, that is an important thing to add. The poses were actually the lidar poses in the world frame. I converted from Lidar to GPSIMU to ENU.

    # Compose the Lidar -> GPSIMU -> ENU transform (4x4 homogeneous matrices)
    translation = transformations.translation_from_matrix(T_ENU_GPSIMU * T_GPSIMU_LIDAR)
    quat = transformations.quaternion_from_matrix(T_ENU_GPSIMU * T_GPSIMU_LIDAR)
    print(transformations.euler_from_matrix(T_ENU_GPSIMU, axes='syxz'))
    print(transformations.euler_from_matrix(T_ENU_GPSIMU * T_GPSIMU_LIDAR, axes='syxz'))

    # Scale AI expects the device position and heading quaternion in the world frame
    json_data['device_position'] = {
      "x": translation[0],
      "y": translation[1],
      "z": translation[2]
    }
    json_data['device_heading'] = {
      "x": quat[0],
      "y": quat[1],
      "z": quat[2],
      "w": quat[3]
    }

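For reference, T_GPSIMU_LIDAR above comes from the lidar-GPSIMU extrinsic shipped with the dataset's calibration files; a minimal loading sketch (assuming extrinsics.yaml stores a T_LIDAR_GPSIMU entry as a 4x4 matrix; check the devkit's calibration loading code for the exact key):

import yaml
import numpy as np

# Assumption: extrinsics.yaml contains a T_LIDAR_GPSIMU entry holding a
# 4x4 row-major matrix.
with open('calib/extrinsics.yaml') as f:
    extrinsics = yaml.safe_load(f)

T_LIDAR_GPSIMU = np.array(extrinsics['T_LIDAR_GPSIMU'])
T_GPSIMU_LIDAR = np.linalg.inv(T_LIDAR_GPSIMU)
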
cmpute commented 3 years ago

Thanks for the information! Could you elaborate on what the content of T_ENU_GPSIMU is?

mpitropov commented 3 years ago

The main question that I can't answer is: why can't I have the origin pose be the first GPSIMU pose and move from there in the GPSIMU frame? I'm not sure why this cannot be done; it seems like you should be able to if you understand the coordinate frames, but I might have tried this a while back, given up, and used the Novatel transform instead.

T_ENU_GPSIMU is the transform whose rotation is defined by Novatel for converting the body (IMU) frame to the ENU frame. I would follow my example code, as I have already debugged it to check that the poses and lidar data look fine when displayed in the ENU frame.

https://hexagondownloads.blob.core.windows.net/public/Novatel/assets/Documents/Bulletins/apn037/apn037.pdf

The body-to-local-level matrix can be formed from the roll, pitch, and azimuth angles (convert azimuth to yaw to form the matrix) found in the INSPVA (or INSATT) log, where roll is about the y-axis, pitch is about the x-axis, and azimuth/yaw is about the z-axis. The body frame (IMU axes) is nominally assumed to have Y pointing in the direction of travel, X to the right (perpendicular to the direction of travel), and Z up. That means azimuth is referenced from north to the IMU y-axis, roll is about y, and pitch is about x. If you do not mount your IMU with Z approximately up, please review the Frame Definitions section on page 1 of that document.
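As a concrete sketch, that matrix can be composed as Rz(-azimuth) * Rx(pitch) * Ry(roll), which should match the scipy Rotation.from_euler("yxz", [roll, pitch, -azimuth]) call earlier in this thread:

import numpy as np

def rot_body_to_enu(roll_deg, pitch_deg, azimuth_deg):
    # Per APN-037: roll is about y, pitch about x, azimuth about z.
    # Azimuth is measured clockwise from north, hence the sign flip.
    r, p, a = np.radians([roll_deg, pitch_deg, -azimuth_deg])
    Ry = np.array([[ np.cos(r), 0.0, np.sin(r)],
                   [ 0.0,       1.0, 0.0      ],
                   [-np.sin(r), 0.0, np.cos(r)]])
    Rx = np.array([[1.0, 0.0,        0.0       ],
                   [0.0, np.cos(p), -np.sin(p)],
                   [0.0, np.sin(p),  np.cos(p)]])
    Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
    return Rz @ Rx @ Ry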

https://docs.novatel.com/OEM7/Content/SPAN_Operation/Definition_Reference_Frames.htm
https://en.wikipedia.org/wiki/Local_tangent_plane_coordinates

cmpute commented 3 years ago

Thanks for the detailed instructions! I will try again in the coming days.

cmpute commented 3 years ago

It turned out that there were some bugs in my transformation chaining, and the function I posted above should be:

def parse_pose_from_inspvax(data: INSPVAX):
    x, y, *_ = utm.from_latlon(data.latitude, data.longitude)
    t = [x, y, data.altitude + data.undulation]  # fixed: no negation of x and y
    r = Rotation.from_euler("yxz", [data.roll, data.pitch, -data.azimuth], degrees=True)
    return t, r

Now I can get pretty accurate alignment between point clouds and bounding boxes. Here's my custom loader for the dataset in case anyone is interested: https://github.com/cmpute/d3d/blob/master/d3d/dataset/cadc/loader.py