alexklwong / void-dataset

Visual Odometry with Inertial and Depth (VOID) dataset

How to read dataset, dataset_500 and dataset_1500 in raw data? #5

Closed hjxwhy closed 2 years ago

hjxwhy commented 3 years ago

Hi Alex, thanks for sharing! There are dataset, dataset_500, dataset_1500, and raw.bag files in void_raw/. How can I read dataset, dataset_500, and dataset_1500, and how can I visualize them?

hjxwhy commented 3 years ago

There is another question: I found that the sparse depth and ground-truth depth at valid pixels all have the same value. Are the feature points' depths taken from the ground truth?

alexklwong commented 3 years ago

Hi Jianxin,

  1. How to read raw.bag and dataset_X? You will need ROS for that, and you can access the data through its topics.

Here is an example:

import os
import rosbag

color_topic = '/camera/color/image_raw'
depth_topic = '/camera/aligned_depth_to_color/image_raw'
imu_topic = '/camera/imu'

bag = rosbag.Bag(os.path.join('/path/to/bag/directory/', 'raw.bag'), 'r')

# bag.read_messages returns a generator, which can be iterated with next()
color_messages = bag.read_messages(topics=[color_topic])
depth_messages = bag.read_messages(topics=[depth_topic])
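If you want to visualize the depth images without cv_bridge, here is a minimal sketch of decoding a depth Image message payload into a numpy array. It assumes the RealSense-style 16UC1 encoding (depth in millimeters), which is what the aligned depth topic typically publishes; the helper name `decode_depth` is my own, not part of the dataset tooling.

```python
import numpy as np

def decode_depth(height, width, data, is_bigendian=False):
    """Decode a 16UC1 ROS Image payload (raw bytes from msg.data)
    into an HxW float32 depth map in meters (input assumed millimeters)."""
    dtype = np.dtype(np.uint16).newbyteorder('>' if is_bigendian else '<')
    depth_mm = np.frombuffer(data, dtype=dtype).reshape(height, width)
    return depth_mm.astype(np.float32) / 1000.0

# Hypothetical usage with a rosbag message:
# for topic, msg, t in bag.read_messages(topics=[depth_topic]):
#     depth = decode_depth(msg.height, msg.width, msg.data, msg.is_bigendian)
```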

The dataset files contain data structures that we used for an experimental version of XIVO. I've added the data structure definition to the repository as src/vlslam_pb2.py.

Here is an example of how to use it:

import os
import vlslam_pb2

# You may need these constants for the state of the tracked features
feature_status = [
    vlslam_pb2.Feature.READY,
    vlslam_pb2.Feature.KEEP,
    vlslam_pb2.Feature.INSTATE,
    vlslam_pb2.Feature.GOODDROP
]

dataset = vlslam_pb2.Dataset()
with open(os.path.join('/path/to/dataset/directory/', 'dataset_X'), 'rb') as fid:
    dataset.ParseFromString(fid.read())

# To get camera parameters:
cam_params = dataset.camera

# To iterate through the packets inside the dataset
for packet in dataset.packets:
    ...

  2. Sparse depth and ground truth all have the same value?

Yes, this is by design, based on feedback from users who worked with the dataset before its release. If you would like the raw points (which include both inliers and outliers, so the quality of the estimate can be bad for some subset of the points), you can read them from the dataset_X files.

On that note, we do have plans to update the dataset (VOID 2.0) as more of our in-house algorithms are released.

hjxwhy commented 3 years ago

@alexklwong Thanks for your reply. I have read the packets of the raw dataset, but what do the parameters "gwc, xp, xw, z, wg" in a packet mean?

feixh commented 3 years ago

> @alexklwong Thanks for your reply. I have read the packets of the raw dataset, but what do the parameters "gwc, xp, xw, z, wg" in a packet mean?

gwc is the camera-to-world transformation. It consists of a rotation and a translation; you can think of it as a 3x4 matrix [R | T], where R is the rotation part and T is the translation part.
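As a concrete sketch (the numbers below are made up, not from the dataset), applying gwc = [R | T] to a point in camera coordinates gives its world coordinates:

```python
import numpy as np

# Hypothetical gwc: a 90-degree rotation about z plus a translation
R = np.array([[0., -1., 0.],
              [1.,  0., 0.],
              [0.,  0., 1.]])
T = np.array([0.5, 0.0, 1.0])
gwc = np.hstack([R, T[:, None]])   # 3x4 [R | T]

xc = np.array([1.0, 0.0, 2.0])     # point in the camera frame
xw = gwc @ np.append(xc, 1.0)      # world frame: R @ xc + T
```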

xp is the pixel coordinates of the point (if I remember correctly).

xw is the 3-D coordinates of the point in the world coordinate system.

z is the depth of the point as seen from the camera.

wg is the 2-DoF rotation (rotation about the direction of gravity is unobservable) that aligns the world coordinate system to gravity. Mathematically, you can compute Exp(wg[0], wg[1], 0), where Exp is the exponential map for the SO(3) group; forcing the last dimension of the axis-angle representation to zero ensures the rotation indeed has 2 DoF.
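That construction can be sketched with Rodrigues' formula for the SO(3) exponential map; the wg values below are made-up placeholders, not real dataset values:

```python
import numpy as np

def so3_exp(w):
    """Rodrigues' formula: axis-angle vector w -> 3x3 rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta                          # unit rotation axis
    K = np.array([[0., -k[2], k[1]],       # skew-symmetric cross-product matrix
                  [k[2], 0., -k[0]],
                  [-k[1], k[0], 0.]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

wg = np.array([0.05, -0.02])                  # hypothetical 2-DoF values
R_g = so3_exp(np.array([wg[0], wg[1], 0.0]))  # last component forced to zero
```

Because the third axis-angle component is zero, the resulting rotation has no component about the gravity axis, which is exactly the 2-DoF constraint described above.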