alexklwong / void-dataset

Visual Odometry with Inertial and Depth (VOID) dataset

Scale factor of the stored depthmap #3

Closed muskie82 closed 2 years ago

muskie82 commented 3 years ago

Thank you very much for sharing your work.

In issue #1, you mentioned that depth values are in millimeters and need to be divided by 1000 to obtain metric scale, but I think the scale is still incorrect. For example, in office0/image/1551915398.6757.png, the depth value around a room corner is approximately 700~750 in the stored depth map, but the true distance must be more than 1 meter considering the size of the sofa.

[screenshot: stored depth values around the room corner]

Also, when I visualize office0/image/1551915398.6757.png and office0/image/1551915399.9445.png as 3D point clouds using the ground-truth depth and absolute poses, I can see a clear misalignment between the two depth maps, which likely comes from a wrong scale factor.

Do you have any idea what is going on?

[screenshot: misaligned point clouds from the two frames]
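For reference, the alignment check described above can be sketched in a few lines of NumPy. This is a hypothetical helper, not code from the repository; it assumes pinhole intrinsics K and a camera-to-world pose T, and that the depth map is already in metric units:

```python
import numpy as np

def backproject(depth, K, T):
    """Back-project a metric depth map into the world frame.

    depth : (H, W) array of metric depth values (0 = missing)
    K     : (3, 3) pinhole camera intrinsics
    T     : (4, 4) camera-to-world pose
    Returns an (N, 3) array of world-frame points for valid pixels.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    valid = depth > 0
    # Homogeneous pixel coordinates of valid pixels, shape (3, N)
    pix = np.stack([u[valid], v[valid], np.ones(valid.sum())], axis=0)
    rays = np.linalg.inv(K) @ pix          # rays on the unit-depth plane
    cam = rays * depth[valid]              # scale rays by depth -> camera frame
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    return (T @ cam_h)[:3].T               # (N, 3) world-frame points
```

Back-projecting both frames with their respective poses and overlaying the two point sets makes a scale error show up as exactly the kind of misalignment pictured above.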

alexklwong commented 3 years ago

This is strange, I am investigating

muskie82 commented 3 years ago

I think I've figured it out. In data_util.py (https://github.com/alexklwong/void-dataset/blob/master/src/data_utils.py#L76), the loaded depth data is divided by 256. I guess this also applies to the RealSense data: the depth map is not stored in millimeters but at a x256 meter scale. After dividing by 256, the point clouds I showed above align reasonably well.

[screenshot: aligned point clouds after dividing by 256]
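A minimal stand-alone version of the division at that line might look like the following. The name `decode_depth` is hypothetical (the repository's function is `load_depth`), and the interpretation of the result as meters follows the finding above:

```python
import numpy as np

def decode_depth(z_png):
    """Convert raw 16-bit PNG values to metric depth.

    z_png : uint16 array, as read from a VOID depth .png
    Returns float32 depth = stored value / 256 (meters, per the finding above).
    """
    z = z_png.astype(np.float32) / 256.0
    z[z_png == 0] = 0.0  # 0 marks missing depth
    return z
```

Under this interpretation, the stored value of ~700 at the room corner decodes to roughly 2.73 m, which is consistent with the sofa geometry in the first screenshot.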

alexklwong commented 3 years ago

Sorry it took a while to reply. I went through the sequence to see if there was anything wrong, but I didn't find anything strange about it. It seems you've got it working as well. For transparency, here is the data processing pipeline we used:

Depth maps were captured by the RealSense (in millimeters), aligned to the RGB stream based on the nearest timestamp, and then exported and stored as 16-bit .png files using the save_depth function in data_utils.py. To preserve floating-point accuracy, we multiplied them by 256 during saving.

So the loading process using load_depth from data_utils.py by default divides the depth map (png) by 256, undoing the multiplication applied when saving.
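The save/load roundtrip described above can be sketched as follows. The function names are illustrative (the repository's functions are `save_depth` and `load_depth`, which also handle the PNG file I/O), and the units are left agnostic since the quantization is the same either way:

```python
import numpy as np

def encode_depth(depth):
    """Quantize depth for storage as a 16-bit PNG: multiply by 256."""
    return np.round(depth * 256.0).astype(np.uint16)

def decode_depth(z_png):
    """Invert the encoding: divide the stored values by 256."""
    return z_png.astype(np.float32) / 256.0

d = np.array([[1.234, 2.7344]], dtype=np.float32)
restored = decode_depth(encode_depth(d))
# Quantization error is bounded by half a step, i.e. 1/512 of a unit
assert np.all(np.abs(restored - d) <= 0.5 / 256.0)
```

The x256 factor gives a resolution of 1/256 of a unit per stored integer, and a uint16 range of 65535/256, roughly 256 units, which comfortably covers indoor scene depths.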

To make sure we are both on the same page, can you post the steps (e.g. code/library/software) you used to get the first reconstruction above, and what you did differently for the second?