ethz-asl / voxblox

A library for flexible voxel-based mapping, mainly focusing on truncated and Euclidean signed distance fields.
BSD 3-Clause "New" or "Revised" License

TSDF map from images and pose #316

Open ShrutheeshIR opened 4 years ago

ShrutheeshIR commented 4 years ago

If I have a set of RGB and depth images and the corresponding ground-truth pose for each of them stored, how can I use voxblox to purely construct a TSDF map? I don't intend to use ROS. I have gone through the voxblox library component, but I am still not sure how I could wire it into my setup. Specifically, which files would I have to edit?
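To make the question concrete, here is roughly what I pieced together from the headers in `voxblox/core` and `voxblox/integrator` (an untested sketch; the config values and the back-projection helper are placeholders I made up, not part of the library):

```cpp
#include <cstdint>

#include <voxblox/core/common.h>
#include <voxblox/core/tsdf_map.h>
#include <voxblox/integrator/tsdf_integrator.h>

// Back-project a 16-bit depth image into a point cloud in the camera frame
// (pinhole model; fx, fy, cx, cy are intrinsics, depth_scale converts raw
// depth units to meters, e.g. 1000 for millimeter depth).
voxblox::Pointcloud depthToPointcloud(const uint16_t* depth, int width,
                                      int height, float fx, float fy, float cx,
                                      float cy, float depth_scale) {
  voxblox::Pointcloud points_C;
  points_C.reserve(width * height);
  for (int v = 0; v < height; ++v) {
    for (int u = 0; u < width; ++u) {
      const float z = depth[v * width + u] / depth_scale;
      if (z <= 0.0f) continue;  // skip invalid depth pixels
      points_C.emplace_back((u - cx) * z / fx, (v - cy) * z / fy, z);
    }
  }
  return points_C;
}

int main() {
  // Map layout: voxel size in meters and voxels per side of each block.
  voxblox::TsdfMap::Config map_config;
  map_config.tsdf_voxel_size = 0.05f;
  map_config.tsdf_voxels_per_side = 16u;
  voxblox::TsdfMap map(map_config);

  // Integrator: the truncation distance is typically a few voxels.
  voxblox::TsdfIntegratorBase::Config integrator_config;
  integrator_config.default_truncation_distance = 0.2f;
  voxblox::SimpleTsdfIntegrator integrator(integrator_config,
                                           map.getTsdfLayerPtr());

  // Per frame: points in the camera frame, per-point colors from the RGB
  // image, and the ground-truth camera-to-world pose T_G_C.
  voxblox::Pointcloud points_C;   // = depthToPointcloud(...);
  voxblox::Colors colors;         // one voxblox::Color per point
  voxblox::Transformation T_G_C;  // from the stored ground-truth poses

  integrator.integratePointCloud(T_G_C, points_C, colors);
  return 0;
}
```

From what I can tell, the ROS layer (`voxblox_ros`) is a thin wrapper that does essentially this conversion from sensor messages before calling `integratePointCloud`, so presumably it can be bypassed. Is that the intended usage?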

LanWu076 commented 4 years ago

Having the same issue.

YJonmo commented 4 years ago

I gave up on using this, as I have RGB-D images and couldn't find a way to convert them to rosbag format.

You could try this: https://github.com/andyzeng/tsdf-fusion-python

ShrutheeshIR commented 4 years ago

Hi @YJonmo. I have checked that out, and it's excellent code that I am using. However, I intended to use voxblox since it provides dynamic mapping, whereas the repo you linked uses a predefined fixed volume (i.e., the bounds and volume are computed before map construction begins).

YJonmo commented 4 years ago

Yes, I know Voxblox is ideal, but there is no straightforward way to convert RGB-D data to the format that Voxblox needs. At least I couldn't find one.

By the way, I have a problem with that TSDF Python code. I have the camera pose matrices that Blender gives me for every frame, but when I use the TSDF code, it produces a messy point cloud. I think Blender's camera pose matrix is not in the format the TSDF code accepts. Could you tell me what format your camera pose matrix is in? At the moment, Blender gives me a 4x4 matrix whose top-left 3x3 block is the rotation and whose last column is the translation.

ShrutheeshIR commented 4 years ago

Yes, that format is right: the top-left 3x3 block is the rotation and the last column is the translation. It worked for me on standard datasets; I haven't used Blender, so I'm not exactly sure what the issue could be. I would suggest two things. Try one of the two:

  1. Invert the transformation matrix and feed it into the code.
  2. Invert only the rotation matrix, leaving the translation vector as is (see the sketch below).

I suggest these two changes because, if you look at the TSDF code, it applies the transformation matrix to the existing volume to fit it to your camera frame, rather than rotating your current view's point cloud into the original volume. I'm sorry I can't be of much help here!
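For concreteness, the rigid-body inverse can be formed explicitly instead of with a general matrix inverse (a sketch using Eigen; the function name is just illustrative):

```cpp
#include <Eigen/Geometry>

// Invert a rigid-body transform T = [R | t; 0 0 0 1].
// The inverse is [R^T | -R^T * t], which is cheaper and more numerically
// stable than a general 4x4 inversion.
Eigen::Matrix4d invertRigid(const Eigen::Matrix4d& T) {
  const Eigen::Matrix3d R = T.topLeftCorner<3, 3>();
  const Eigen::Vector3d t = T.topRightCorner<3, 1>();
  Eigen::Matrix4d T_inv = Eigen::Matrix4d::Identity();
  T_inv.topLeftCorner<3, 3>() = R.transpose();
  T_inv.topRightCorner<3, 1>() = -R.transpose() * t;
  return T_inv;
}
```

One more thing that might be worth checking: Blender cameras look down their local -Z axis with +Y up, while most computer-vision pipelines assume +Z forward and +Y down, so the pose may also need a 180-degree rotation about the camera's X axis before either inversion helps.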

YJonmo commented 4 years ago

Thanks for the info. I tried both of those already; I've been playing with it for about two weeks now :/

My last question would be: what sensor did you use to record the camera pose? I might try reading its manual to understand the pose matrix format.

ShrutheeshIR commented 4 years ago

Oh, I didn't test it on real data; I got caught up with another project. But I tested it on the Freiburg RGB-D dataset. One final change I'd suggest is to check the scale of the depth map given by your camera. Typically you divide by 1000 (as the TSDF code does), but the Freiburg dataset uses a scale of 5000. When I did not incorporate this change in the TSDF code, I got a messy point cloud, but changing the scale to 5000 gave accurate results. I can't think of any other possible solutions.
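That scale is a single constant when converting raw depth to meters, so it's an easy thing to check (a sketch; the function name is illustrative):

```cpp
#include <cstdint>

// Convert a raw 16-bit depth sample to meters. Most RGB-D sensors store
// millimeters (scale = 1000), but the Freiburg (TUM RGB-D) PNGs use 5000,
// i.e. a raw value of 5000 corresponds to 1 m.
inline float rawDepthToMeters(uint16_t raw, float scale) {
  return static_cast<float>(raw) / scale;
}
```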