andyzeng / tsdf-fusion-python

Python code to fuse multiple RGB-D images into a TSDF voxel volume.
http://andyzeng.github.io/
BSD 2-Clause "Simplified" License
1.22k stars 218 forks

Camera Pose #11

Open YJonmo opened 4 years ago

YJonmo commented 4 years ago

Hi there,

Thanks for making this work public.

My question may sound silly, but do I need the camera poses to be able to use this repository? That is my impression from going through your code.

All I have is a bunch of RGB-D images, and I would like to fuse them together to get the extended map.

Regards, Jacob

plutoyuxie commented 4 years ago

Hi, I have done some work like what you describe:

  1. Set the first (or some chosen) frame as the reference; its camera pose is np.eye(4).
  2. Transform every depth image into camera space using the camera intrinsics.
  3. Use ICP to estimate the transform from each frame to the reference frame in world space; that transform is the camera pose of that frame.
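
Step 2 of the list above can be sketched in NumPy as follows. This is a minimal illustration, not code from this repo: the function name `depth_to_camera_points` is hypothetical, and it assumes a pinhole camera with a 3x3 intrinsics matrix `K` and a depth image already converted to metres, with 0 marking invalid pixels. The ICP step is not shown; the resulting point clouds would be handed to whichever ICP implementation you use.

```python
import numpy as np


def depth_to_camera_points(depth, K):
    """Back-project a depth image (H x W) into an N x 3 array of camera-space points.

    Hypothetical helper. Assumes K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
    and that depth values of 0 mean "no measurement".
    """
    depth = np.asarray(depth, dtype=np.float64)
    K = np.asarray(K, dtype=np.float64)
    h, w = depth.shape

    # u is the pixel column index, v is the pixel row index.
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Invert the pinhole projection: u = fx * x / z + cx, v = fy * y / z + cy.
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]

    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)

    # Drop pixels without a valid depth measurement.
    return points[points[:, 2] > 0]
```

With the reference frame's pose fixed to np.eye(4), the points of the reference frame are already in world space, so each other frame only needs to be registered against them.
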
YJonmo commented 4 years ago

I have simulated data from Blender, with the corresponding camera matrix_world for every frame. But with this toolbox I cannot reconstruct the same point cloud that I see in Blender. It just shows a mess.

The Blender camera matrix_world is a 3x3 rotation plus a fourth column holding the x, y and z translation.

ShrutheeshIR commented 4 years ago

Hey @YJonmo, were you able to solve this issue? I revisited this code for a custom dataset, using simulated data from Gazebo, and I too am getting a messy point cloud. I suspect there is a problem with my transformation matrix (which I obtained from Gazebo itself), but I have no way to verify that. Could it have something to do with the units of the depth data, or something like that? Please let me know if you have solved the issue.

YJonmo commented 4 years ago

Yes mate, finally solved. A coordinate conversion is needed: VSLAMMappingFromBlender2DSO

Please check this repo: https://github.com/GSORF/Visual-GPS-SLAM/blob/master/02_Utilities/BlenderAddon/addon_vslam_groundtruth_Blender280.py#L34

ShrutheeshIR commented 4 years ago

Hi @YJonmo, thanks! Since I am not using Blender, I couldn't follow the entire code snippet. From a glance at the comments, I need to convert to this format: Up = -Y axis, Right = X axis, Forward = Z axis. In the Gazebo simulation, Left is the +X axis and Up is the +Y axis, so I need to find an equivalent conversion, right? Please correct me if I am wrong. If possible, could you also let me know what to change in the code snippet you linked? Thanks a ton!

YJonmo commented 4 years ago

I guess your format should eventually look like this:

    Up = -Y-Axis
    Right = X-Axis
    Forward = Z-Axis

The above is for the translation. You also need to convert the rotation matrix (the 3x3). For Blender, the rotation was multiplied by the matrix below and then transposed:

    1  0  0
    0 -1  0
    0  0 -1

In your case, you might need to convert X to -X and Y to -Y. In that case the rotation matrix should be multiplied by

    -1  0  0
     0 -1  0
     0  0  1

but not transposed. To me, Gazebo is an environment built for computer vision, so it should be compatible with SLAM or TSDF techniques by default, whereas Blender was not created for computer vision, which is why a conversion is needed. I guess you will have to do some trial and error to find the correct conversion.
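
The sign-flip matrices above can be applied to a full 4x4 pose as in the sketch below. It is only an illustration under stated assumptions, not the linked add-on's method: the function name `flip_camera_axes` and both constants are hypothetical, the pose is assumed to be camera-to-world, and the flip is applied on the right so that only the camera's own axes change. Whether a transpose (or inverse) is also needed depends on whether your source gives camera-to-world or world-to-camera matrices, which is exactly the trial-and-error part.

```python
import numpy as np

# Sign flips for the camera's local x, y, z axes (assumptions for illustration).
# Blender cameras have x right, y up and look along -z; the target format in
# this thread has x right, y down (Up = -Y) and z forward.
BLENDER_TO_CV = (1.0, -1.0, -1.0)
# The "X to -X and Y to -Y" case discussed above.
FLIP_X_AND_Y = (-1.0, -1.0, 1.0)


def flip_camera_axes(cam_to_world, signs):
    """Return a 4x4 camera-to-world pose with the camera's local axes sign-flipped."""
    pose = np.asarray(cam_to_world, dtype=np.float64)
    flip = np.diag([signs[0], signs[1], signs[2], 1.0])
    # Right-multiplication re-labels the camera's local axes only; the camera
    # position stored in the last column is unchanged.
    return pose @ flip
```

A quick sanity check is to convert one pose, fuse only two or three frames, and see whether the surfaces overlap; both flips used here have determinant +1, so the result is still a proper rotation.
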

ShrutheeshIR commented 4 years ago

Hello @YJonmo That makes sense. I will try it out. Thanks once again!

YJonmo commented 4 years ago

These functions might be useful for you (conversion between the quaternion and the world matrix):

    import numpy as np

    def quatr2world(Coordinates):
        # Coordinates = [x, y, z, qw, qx, qy, qz]
        X = np.float64(Coordinates[0])
        Y = np.float64(Coordinates[1])
        Z = np.float64(Coordinates[2])
        W_O = np.float64(Coordinates[3])
        X_O = np.float64(Coordinates[4])
        Y_O = np.float64(Coordinates[5])
        Z_O = np.float64(Coordinates[6])
        World = np.zeros((4, 4), np.float64)
        World[0][0] = 1 - 2*Y_O*Y_O - 2*Z_O*Z_O
        World[0][1] = 2*X_O*Y_O + 2*W_O*Z_O
        World[0][2] = 2*X_O*Z_O - 2*W_O*Y_O
        World[1][0] = 2*X_O*Y_O - 2*W_O*Z_O
        World[1][1] = 1 - 2*X_O*X_O - 2*Z_O*Z_O
        World[1][2] = 2*Y_O*Z_O + 2*W_O*X_O
        World[2][0] = 2*X_O*Z_O + 2*W_O*Y_O
        World[2][1] = 2*Y_O*Z_O - 2*W_O*X_O
        World[2][2] = 1 - 2*X_O*X_O - 2*Y_O*Y_O
        World[0][3] = X
        World[1][3] = Y
        World[2][3] = Z
        World[3][3] = 1
        # Transpose the 3x3 rotation block in place.
        World[0:3, 0:3] = np.transpose(World[0:3, 0:3])
        return World

    def world2quatr(World):
        # Returns [qw, qx, qy, qz] from the 3x3 rotation block of World.
        tr = World[0][0] + World[1][1] + World[2][2]
        if tr > 0:
            S = np.sqrt(tr + 1.0) * 2  # S = 4*qw
            qw = 0.25 * S
            qx = (World[2][1] - World[1][2]) / S
            qy = (World[0][2] - World[2][0]) / S
            qz = (World[1][0] - World[0][1]) / S
        elif (World[0][0] > World[1][1]) and (World[0][0] > World[2][2]):
            S = np.sqrt(1.0 + World[0][0] - World[1][1] - World[2][2]) * 2  # S = 4*qx
            qw = (World[2][1] - World[1][2]) / S
            qx = 0.25 * S
            qy = (World[0][1] + World[1][0]) / S
            qz = (World[0][2] + World[2][0]) / S
        elif World[1][1] > World[2][2]:
            S = np.sqrt(1.0 + World[1][1] - World[0][0] - World[2][2]) * 2  # S = 4*qy
            qw = (World[0][2] - World[2][0]) / S
            qx = (World[0][1] + World[1][0]) / S
            qy = 0.25 * S
            qz = (World[1][2] + World[2][1]) / S
        else:
            S = np.sqrt(1.0 + World[2][2] - World[0][0] - World[1][1]) * 2  # S = 4*qz
            qw = (World[1][0] - World[0][1]) / S
            qx = (World[0][2] + World[2][0]) / S
            qy = (World[1][2] + World[2][1]) / S
            qz = 0.25 * S
        quaternion = np.array([qw, qx, qy, qz])
        return quaternion

ShrutheeshIR commented 4 years ago

What is the Coordinates argument in the first function? Why does it have 6 values?

YJonmo commented 4 years ago

It has seven values: the position followed by the quaternion, [x y z qw qx qy qz].

ShrutheeshIR commented 4 years ago

Oh makes sense. That's awesome! Thanks

Ademord commented 3 years ago

@YJonmo I am trying to do something like this with Unity. Could you help me get my depth input into this repo's software?

YJonmo commented 3 years ago

What exactly do you want to do? Something like creating an extended 3D map from several depth frames? Do you have the depth frames and their corresponding camera coordinates?

GodZarathustra commented 2 years ago

Hi mate, I have the same issue you did, but the link you provided doesn't work on my side. I am also not sure which coordinate system this repo uses. Could you provide the conversion matrix from Blender coordinates to this repo's coordinates? Thank you so much!

YJonmo commented 2 years ago

Hi mate, you need to work out what your coordinate system is: which direction is forward, which is up, and which is left. Then you need to rotate your coordinates so that they match the TSDF coordinates.

You could use these rotation/mirroring functions to perform your conversion: https://github.com/YJonmo/Job/blob/main/Transformations.py

jly0810 commented 1 year ago

Hello, I have run into a similar problem. I also obtained my data in Blender, as shown in the attached image. I know the rotation and translation of the camera, and can calculate the rotation matrix and translation vector from them. How can I then convert these to get a pose matrix suitable for this project? I hope you can answer; this has bothered me for a long time.

[attached image]

LINFF1023 commented 1 year ago

Hi, I'm sorry to bother you so late. I now have the absolute position XYZ and the quaternion QWER for each image frame. How do I map these 7 numbers to a 4×4 matrix?

YJonmo commented 1 year ago

I have not worked with this for a long time. You might need to use the demo.py in this repo and replace the demo data with your own data.
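
For the 7-number question, a compact sketch of the mapping is given below. It is an illustration only: the function name `pose_from_xyz_quat` is hypothetical, the input order is assumed to be [x, y, z, qw, qx, qy, qz] as described earlier in this thread, and the quaternion is assumed to describe the camera-to-world rotation. If your source uses a different order (for example qx, qy, qz, qw) the values must be reordered first.

```python
import numpy as np


def pose_from_xyz_quat(values):
    """Build a 4x4 pose from [x, y, z, qw, qx, qy, qz] (assumed order)."""
    x, y, z, qw, qx, qy, qz = np.asarray(values, dtype=np.float64)

    # Normalise so that slightly non-unit quaternions still give a rotation.
    norm = np.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    if norm == 0.0:
        raise ValueError("quaternion has zero length")
    qw, qx, qy, qz = qw / norm, qx / norm, qy / norm, qz / norm

    pose = np.eye(4)
    # Standard unit-quaternion to rotation-matrix formula.
    pose[:3, :3] = [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ]
    pose[:3, 3] = [x, y, z]
    return pose
```

Each resulting matrix can be written out with `np.savetxt(path, pose)`, one text file per frame, on the assumption that your copy of demo.py reads poses with `np.loadtxt`. An axis-convention conversion may still be needed afterwards, as discussed earlier in the thread.
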

LINFF1023 commented 1 year ago

Thank you for your reply. I used my own dataset, but this is what happened: Voxel volume size: 2565 x 4061 x 1767 - # points: 18,405,893,655. Numbers this large crash the program, yet my dataset is not large. What caused this? Looking forward to your reply.

YJonmo commented 1 year ago

I am not sure; I remember it was crashing for me too. You could try to reduce the size. You could also try other repositories:

https://github.com/kevinzakka/torchsdf-fusion
https://github.com/PRBonn/vdbfusion
https://github.com/charlesCXK/TorchSSC
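
To see how "reduce the size" plays out, the sketch below estimates the voxel grid before anything is allocated. It is a hedged illustration: the function name `estimate_volume` is hypothetical, the bounds are assumed to be a 3x2 array of [min, max] per axis in metres, and 12 bytes per voxel is an assumption (three float32 values per voxel), not a figure taken from this thread.

```python
import numpy as np


def estimate_volume(vol_bnds, voxel_size, bytes_per_voxel=12):
    """Return (dims, n_voxels, n_bytes) for a dense voxel grid.

    vol_bnds: 3x2 array of [min, max] per axis, in metres (assumed layout).
    voxel_size: edge length of one voxel, in metres.
    bytes_per_voxel: assumed storage per voxel.
    """
    vol_bnds = np.asarray(vol_bnds, dtype=np.float64)
    extent = vol_bnds[:, 1] - vol_bnds[:, 0]
    dims = np.ceil(extent / voxel_size).astype(np.int64)
    n_voxels = int(np.prod(dims))
    return dims, n_voxels, n_voxels * bytes_per_voxel
```

The grid reported above, 2565 x 4061 x 1767, is 18,405,893,655 voxels, which under the 12-byte assumption is roughly 220 GB. The voxel count grows with the cube of the scene extent divided by the voxel size, so bounds that are far larger than the real scene (for instance from depth values in the wrong unit or from incorrect poses) or a very small voxel size will both produce grids like this even for a small dataset.
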

LINFF1023 commented 1 year ago

Thank you for your reply. I have found the reason. Many thanks!