andyzeng / tsdf-fusion-python

Python code to fuse multiple RGB-D images into a TSDF voxel volume.
http://andyzeng.github.io/
BSD 2-Clause "Simplified" License

How to get camera poses for each frame? #5

euzer opened this issue 5 years ago

euzer commented 5 years ago

Hello there, can someone tell me how to get RGB-D frames, depth frames, and camera poses?

yyxr75 commented 5 years ago

If you don't have data, I suggest using the original data offered by this demo.

euzer commented 5 years ago

Yeah, of course I have already used the given original data. My goal is to reconstruct as much of the 3D mesh as possible. I have a Structure Sensor that can also give me color, depth, etc., but the fact is that I don't know how to capture that information.

Thanks !!

yyxr75 commented 5 years ago

> Yeah, of course I have already used the given original data. My goal is to reconstruct as much of the 3D mesh as possible. I have a Structure Sensor that can also give me color, depth, etc., but the fact is that I don't know how to capture that information.
>
> Thanks!!

  1. Try the whole demo here.
  2. It shows you where the pose of each point cloud comes from.

vgg4resnet commented 5 years ago

I have the same question! I searched your code and the demo, but I cannot find where pose.txt comes from. Can you briefly explain how it is produced, and point me to the lines where the related code is located? Thanks so much.

kingsman142 commented 5 years ago

1) RGB-D frames can be collected with a depth camera, such as a Kinect or a RealSense camera. For each frame, it will output one RGB image and one depth image (see the capture sketch after this list).

2) Camera poses can be gathered in several different ways. If you look up KinectFusion, it uses the RGB-D images to estimate the pose of the camera. However, in a paper by Andy's lab, they mention using a robotic arm with 7 degrees of freedom, along with its kinematics, to estimate the camera's pose. This TSDF fusion code does not generate the poses by itself. It might be useful to refer to their Dense Object Nets paper: https://arxiv.org/pdf/1806.08756.pdf .
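To make step 1) concrete, here is a minimal capture sketch, assuming a RealSense camera with the pyrealsense2 package and the demo data's frame-XXXXXX.color.jpg / frame-XXXXXX.depth.png / frame-XXXXXX.pose.txt layout. The identity pose written here is only a placeholder; real poses must come from tracking (SLAM, a robot arm, etc.) as discussed in step 2).

    import numpy as np
    import pyrealsense2 as rs
    import cv2

    # Stream 640x480 depth (16-bit, millimeters) and color at 30 FPS.
    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
    pipeline.start(config)

    try:
        for i in range(10):
            frames = pipeline.wait_for_frames()
            depth = np.asanyarray(frames.get_depth_frame().get_data())  # uint16, mm
            color = np.asanyarray(frames.get_color_frame().get_data())  # uint8 BGR
            cv2.imwrite("frame-%06d.color.jpg" % i, color)
            cv2.imwrite("frame-%06d.depth.png" % i, depth)  # saved as 16-bit PNG
            np.savetxt("frame-%06d.pose.txt" % i, np.eye(4))  # placeholder 4x4 pose
    finally:
        pipeline.stop()

For real use you would also align depth to the color stream (rs.align) and dump the camera intrinsics.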

vgg4resnet commented 5 years ago

Yes, thank you so much @kingsman142 @andyzeng. I have found some code to obtain poses in the following projects, but I don't know which one is better. I also have some questions:

1. https://github.com/openMVG/openMVG/blob/master/src/openMVG_Samples/multiview_robust_essential/robust_essential.cpp
2. https://github.com/gaoxiang12/ORBSLAM2_with_pointcloud_map/blob/master/ORB_SLAM2_modified/src/System.cc

        // Tcw: world-to-camera transform for this frame
        cv::Mat Tcw = (*lit)*Trw;
        // Rwc: camera-to-world rotation (transpose of Tcw's rotation block)
        cv::Mat Rwc = Tcw.rowRange(0,3).colRange(0,3).t();
        // twc: camera center in world coordinates
        cv::Mat twc = -Rwc*Tcw.rowRange(0,3).col(3);
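For reference, a minimal sketch (plain numpy, assuming Tcw has been loaded as a 4x4 array) of how this world-to-camera matrix relates to the camera-to-world pose that TSDF fusion expects:

    import numpy as np

    def tcw_to_twc(Tcw):
        # Invert a rigid world-to-camera transform in closed form.
        R = Tcw[:3, :3]           # rotation, world -> camera
        t = Tcw[:3, 3]            # translation, world -> camera
        Twc = np.eye(4)
        Twc[:3, :3] = R.T         # Rwc = Rcw^T
        Twc[:3, 3] = -R.T @ t     # twc = camera center in world coordinates
        return Twc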

This seems to be the Tcw we want, right? It is optimized by BA and the pose graph, so do BA and the pose graph make the pose more accurate? (There is also the commented-out line // vector q = Converter::toQuaternion(Rwc); for the quaternion form.)

3. https://github.com/intel-isl/Open3D/blob/master/examples/Python/ReconstructionSystem/make_fragments.py

    if abs(s - t) != 1:
        if with_opencv:
            success_5pt, odo_init = pose_estimation(source_rgbd_image,
                                                    target_rgbd_image,
                                                    intrinsic, False)
            if success_5pt:
                [success, trans, info] = o3d.odometry.compute_rgbd_odometry(
                    source_rgbd_image, target_rgbd_image, intrinsic, odo_init,
                    o3d.odometry.RGBDOdometryJacobianFromHybridTerm(), option)
                return [success, trans, info]
        return [False, np.identity(4), np.identity(6)]

About trans and odo_init: which one do we want? And do I need to optimize the poses with BA and a pose graph before using them?
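Roughly: trans is the estimated pairwise transform between the two frames, while odo_init is only the 5-point initialization that seeds it. A minimal sketch (plain numpy; pairwise_trans is assumed to hold the 4x4 matrices returned by compute_rgbd_odometry between consecutive frames) of chaining these into absolute camera-to-world poses, mirroring what make_fragments.py accumulates before its pose-graph optimization:

    import numpy as np

    def accumulate_poses(pairwise_trans):
        trans_odometry = np.identity(4)
        poses = [np.linalg.inv(trans_odometry)]      # frame 0: identity pose
        for trans in pairwise_trans:
            trans_odometry = trans @ trans_odometry  # chain pairwise odometry
            poses.append(np.linalg.inv(trans_odometry))  # camera-to-world
        return poses

Pose-graph optimization (and BA, where available) then refines these chained estimates, which otherwise accumulate drift.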

4. https://github.com/andyzeng/apc-vision-toolbox/blob/master/rgbd-utils/matlab/getCalib.m

Can I use this method? I also have a question about how to get the mapping sequence:

    % A mapping from one frame to its nearest frame in the sequence (these
    % mappings are predefined for shelf and tote; if a frame is mapped to 0,
    % its camera-to-world extrinsic matrix is set as the pivot transformation)

My data are human faces, so the frontal face can be picked as the pivot frame, right? The pivot frame's pose would then be all zeros, but looking through your kitchen data, no frame's pose is zero. Do I misunderstand? In the kitchen data, which frame is the pivot frame?

    newExtCam2World{nearestFrameIdx(1,frameIdx)} = newExtCam2World{nearestFrameIdx(2,frameIdx)} * [initRt; [0 0 0 1]];

Is newExtCam2World all we want?

Looking forward to your reply! Thank you in advance!

kingsman142 commented 5 years ago

I'm unsure what you mean by BA. I assume Tcw is Translation (camera-to-world) and Rwc is Rotation (world coordinate), but those are guesses. I'm unfamiliar with all of those code chunks, which may have heavier underlying math implementations that are more important. Sorry, I can't help. @andyzeng is probably the only one that might be able to help you, especially considering he helped create the apc-vision-toolbox repository.

However, I can kind of answer your question:

> My data are human faces, so the frontal face can be picked as the pivot frame, right? The pivot frame's pose would then be all zeros, but looking through your kitchen data, no frame's pose is zero. Do I misunderstand? In the kitchen data, which frame is the pivot frame?

In this scenario, no frame's pose has to be zero. If you make the first frame the base frame (or "pivot frame"), then you can calculate every other frame's pose relative to that base frame. So, in this case, it isn't important.
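A minimal sketch of that re-basing (plain numpy; poses is assumed to be a list of 4x4 camera-to-world matrices in an arbitrary world frame):

    import numpy as np

    def rebase_to_pivot(poses, pivot_idx=0):
        # Express every pose relative to the chosen pivot frame,
        # so the pivot's own pose becomes the identity.
        base_inv = np.linalg.inv(poses[pivot_idx])
        return [base_inv @ T for T in poses]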

vgg4resnet commented 5 years ago

@kingsman142 OK, anyway, thanks very much. By BA I mean bundle adjustment.

YJonmo commented 4 years ago

I am a bit confused by the pose matrix.

I have a bunch of frames simulated with Blender, with corresponding depth and pose for each. The pose is the camera's world matrix (cam.matrix_world) and has the structure below:

    cam.matrix_world
    Matrix(((-0.9429, 0.0142, -0.3325, -5.695),
            (-0.3328, -0.0418, 0.942, 34.5783),
            (-0.0005, 0.9990, 0.0441, -4.5873),
            (0.0, 0.0, 0.0, 1.0)))

The last column is the x, y, z translation.

But when I use this pose information, this repository doesn't seem to work. I tried getting the pose from OpenCV's KinectFusion for the depth, but the output pose was in a different format, where the last row is the translation. Even with this pose information I could not get the repo working.

    [0.99131632, -0.075547278, 0.10762857;
     0.085346401, 0.99231762, -0.089551769;
     -0.10003643, 0.097959846, 0.99014962]
    [0.728273, -0.0614177, -0.0695721]

Any help please?
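A common cause of this kind of mismatch is the camera-axis convention: Blender's camera looks down its local -Z axis with +Y up, while OpenCV-style RGB-D pipelines typically expect the camera to look down +Z with +Y pointing down. A minimal sketch of the conversion, assuming cam.matrix_world has been loaded as a 4x4 numpy array:

    import numpy as np

    # Flip the camera's local Y and Z axes (Blender -> OpenCV convention).
    FLIP_YZ = np.diag([1.0, -1.0, -1.0, 1.0])

    def blender_to_cv_pose(T_blender_world):
        # Returns a camera-to-world pose in the OpenCV convention.
        return T_blender_world @ FLIP_YZ

Note this only remaps the camera's local axes; the translation column is unchanged.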