dazinovic / neural-rgbd-surface-reconstruction

Official implementation of the CVPR 2022 Paper "Neural RGB-D Surface Reconstruction"
https://dazinovic.github.io/neural-rgbd-surface-reconstruction/

confusion on the translation = [-4.44, 0, 2.31] #4

Closed chensjtu closed 2 years ago

chensjtu commented 2 years ago

I'm a little confused about the translation in the config files. For instance, configs/scene0050_00.txt contains an entry named translation, and each config has a different value. What is the effect of this translation?

dazinovic commented 2 years ago

I translate and scale each scene so it roughly lies in a [-1, 1] cube. I did this manually by opening each scene in Meshlab and looking at the center of the bounding box, but you could also do it automatically by computing the bounding box for all the cameras. I haven't tested if this actually improves the results, but at least it's a bit more convenient to evaluate the SDF after training.
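If you want to do it automatically, a minimal sketch along these lines should work (the function name and the (N, 4, 4) pose layout are only assumptions for illustration, not what the repo does):

import numpy as np

def fit_cameras_to_unit_cube(poses, half_extent=1.0):
    # poses: (N, 4, 4) camera-to-world matrices (assumed layout).
    cam_centers = poses[:, :3, 3]          # camera positions in world space
    bb_min = cam_centers.min(axis=0)
    bb_max = cam_centers.max(axis=0)

    # Translation that moves the bounding-box center to the origin, and a
    # uniform scale so the box roughly fits in [-half_extent, half_extent].
    translation = -(bb_min + bb_max) / 2.0
    scale = half_extent / max(np.max((bb_max - bb_min) / 2.0), 1e-8)
    return translation, scale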

chensjtu commented 2 years ago

OK, thanks for your kind reply!

chensjtu commented 2 years ago

Hey, dazinovic. I'm new to 3D vision. When I use the poses in trainval_poses to run TSDF fusion, I get terrible reconstruction results. Is there anything I should watch out for? Maybe I should convert the OpenGL poses to OpenCV poses? If so, please tell me how to do the conversion! Many thanks!

chensjtu commented 2 years ago
[image]

This is the result for the "breakfast room" scene.

dazinovic commented 2 years ago

You definitely need to transform the poses into whichever coordinate system your TSDF fusion code uses. Can you try pre-multiplying by:

[[1, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 1]] ?

If it still doesn't work, try swapping some columns of the pose matrix or changing the sign of some of the columns. It looks like in OpenCV the y-axis points downward and the z-axis forward, so you can try flipping (changing the sign of) the 2nd and 3rd columns.
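For illustration, both suggestions look roughly like this in numpy (pose here is a single 4x4 camera-to-world matrix; the file name and variable names are made up):

import numpy as np

pose = np.loadtxt("pose_0.txt").reshape(4, 4)   # hypothetical input file

# Option 1: pre-multiply by the matrix above (swaps the world y and z axes).
T = np.array([[1, 0, 0, 0],
              [0, 0, -1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]], dtype=np.float64)
pose_premult = T @ pose

# Option 2: flip the sign of the 2nd and 3rd columns
# (OpenGL camera y-up/z-backward -> OpenCV y-down/z-forward).
pose_cv = pose.copy()
pose_cv[:, 1] *= -1
pose_cv[:, 2] *= -1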

chensjtu commented 2 years ago

Thanks for your kind reply! But I still have trouble with the pose conversion. Can you share the Blender rendering code so that I can understand the coordinate operations? By the way, what is the difference between poses.txt and trainval_poses.txt? I use pose 0 in Blender, but the rendered image is not equal to image0.png. For instance, the first pose in the breakfast room is [[1, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], but a camera located at (0, 0, 0) is not reasonable!

[image]

chensjtu commented 2 years ago

I just use the code from https://github.com/andyzeng/tsdf-fusion-python, with small modifications to the file paths and the pose loader.

dazinovic commented 2 years ago

To go from OpenGL to Blender coordinates, you can use the following code:

import sys
import numpy as np

def load_poses(posefile):
    # Each pose is stored as 4 consecutive lines forming a 4x4 matrix.
    with open(posefile, "r") as file:
        pose_floats = [[float(x) for x in line.split()] for line in file]

    lines_per_matrix = 4
    all_poses = [pose_floats[lines_per_matrix * idx : lines_per_matrix * (idx + 1)]
                 for idx in range(len(pose_floats) // lines_per_matrix)]
    return all_poses

if __name__ == '__main__':
    posefile = sys.argv[1]

    poses = load_poses(posefile)
    poses = np.array(poses).astype(np.float32)

    for pose in poses:
        # Swap y and z axis
        pose[[1, 2], :] = pose[[2, 1], :]

        # Invert y-axis
        pose[1, :] *= -1

    poses = np.reshape(poses, [-1, 16])  # flatten each 4x4 matrix into one row for savetxt

    dst_file = sys.argv[2]
    np.savetxt(dst_file, poses, fmt='%.6f')

Some of the Blender scenes had an unreasonable size, so I scaled them down first (IIRC I scaled the breakfast room by a factor of 0.35).

poses.txt contains the ground truth poses in the OpenGL coordinate system (the poses I used to render the scenes). trainval_poses.txt contains BundleFusion's estimated poses that I use as the initial poses in my method.

dazinovic commented 2 years ago

blender_poses.zip

These are the poses I used for Blender.

chensjtu commented 2 years ago

So the pose array in trainval_poses.txt is the final pose used in all experiments? The optimized pose is pretty far from the pose provided in poses.txt. Can you tell me how you computed Table 2 in your paper?

[image]

I get the optimized pose with:

[image]

while in poses.txt it is:

[image]

I guess you use relative pose error for the evaluation?

chensjtu commented 2 years ago

So if I want to use your dataset to run TSDF fusion, I can use the poses in poses.txt and convert them to OpenCV's convention. Then I should get what I want?

dazinovic commented 2 years ago

So the pose array in trainval_poses.txt is the final pose used in all experiments?

Kind of. They are not in the same space, though. You can check extract_optimized_poses.py to see how the conversion works.

The optimized pose is pretty far from the pose provided in poses.txt. Can you tell me how you computed Table 2 in your paper?

You need to align the two trajectories. I did that by aligning the first camera of both trajectories. A better way might be to actually solve an optimization problem that finds the best alignment, but I don't believe it would change the reported numbers significantly.
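For reference, a minimal sketch of the first-camera alignment described above (function and variable names are illustrative, not from the repo):

import numpy as np

def align_by_first_camera(poses_est, poses_gt):
    # poses_est, poses_gt: (N, 4, 4) camera-to-world matrices.
    # Rigid transform that maps the first estimated camera onto the
    # first ground-truth camera, applied to the whole trajectory.
    T = poses_gt[0] @ np.linalg.inv(poses_est[0])
    return np.stack([T @ pose for pose in poses_est])

After this alignment the per-frame poses can be compared directly, e.g. the translation error as the distance between the last columns of corresponding matrices.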

So if I want to use your dataset to run TSDF fusion, I can use the poses in poses.txt and convert them to OpenCV's convention. Then I should get what I want?

You need to convert it into whatever tsdf_fusion uses.

chensjtu commented 2 years ago

You need to align the two trajectories. I did that by aligning the first camera of both trajectories. A better way might be to actually solve an optimization problem that finds the best alignment, but I don't believe it would change the reported numbers significantly.

Thanks for the detailed response. I now understand how to process the trajectories.

You need to convert it into whatever tsdf_fusion uses.

I've tried different combinations of poses, and I finally got what I wanted. Many thanks again for your kindness and the fantastic work.

ZirongChan commented 2 years ago

@chensjtu Hi, how did you solve the alignment between the different pose files? I was trying to align the recovered point clouds from each frame to get a complete scene, for instance for the breakfast_room, but have failed so far. Which pose file should I use just to put the point clouds together? Any suggestions? @dazinovic

soumyadipm commented 2 years ago

Can you guys please help me run this code on ScanNet? I've been struggling for the last two weeks.
