NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more
https://nvlabs.github.io/instant-ngp

Confusion about building my own transforms.json #1257

Open mujiwob opened 1 year ago

mujiwob commented 1 year ago

Hi, I'm trying to use RealSense cameras with instant-ngp, but I'm confused about how to build transforms.json. I use two cameras: a T265 for pose and a D435i for RGB and depth images. Their axes are shown below, where the left axis is the T265's (same as NeRF) and the right is the D435i's.

[images: T265 and D435i coordinate axes]

I take the camera's starting position as the origin of the world coordinates, and as the camera moves I can read out its rotation (as a quaternion) and translation, as shown below.

[image: pose readout from the T265]

The world-to-camera (w2c) rotation and translation matrices are:

import numpy as np

# (w, x, y, z) are the quaternion components and `translation` is the
# position reported by the T265.
R_w2c = np.array(
    [[1 - 2*y*y - 2*z*z, 2*x*y - 2*w*z,     2*x*z + 2*w*y],
     [2*x*y + 2*w*z,     1 - 2*x*x - 2*z*z, 2*y*z - 2*w*x],
     [2*x*z - 2*w*y,     2*y*z + 2*w*x,     1 - 2*x*x - 2*y*y]])
T_w2c = np.array([[translation.x, translation.y, translation.z]]).T  # 3x1 column
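As a sanity check, the hand-written formula can be compared against SciPy's quaternion conversion. This is an optional sketch with a made-up quaternion, not part of the original code; note that SciPy expects scalar-last (x, y, z, w) order, and that whether the resulting matrix is w2c or c2w depends on how the SDK reports the pose, which is worth double-checking.

import numpy as np
from scipy.spatial.transform import Rotation

# Made-up example quaternion, normalized so the manual formula is valid.
q = np.array([0.98, 0.01, 0.17, 0.05])  # (w, x, y, z)
w, x, y, z = q / np.linalg.norm(q)

R_manual = np.array(
    [[1 - 2*y*y - 2*z*z, 2*x*y - 2*w*z,     2*x*z + 2*w*y],
     [2*x*y + 2*w*z,     1 - 2*x*x - 2*z*z, 2*y*z - 2*w*x],
     [2*x*z - 2*w*y,     2*y*z + 2*w*x,     1 - 2*x*x - 2*y*y]])

# SciPy uses scalar-last (x, y, z, w) order.
R_scipy = Rotation.from_quat([x, y, z, w]).as_matrix()

print(np.allclose(R_manual, R_scipy))  # True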

I find that transforms.json needs the camera-to-world (c2w) transform matrix, so I invert the pose:

R_c2w = R_w2c.T                     # inverse of a rotation is its transpose
T_c2w = -np.dot(R_c2w, T_w2c)
transform_matrix = np.vstack([np.hstack([R_c2w, T_c2w]),   # 3x4 upper block
                              [[0, 0, 0, 1]]])             # homogeneous row
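One convention issue worth checking before writing the file: instant-ngp expects the NeRF/OpenGL camera convention (+x right, +y up, +z backward), whereas the D435i color stream follows the OpenCV convention (+x right, +y down, +z forward). Below is a minimal sketch of flipping the camera axes and assembling one frame entry; the file path and field values are placeholders, and whether the flip is needed at all depends on which frame the pose above is actually expressed in.

import json
import numpy as np

def opencv_to_nerf(c2w):
    # Flip the camera y and z axes (OpenCV -> OpenGL/NeRF convention).
    return c2w @ np.diag([1.0, -1.0, -1.0, 1.0])

c2w = np.eye(4)  # stand-in for the transform_matrix built above

out = {
    "camera_angle_x": 0.9,         # horizontal FOV in radians, from the D435i intrinsics
    "integer_depth_scale": 0.001,  # raw depth units -> meters
    "frames": [{
        "file_path": "images/0000.png",  # placeholder path
        # a per-frame depth image path may also be needed for depth
        # supervision; check the loader for the exact key
        "transform_matrix": opencv_to_nerf(c2w).tolist(),
    }],
}

with open("transforms.json", "w") as f:
    json.dump(out, f, indent=2)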

My questions are:

1. Should I transform the depth coordinates to the T265 (NeRF) coordinates?
2. I set integer_depth_scale to the D435i's depth scale (about 0.001), so that depth * depth_scale gives the real distance in meters. I don't know whether I need to scale it further.
3. Should I scale or translate the transform matrices so that the origin of the world coordinates is the center of the scene? For example, to reconstruct a room, does the world origin need to be at the center of the room?

Thanks!
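For context on question 3: the poses don't have to be exactly centered, but instant-ngp expects the region of interest to fit its default unit volume, which is why scripts/colmap2nerf.py recenters and rescales all camera poses. Below is a rough sketch of a similar step, assuming the mean camera position is a good enough stand-in for the room center (colmap2nerf.py instead centers on the point the cameras look at).

import numpy as np

def recenter_and_rescale(c2w_list, target_avg_dist=4.0):
    # Shift the world origin to the mean camera position, then rescale so
    # the average camera distance from the origin is target_avg_dist.
    mats = [m.copy() for m in c2w_list]
    center = np.mean([m[:3, 3] for m in mats], axis=0)
    for m in mats:
        m[:3, 3] -= center
    avg = np.mean([np.linalg.norm(m[:3, 3]) for m in mats])
    for m in mats:
        m[:3, 3] *= target_avg_dist / max(avg, 1e-9)
    return mats

If the translations are rescaled like this, the depth values presumably need the same scale factor applied (e.g. by folding it into integer_depth_scale), since depth and translation must share world units; that also bears on question 2.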

elenacliu commented 1 year ago

I've got the same question. Have you solved it?

meriemjabri commented 1 year ago

@woblitent did you solve it?