NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more
https://nvlabs.github.io/instant-ngp

question about the pose transformation in colmap2nerf.py #1486

Open flybirdtian opened 9 months ago

flybirdtian commented 9 months ago

I have a question about the transformation in lines 351-354 of colmap2nerf.py. I understand that it applies some coordinate transformations, and I can follow what each line of code does, but I still cannot see the principle behind it. Could someone explain it? Thanks very much!

                if not args.keep_colmap_coords:
                    c2w[0:3,2] *= -1 # flip the y and z axis
                    c2w[0:3,1] *= -1
                    c2w = c2w[[1,0,2,3],:]
                    c2w[2,:] *= -1 # flip whole world upside down
                    up += c2w[0:3,1]

I mainly cannot understand the following 4 lines:

                c2w[0:3,2] *= -1 # flip the y and z axis
                c2w[0:3,1] *= -1
                c2w = c2w[[1,0,2,3],:]
                c2w[2,:] *= -1 # flip whole world upside down

The first and second lines flip the z-axis and y-axis respectively, the third line swaps the x and y coordinates, and the fourth line flips the world upside down by multiplying the third row of the transformation matrix by -1. But why does this work? Could someone give an explanation? Thank you!

yjb6 commented 9 months ago

Hello, I have the same question. Have you figured it out?

SleepEaaarly commented 6 months ago

I think I have worked out an answer to this question, so let me share it. The key to understanding the four lines is the relationship between elementary row/column operations and elementary matrices.

The first two lines negate the y and z basis vectors of the camera coordinate frame. To make this concrete, define a 4x4 matrix A whose diagonal elements are 1 in the first and fourth rows and -1 in the second and third rows, with 0 everywhere else. Flipping the camera axes amounts to left-multiplying w2c by A; taking the inverse, c2w is right-multiplied by A^{-1} (= A). Right-multiplying by A negates the second and third columns of c2w (indices 1 and 2), which is what the first two lines do (the bottom row of c2w is zero in those columns, so restricting to rows 0:3 changes nothing).

The last two lines transform the world coordinates instead. By the same analogy, swapping two rows and negating a row each correspond to left-multiplying c2w by an elementary matrix. Together they map a world point (x, y, z) to (y, x, -z). Hope this explanation helps! :)
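To check the matrix view numerically, here is a minimal NumPy sketch. It is not taken from colmap2nerf.py: the names `A`, `P`, `D`, `colmap_lines`, and `as_matrix_product` are made up for illustration, and the test pose is random. It assumes c2w is a 4x4 rigid pose whose bottom row is [0, 0, 0, 1].

```python
import numpy as np

def colmap_lines(c2w):
    """Apply the four quoted operations to a copy of c2w."""
    c2w = c2w.copy()
    c2w[0:3, 2] *= -1            # negate camera z axis (third column)
    c2w[0:3, 1] *= -1            # negate camera y axis (second column)
    c2w = c2w[[1, 0, 2, 3], :]   # swap world x and y (first two rows)
    c2w[2, :] *= -1              # negate world z (third row)
    return c2w

# Hypothetical names for the matrices described in the comment above.
A = np.diag([1.0, -1.0, -1.0, 1.0])   # right-multiply: flips camera y and z
P = np.eye(4)[[1, 0, 2, 3], :]        # left-multiply: swaps world x and y
D = np.diag([1.0, 1.0, -1.0, 1.0])    # left-multiply: negates world z

def as_matrix_product(c2w):
    """Same transformation written as one matrix product."""
    return D @ P @ c2w @ A

# Build an arbitrary rigid pose (assumption: bottom row is [0, 0, 0, 1]).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1                     # make it a proper rotation
c2w = np.eye(4)
c2w[:3, :3] = Q
c2w[:3, 3] = rng.normal(size=3)

# The in-place edits and the matrix product agree.
assert np.allclose(colmap_lines(c2w), as_matrix_product(c2w))

# The world-side change of coordinates sends (x, y, z) to (y, x, -z).
W = D @ P
assert np.allclose(W @ np.array([1.0, 2.0, 3.0, 1.0]), [2.0, 1.0, -3.0, 1.0])

# Both factors have determinant +1 on their 3x3 block, so the result
# is still a proper rotation rather than a mirrored one.
assert np.isclose(np.linalg.det(W[:3, :3]), 1.0)
assert np.isclose(np.linalg.det(A[:3, :3]), 1.0)
```

Column operations on c2w act on the camera frame and row operations act on the world frame, which is why the first two lines become a right-multiplication and the last two a left-multiplication.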