Open angusev opened 2 years ago
Thanks for your kind words!
Coordinate transforms are always tricky, and different datasets follow different conventions.
Here are some high-level comments.
We use nvdiffrast for differentiable rasterization, which uses OpenGL conventions, as discussed here: https://nvlabs.github.io/nvdiffrast/#coordinate-systems
Here are a few slides illustrating the OpenGL model-view (mv) and projection setup: https://fileadmin.cs.lth.se/cs/Education/EDA221/lectures/latest/Lecture5_web.pdf Note that in OpenGL, the camera looks along the negative z-axis.
Here is a simple example of how the model-view matrix is set up in our code: https://github.com/NVlabs/nvdiffrec/blob/main/dataset/dataset_mesh.py#L55
You could also look at Instant NGP's colmap2nerf script: https://github.com/NVlabs/instant-ngp/blob/master/scripts/colmap2nerf.py
which should generate transform matrices compatible with our nerf dataset reader: https://github.com/NVlabs/nvdiffrec/blob/main/dataset/dataset_nerf.py
It may not work 100% out of the box (image paths, etc.), but the transform matrices should at least be compatible.
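As a rough illustration of the convention change, here is a minimal NumPy sketch. It assumes the input is a world-to-camera transform in OpenCV/COLMAP-style axes (x right, y down, z forward) of the form `x_c = s * R @ x_w + t`, that the model-view matrix is a world-to-camera matrix in OpenGL axes, and that the camera position is the translation column of its inverse. The function name `opencv_w2c_to_opengl_mv` is hypothetical and not part of nvdiffrec.

```python
import numpy as np

def opencv_w2c_to_opengl_mv(R, t, s=1.0):
    """Hypothetical helper: build an OpenGL-style model-view matrix.

    Assumes x_c = s * R @ x_w + t with OpenCV-style camera axes
    (x right, y down, z forward). Returns (mv, campos), where mv maps
    world points to OpenGL camera space (y up, camera looks along -z)
    and campos is the camera center in world space.
    """
    w2c = np.eye(4)
    w2c[:3, :3] = s * np.asarray(R, dtype=np.float64)  # scale folded into the 3x3 block
    w2c[:3, 3] = np.asarray(t, dtype=np.float64)

    # Flip the y and z axes to go from OpenCV camera space to OpenGL camera space.
    flip_yz = np.diag([1.0, -1.0, -1.0, 1.0])
    mv = flip_yz @ w2c

    # The camera center is the world point that maps to the camera-space origin.
    campos = np.linalg.inv(mv)[:3, 3]
    return mv, campos

# Usage: identity rotation, object at the world origin, 5 units in front of the camera.
mv, campos = opencv_w2c_to_opengl_mv(np.eye(3), [0.0, 0.0, 5.0])
origin_in_camera = mv @ np.array([0.0, 0.0, 0.0, 1.0])
```

In this usage example the world origin ends up on the negative z-axis in camera space, which is where OpenGL expects visible geometry to be. If the source data already uses OpenGL axes, the flip should be skipped.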
First and foremost, thank you for your excellent work!
My goal is to apply your approach to real-world pictures of an object. All of the frames are properly masked, so I provided your pipeline with a precise alpha channel. In addition, I have accurate World2Camera transformations (rotation `R`, translation `t`, and scale `s`) for each frame. Thus, assuming that the camera is located at the origin, my rough origin-centered approximation of the object's geometry, with vertices `V_w`, can be aligned in a frame as follows:

To reverse the transformation to Camera2World, we can compute the inverse rotations and translations (since `R` is orthogonal):

Thus, I tried to adapt my transforms to the repo's code by constructing the extrinsic matrix `mv` in the method `_parse_frame` (https://github.com/NVlabs/nvdiffrec/blob/3e7007ca0f504008e89eb9a46907cf39ed166117/dataset/dataset_llff.py#L87) in the following manner:

This approach didn't work, and the optimised geometry didn't even appear in the frames' renders. I'd like to ask whether I have correctly understood the meanings of `mv` and `campos`, and whether my conversion from World2Camera to Camera2World space is sound. Thanks!
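
For reference, a sketch of the inversion in question, under the assumption that the World2Camera alignment has the form $V_c = s R V_w + t$ (this form, and the symbols $V_c$ for camera-space vertices and $c$ for the camera center, are assumptions). Since $R$ is orthogonal, $R^{-1} = R^\top$:

```latex
\begin{aligned}
V_c &= s\,R\,V_w + t, \\
V_w &= \tfrac{1}{s}\,R^\top (V_c - t), \\
c   &= -\tfrac{1}{s}\,R^\top t \quad \text{(world-space camera center, obtained by setting } V_c = 0\text{)}.
\end{aligned}
```

Under this assumption, the Camera2World rotation is $R^\top$ and the Camera2World translation is $-\tfrac{1}{s} R^\top t$, not simply $-t$.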