Hey @YaoXingbo, this can typically be done by multiplying the pose (the c2w matrix) by a fixed transformation matrix, similar to the `blender2opencv` conversion used in MatchNeRF. I'd still suggest double-checking the details (ChatGPT can help here), which could save you a lot of time.
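For reference, here is a minimal sketch of that conversion. The matrix is the standard `diag(1, -1, -1, 1)` axis flip used by `blender2opencv`-style code; `nerf_c2w_to_opencv` is just an illustrative helper name, not something from this repo:

```python
import numpy as np

# Standard axis flip: Blender/NeRF cameras (x right, y up, z backwards)
# -> OpenCV cameras (x right, y down, z forwards).
blender2opencv = np.diag([1.0, -1.0, -1.0, 1.0])

def nerf_c2w_to_opencv(c2w_nerf: np.ndarray) -> np.ndarray:
    """Convert a 4x4 camera-to-world pose from the NeRF/Blender
    convention to the OpenCV convention."""
    return c2w_nerf @ blender2opencv
```

Right-multiplying by this matrix flips the camera's local Y and Z axes (the columns of the rotation) while leaving the camera position (the translation column) unchanged.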
Once the poses are converted, you can verify their correctness by plotting epipolar lines, as detailed in this comment. You can also run the check on the MVSplat project, which might be easier, since it uses the exact same camera coordinate system as this project.
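If you want a self-contained sanity check alongside the plotting code, here is a rough numpy sketch of the underlying two-view geometry; `fundamental_from_poses` is a hypothetical helper, and `K1`, `K2` are assumed to be pixel-space intrinsics:

```python
import numpy as np

def fundamental_from_poses(K1, K2, c2w1, c2w2):
    """Fundamental matrix F such that, for a homogeneous pixel x1 in
    view 1, F @ x1 is its epipolar line in view 2."""
    # Relative pose mapping camera-1 coordinates to camera-2 coordinates.
    rel = np.linalg.inv(c2w2) @ c2w1
    R, t = rel[:3, :3], rel[:3, 3]
    t_cross = np.array([[0.0, -t[2], t[1]],
                        [t[2], 0.0, -t[0]],
                        [-t[1], t[0], 0.0]])
    E = t_cross @ R                              # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

# Sanity check: a matched pixel pair (x1, x2) should give a tiny
# residual |x2^T F x1|; with wrongly converted poses it blows up.
```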
Typically, not much needs to change for another dataset. Just make sure you also convert your data to the same torch format as ours; you can find the details in convert_dl3dv.py and convert_dtu.py. You can also find more references for different data loaders in MVSplat and NoPoSplat, which all use the same code base.
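As a rough picture of what those conversion scripts produce, here is a sketch assuming the chunks follow the RE10K-style layout from that code base; `pack_scene` is a hypothetical helper, and the field names and 18-float camera layout are my reading of the scripts, so please verify them against convert_dl3dv.py before relying on this:

```python
import numpy as np
import torch

def pack_scene(key, jpeg_paths, intrinsics_norm, w2c_list):
    """Pack one scene into an RE10K-style dict: each camera is
    [fx, fy, cx, cy, 0, 0] + flattened 3x4 w2c (18 floats), with
    intrinsics normalized by image width/height (assumed layout)."""
    cameras = []
    for K, w2c in zip(intrinsics_norm, w2c_list):
        cameras.append([K[0, 0], K[1, 1], K[0, 2], K[1, 2], 0.0, 0.0,
                        *np.asarray(w2c)[:3].reshape(-1)])
    return {
        "key": key,
        # Raw encoded JPEG bytes, decoded lazily by the data loader.
        "images": [torch.from_numpy(np.fromfile(p, dtype=np.uint8))
                   for p in jpeg_paths],
        "cameras": torch.tensor(cameras, dtype=torch.float32),
    }

# chunk = [pack_scene(...), pack_scene(...)]   # one chunk = many scenes
# torch.save(chunk, "train/000000.torch")
```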
Have fun tuning. Cheers!
Hi @donydchen,
Thank you for your fantastic work!
You mentioned that if we want to train on our own dataset, the coordinate system should follow the OpenCV style, where:
- the X-axis points to the right,
- the Y-axis points downwards,
- the Z-axis points in the direction the camera faces (towards the screen).
I would like to ask how to train on my own dataset, which uses the coordinate system from the NeRF repository, defined as:
- the X-axis points to the right,
- the Y-axis points upwards,
- the Z-axis points backwards, out of the screen (i.e., the camera looks along -Z).
How should I modify this to match the OpenCV style?
Alternatively, is there another way to modify the code so that I can train on my own dataset?
Thanks again for your fantastic work; I look forward to your response.