Closed kevinchiu19 closed 4 days ago
Hi @kevinchiu19, I think the pixelSplat team has already given a good answer to this question at https://github.com/dcharatan/pixelsplat/issues/82. I quote their answer below for anyone who shares the same concern.
Depth can either be defined as distance along the ray or as Z depth (distance along the camera look vector/Z coordinate in camera space). Since depth is predicted by a neural network, the convention that's being used doesn't matter—the network will simply learn whatever convention is being used.
If you want to switch to the other convention (Z depth), you can replace https://github.com/donydchen/mvsplat/blob/378ff818c0151719bbc052ac2797a2c769766320/src/geometry/projection.py#L105 with the following:
directions = directions / directions[..., -1:]
This will normalize by the Z coordinate instead of by ray length.
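As a minimal sketch of why either convention works as long as depth and direction normalization match, the NumPy example below recovers the same camera-space point both ways. The intrinsics `K`, the test point, and the helper `camera_ray_directions` are hypothetical and are not taken from the mvsplat code base.

```python
import numpy as np

def camera_ray_directions(pixels, K):
    """Back-project pixel coordinates (N, 2) to camera-space directions (N, 3) with z = 1."""
    ones = np.ones((pixels.shape[0], 1))
    homogeneous = np.concatenate([pixels, ones], axis=1)  # (N, 3)
    return homogeneous @ np.linalg.inv(K).T

# Hypothetical pinhole intrinsics and a known camera-space point.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
point = np.array([[2.0, -1.0, 10.0]])

# Project the point to its pixel, then back-project that pixel to a ray.
projected = point @ K.T
pixels = projected[:, :2] / projected[:, 2:]
directions = camera_ray_directions(pixels, K)

# Convention 1: unit-length directions pair with distance along the ray.
unit_dirs = directions / np.linalg.norm(directions, axis=-1, keepdims=True)
ray_depth = np.linalg.norm(point, axis=-1, keepdims=True)
point_from_ray_depth = unit_dirs * ray_depth

# Convention 2: Z-normalized directions pair with Z depth.
z_dirs = directions / directions[..., -1:]
z_depth = point[..., -1:]
point_from_z_depth = z_dirs * z_depth

print(point_from_ray_depth, point_from_z_depth)
```

Both reconstructions agree with `point`; what matters is that the depth value and the direction normalization follow the same convention.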
Okay, thank you again for your great work!
Thanks for sharing your great effort and work.
I have a small question about src/model/encoder/common/gaussian_adapter.py:
Compute Gaussian means.
The directions used in this calculation have already been normalized, so the result is not a correct position in the world coordinate system.
For example, in an autonomous driving scene, the xyz coordinates of the point cloud are used directly as the Gaussian positions. I computed the means both with and without normalization; the results are shown in the figure below.

![image](https://github.com/donydchen/mvsplat/assets/52447673/c18edf25-5f9d-4095-882d-dbd30f5b7fc3)
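The mismatch described above can be reproduced in a few lines. This is only a sketch, assuming the depth is taken from the Z coordinate of points already expressed in the camera frame; the point values and variable names are hypothetical. Multiplying Z depth by unit-length directions pulls off-axis points toward the camera, and either converting the depth or Z-normalizing the directions removes the error.

```python
import numpy as np

# Hypothetical camera-space points, e.g. LiDAR returns transformed into the camera frame.
points = np.array([[0.0, 0.0, 10.0],
                   [5.0, 0.0, 10.0],
                   [10.0, 2.0, 10.0]])

z_depth = points[..., -1:]
unit_dirs = points / np.linalg.norm(points, axis=-1, keepdims=True)

# Mismatched conventions: Z depth multiplied by unit-length directions.
mismatched = unit_dirs * z_depth
error = np.linalg.norm(mismatched - points, axis=-1)

# Option 1: keep unit directions and convert Z depth to distance along the ray.
ray_depth = z_depth / unit_dirs[..., -1:]
fixed_by_depth = unit_dirs * ray_depth

# Option 2: keep Z depth and normalize directions by their Z component.
z_dirs = unit_dirs / unit_dirs[..., -1:]
fixed_by_dirs = z_dirs * z_depth

print(error)  # zero on the optical axis, growing for points further off-axis
```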