I would also love a depth supervision feature, as this tends to increase the exported mesh quality dramatically. This would require modifying the data loading/conversion scripts to support additional depth images.
Hi! We haven't added this feature, but are open to it. Feel free to make a PR, or otherwise we'll get to it at some point. It should be straightforward: you create a new dataset, like we did for semantic information. Here are the steps.
(1) Populate metadata in DataparserOutputs like here.
(2) Create an InputDataset that can handle this metadata like here.
(3) Specify a DataManager that uses this dataset like here.
Lastly, make sure your method is using this DataManager. See the semantic-nerfw config for an example.
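For reference, here is a minimal sketch of step (2), assuming the current `InputDataset` constructor signature. The `depth_filenames` / `depth_unit_scale` metadata keys and the 16-bit-PNG-in-millimeters depth format are made up for illustration; adapt them to whatever your dataparser produces in step (1).

```python
from typing import Dict

import numpy as np
import torch
from PIL import Image

from nerfstudio.data.dataparsers.base_dataparser import DataparserOutputs
from nerfstudio.data.datasets.base_dataset import InputDataset


class DepthDataset(InputDataset):
    """InputDataset that additionally returns a per-pixel depth image for supervision."""

    def __init__(self, dataparser_outputs: DataparserOutputs, scale_factor: float = 1.0):
        super().__init__(dataparser_outputs, scale_factor)
        # Filled in by the dataparser in step (1), e.g.
        # DataparserOutputs(..., metadata={"depth_filenames": [...], "depth_unit_scale": 0.001})
        self.depth_filenames = self.metadata["depth_filenames"]
        self.depth_unit_scale = self.metadata.get("depth_unit_scale", 1.0)

    def get_metadata(self, data: Dict) -> Dict:
        # Called once per image; "image_idx" indexes into the dataparser's filename lists.
        depth_path = self.depth_filenames[data["image_idx"]]
        depth = np.asarray(Image.open(depth_path), dtype=np.float32) * self.depth_unit_scale
        return {"depth_image": torch.from_numpy(depth)[..., None]}
```

Step (3) is then a matter of making your method's DataManager construct this dataset instead of the plain InputDataset, analogous to what semantic-nerfw does.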
According to https://github.com/nerfstudio-project/nerfstudio/issues/1028, the current NeRFStudio conventions might not line up with those used by the depth maps provided for supervision, which could lead to inconsistencies during training. It might be worth adapting the codebase to compute depth as the z-distance instead of the actual Euclidean distance.
Since the cameras module initially computes direction vectors with v[2] == 1 (for perspective cameras at least), one potentially suboptimal option would be to save the scaling factor computed at https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/cameras/cameras.py#L669 as ray metadata, which could then be used when computing depth. I'm happy to make a PR if that's of interest.
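To make the convention issue concrete, here is a standalone toy example (plain PyTorch, not the nerfstudio implementation) of the idea: build un-normalized perspective ray directions with the z-component fixed to 1, keep their norm as per-ray metadata, and use it to convert between z-depth and distance along the normalized ray. The intrinsics and the depth value are arbitrary placeholders.

```python
import torch

fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
u = torch.arange(640).float()
v = torch.arange(480).float()
vv, uu = torch.meshgrid(v, u, indexing="ij")

# Un-normalized directions in camera space, z-component fixed to 1.
dirs = torch.stack([(uu - cx) / fx, (vv - cy) / fy, torch.ones_like(uu)], dim=-1)
directions_norm = dirs.norm(dim=-1, keepdim=True)   # the scaling factor to save as metadata
dirs_unit = dirs / directions_norm                  # what the rays actually use

# If a depth map stores z-depth (distance along the optical axis), the matching
# distance along the normalized ray is:
z_depth = torch.full((480, 640, 1), 2.0)            # dummy 2 m fronto-parallel plane
ray_depth = z_depth * directions_norm
# ...and the inverse, to compare a rendered ray depth against a z-depth map:
z_from_ray = ray_depth / directions_norm
assert torch.allclose(z_from_ray, z_depth)
```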
Yea, I guess that makes the most sense for now. It's a little unfortunate that we would need to save this extra data when most models won't use it. But the overhead should be marginal. Feel free to make a PR.
For anybody interested in depth supervision in NeRFStudio, I started working on depth supervision here: https://github.com/Frenchman997/nerfstudio/tree/depth_supervision
For simplicity, my depth images contain depth values along the ray direction rather than along the z-direction.
The implementation is quite crude at the moment but might be a good starting point for anybody interested in this topic. I probably don't have the time to make this a full feature on my own, so any interest in collaborating on this is welcome.
I have a question regarding the benefits and drawbacks of z-depth over Euclidean depth:
Correct me if I'm wrong, but would it not make more sense to convert your depth data from z-depth to Euclidean depth when loading the depth images? All in-camera depth information relies on calibrated camera intrinsics anyway, so I see no drawback in converting the distances before starting NeRF training. This would also make it easier to add other kinds of depth sensors, like LiDARs, that directly output Euclidean distances.
The only benefit of z-depth that I can think of is if you want to blend your NeRF with traditionally rendered meshes to create some sort of mixed-reality visualization, using the z-buffer to handle occlusion.
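If one goes the load-time route, the conversion is a per-pixel rescale by the norm of the pixel's ray direction (with z == 1). A small NumPy sketch, where the intrinsic values are placeholders and the function name is mine:

```python
import numpy as np


def z_depth_to_euclidean(z_depth: np.ndarray, fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Convert an HxW z-depth map to Euclidean distance from the camera center."""
    h, w = z_depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Ray direction per pixel with z == 1; its norm is the z-to-Euclidean factor.
    scale = np.sqrt(((u - cx) / fx) ** 2 + ((v - cy) / fy) ** 2 + 1.0)
    return z_depth * scale


# Example: a flat wall 3 m in front of the camera has constant z-depth,
# but its Euclidean depth grows toward the image corners.
depth = z_depth_to_euclidean(np.full((480, 640), 3.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Sensors such as LiDAR that already report Euclidean distances would simply skip this step.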
I made a new PR with a preliminary implementation. Please take a look: https://github.com/nerfstudio-project/nerfstudio/pull/1173 🙂.
Closed as #1173 was merged.
Awesome work! Could you provide your datasets for testing?
Great work! Record3d generates aligned depth images. Can you extend the depth supervision feature to use these captured depth images for better results?