maturk / dn-splatter

DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing
https://maturk.github.io/dn-splatter/
Apache License 2.0

training with pre-generated depth #61

Closed: deephog closed this issue 1 day ago

deephog commented 2 weeks ago

Thank you for your great work! I'm very excited to try it out and see whether depth and normal supervision can help with floaters in my own dataset.

The thing I'm trying to figure out is this: I have a depth model that works quite well on its own, and I have been using it for a long time. What would the workflow be if I want to use my own depth data? I assume that for the model to run, everything still needs to fall back to the COLMAP convention, with all the alignment steps. Is there an easier way to load pre-computed depth maps and still stay compatible with the rest of your pre-processing code?

Thanks!

maturk commented 2 weeks ago

Hey @deephog, the workflow for using depth maps (assuming they are aligned with the RGB images) assumes you have some kind of transforms.json (the name can differ) that specifies the needed data.

E.g. if you download the mushroom dataset (use python dn_splatter/data/download_scripts/mushroom_download.py) you will see a good example. The .json file just specifies the camera intrinsics, the extrinsics, where the RGB frames are stored, and where the sensor depth data are stored.
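For illustration, a file with that shape could be generated like this (a minimal sketch only; the key names, e.g. fl_x and depth_file_path, follow the common Nerfstudio convention and are my assumption here, so compare against the downloaded example before relying on them):

```python
import json

# Minimal transforms.json sketch in the usual Nerfstudio convention.
# NOTE: the key names below (fl_x, depth_file_path, ...) are assumptions --
# verify them against the example .json that ships with the mushroom dataset.
transforms = {
    "fl_x": 600.0,  # focal length in pixels, x
    "fl_y": 600.0,  # focal length in pixels, y
    "cx": 320.0,    # principal point, x
    "cy": 240.0,    # principal point, y
    "w": 640,       # image width
    "h": 480,       # image height
    "frames": [
        {
            "file_path": "images/frame_00001.png",        # RGB frame
            "depth_file_path": "depths/frame_00001.png",  # aligned depth map
            # camera-to-world pose as a 4x4 row-major matrix
            "transform_matrix": [
                [1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0],
            ],
        },
        # ... one entry per frame
    ],
}

with open("transforms.json", "w") as f:
    json.dump(transforms, f, indent=4)
```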

The COLMAP convention is not needed in this scenario, but it is also a supported option... Actually, this repo is very closely tied to the conventions used in Nerfstudio. More details here. If you are a Nerfstudio user, then using this repo should work quite easily... at least I hope.

Let me know if you need more details.

maturk commented 2 weeks ago

Btw, are they sensor depths or monocular depth estimates?

deephog commented 2 weeks ago

Wow, thank you for your prompt response!

I'm not too familiar with the depth part of Nerfstudio either; I actually found your repo while roaming the Nerfstudio issues, trying to figure out how to apply depth supervision with less pain.

The thing is, I do have a depth model that I want to try. Say it is just another mono-depth model: I ran it beforehand on the image sequence, so now I have two folders, Images/ and Depths/. I understand what you said about COLMAP not being necessary here, but I assume I cannot just run the training code on those two folders, right? There must be some pre-processing script I need to run to make the data structure compatible with training. If that is the case, which script should I use from this point on?

Or, if there is no such script for this kind of scenario, could you please let me know which pre-processing script I should integrate my depth model into, so that it can take care of the rest and generate a compatible data structure?

Thanks!

maturk commented 2 weeks ago

Got it. In that case, you can look into the scripts/align_depth.py script, which aligns the COLMAP reference frame with that of the estimated depths.

Note: the depths have to be linear/metric scale, not relative depths. What model did you use to generate the monocular depths? (Just wondering, since some SotA networks output relative depths, which are more like 1/depth.)
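For intuition, the alignment boils down to fitting a scale (and possibly a shift) so the estimated depths agree with some metric reference, e.g. sparse COLMAP points or sensor depth. A generic least-squares sketch of that idea (not the actual scripts/align_depth.py code; the function name and arguments are hypothetical):

```python
import numpy as np

def align_depth_least_squares(mono_depth, ref_depth, ref_mask):
    """Fit scale s and shift t so that s * mono_depth + t ~= ref_depth
    on pixels where a metric reference (sparse COLMAP depth, sensor
    depth, ...) is available. Generic sketch, not the repo's script."""
    x = mono_depth[ref_mask].ravel()
    y = ref_depth[ref_mask].ravel()
    # closed-form least squares for y ~= s * x + t
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * mono_depth + t, s, t
```

If the network outputs relative/inverse depths, the same fit is usually done in inverse-depth space and the result inverted afterwards.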

deephog commented 2 weeks ago

I will work from that script, thank you!

https://github.com/lpiccinelli-eth/unidepth

Looks like it is SotA; I will try it and see how it works with your model.

maturk commented 2 weeks ago

Probably not too well; in my experiments, monocular depth supervision is much more challenging with GS than sensor depth supervision.

maturk commented 1 day ago

@deephog, you can try the Pearson correlation loss (from SparseGS) in the latest updates. It is a relative loss, so there is no need for any scale alignment, and in our experiments it performs better than the scale-aligned variant.

To use it, pass `--pipeline.model.depth-loss-type PearsonDepth` in your config.
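The core idea, roughly (a minimal PyTorch sketch of a Pearson-style depth loss as described in SparseGS, not necessarily the exact implementation in this repo):

```python
import torch

def pearson_depth_loss(rendered_depth: torch.Tensor,
                       mono_depth: torch.Tensor) -> torch.Tensor:
    """1 - Pearson correlation between rendered and monocular depth.
    The correlation is invariant to scale and shift of either input,
    which is why no metric alignment of the mono depths is needed.
    Sketch only -- not necessarily the repo's exact implementation."""
    r = rendered_depth.reshape(-1)
    m = mono_depth.reshape(-1)
    r = r - r.mean()
    m = m - m.mean()
    eps = 1e-8  # guard against zero variance
    corr = (r * m).sum() / (r.norm() * m.norm() + eps)
    return 1.0 - corr
```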
