Closed cici19850 closed 1 week ago
Hey, it is difficult to say what the problem could be. The first thing I notice is the white walls, and these are quite difficult to regularize well with monocular depth regularization. How many images are in your dataset, and are they sharp (no motion blur)?
Can you run splatfacto on its own and compare it with dn_splatter? Does dn_splatter do worse than splatfacto? You can get eval metrics such as PSNR using the ns-eval command.
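For reference, the PSNR metric that ns-eval reports is just a function of the mean squared error between rendered and ground-truth images. A minimal sketch (the function name here is illustrative, not part of nerfstudio):

```python
import numpy as np

def psnr(img_a: np.ndarray, img_b: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between two images in [0, max_val]."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# A uniform error of 0.1 gives MSE = 0.01, so PSNR = 10 * log10(1 / 0.01) = 20 dB.
a = np.zeros((4, 4))
b = np.full((4, 4), 0.1)
print(round(psnr(a, b), 2))  # 20.0
```

Higher is better; indoor scenes that train well typically land well above 20 dB, so a large gap between the two methods is meaningful.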
Thank you very much for your prompt reply. The splatfacto training result is as follows; there are also many fog-like floaters. Also, if I want to capture data similar to the Replica dataset, what equipment do I need for shooting?
Hi, I think your "--normal-format" flag should be opengl, not opencv, if you are using my scripts.
For debugging, I suggest first enabling only the depth loss and checking whether the PearsonDepth loss improves the white walls. Then enable the normal loss and see whether that issue is resolved.
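For intuition, a Pearson-correlation depth loss is scale- and shift-invariant, which is why it works with monocular depth predictions whose absolute scale is unknown. A minimal numpy sketch of the idea (dn-splatter's actual implementation may differ):

```python
import numpy as np

def pearson_depth_loss(rendered: np.ndarray, mono: np.ndarray) -> float:
    """1 - Pearson correlation between rendered and monocular depth.

    Correlation is invariant to affine transforms of either input, so the
    unknown scale/shift of monocular depth predictions does not matter.
    """
    r = rendered.ravel() - rendered.mean()
    m = mono.ravel() - mono.mean()
    corr = (r * m).sum() / (np.sqrt((r * r).sum()) * np.sqrt((m * m).sum()) + 1e-8)
    return 1.0 - corr

d = np.linspace(0.5, 5.0, 16).reshape(4, 4)
print(round(pearson_depth_loss(d, 2.5 * d + 1.0), 6))  # 0.0 — affine-related depths agree
```

The loss is 0 when the two depth maps are related by any positive affine transform, and approaches 2 when they are anti-correlated.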
Replica is a "synthetic" dataset with ground-truth depth, normal, and mesh data. It is unlikely that you will be able to make a dataset of similar quality unless you have access to high-precision laser scanners and your poses are millimeter-accurate. If capturing with a smartphone, make sure your images have little motion blur.
Thank you very much, your reply has been very helpful. I will try the advice you provided. Thank you again.
There is another issue: if shooting with a smartphone and ensuring the images have almost no motion blur, is the following command sequence correct?
1) python dn_splatter/scripts/convert_colmap.py
2) python dn_splatter/scripts/normals_from_pretrain.py
3) python dn_splatter/scripts/align_depth.py
4) ns-train dn-splatter
Will the following warning have an impact?
@cici19850, the four commands you have run are fine. You can skip step 1) if you use e.g. ns-process-data or some other tool to process your camera poses. Step 2) just gives you normal estimates, and step 3) converts COLMAP SfM points into scale-aligned mono-depth estimates. For more information about step 3), I suggest looking at this paper, which does the same thing (they use gradient descent to solve for scale and shift, whereas my script uses the closed-form solution). The "average depth alignment error for batch depths is..." warning relates to this step: it measures how much the SfM points disagree with the monocular depth estimates.
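The closed-form scale/shift fit mentioned above is an ordinary least-squares problem. A minimal sketch of the idea (the function name and the exact error metric here are illustrative assumptions, not the script's actual code):

```python
import numpy as np

def align_depth(mono: np.ndarray, sfm: np.ndarray):
    """Closed-form least-squares fit of scale s and shift t so that
    s * mono + t ≈ sfm at pixels where sparse SfM depth exists.

    Solves min_{s,t} ||s * d + t - d_sfm||^2 via the normal equations.
    """
    d = mono.ravel()
    y = sfm.ravel()
    A = np.stack([d, np.ones_like(d)], axis=1)  # design matrix [d, 1]
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    err = np.abs(s * d + t - y).mean()  # average alignment error
    return s, t, err

mono = np.array([1.0, 2.0, 3.0, 4.0])
sfm = 2.0 * mono + 0.5  # SfM depths are an affine transform of mono depths
s, t, err = align_depth(mono, sfm)
print(round(s, 3), round(t, 3))  # 2.0 0.5
```

A large residual error after this fit means the monocular depth predictions disagree with the SfM geometry beyond a global scale and shift, which is what the warning is reporting.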
You can also skip step 3) and only run the python dn_splatter/scripts/depth_from_pretrain.py
command, which only generates monocular depth estimates (using ZoeDepth) and skips the COLMAP alignment step. This is fine if you are using the PearsonDepth loss, which is a relative loss; for other loss functions, the scale alignment is necessary.
In my experience and experiments, monocular depth supervision, even with the PearsonDepth loss, does not perform as well as using e.g. iPhone ToF lidar data; please see the table below. So if you are hoping to make a very accurate (good-geometry) indoor dataset, I highly recommend using a real depth sensor when capturing your scene. You can look at e.g. the MuSHRoom dataset for examples; it was captured with an iPhone camera with LiDAR.
Hello, thanks for the code. After setting up the environment, I ran the following command:
The training result is as follows. Is there any way to improve it?