jlartois opened this issue 9 months ago
Hi. Based on my experience, the depth range can be very important. I would suggest tuning the near and far values (e.g. the 2.885458 39.362592 you posted). First, the near value may be too large; that may be why the bottom part of the depth map is bad (those pixels are very close to the camera, so their depth may be smaller than 2.88, and then the network cannot estimate meaningful depth values). Second, the far value may be too large as well; you can manually clamp it to a constant value. For debugging, you could also try other MVS methods on the same dataset, e.g. another work of mine, IterMVS.
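To experiment with this, one can batch-edit the depth range in all cam.txt files. This is a minimal sketch assuming the MVSNet-style layout that PatchMatchNet's colmap conversion produces, where the last non-empty line of each cam.txt holds `depth_min depth_max`; the directory layout and file glob are assumptions to adapt to your dataset:

```python
import glob
import os


def set_depth_range(cams_dir, depth_min, depth_max):
    """Overwrite the depth-range line in every cam.txt under cams_dir.

    Assumes the MVSNet-style convention where the last non-empty line
    of each file is 'depth_min depth_max'.
    """
    for path in glob.glob(os.path.join(cams_dir, "*cam.txt")):
        with open(path) as f:
            lines = f.read().rstrip().split("\n")
        # After rstrip(), the final element is the depth-range line.
        lines[-1] = f"{depth_min} {depth_max}"
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")
```

This makes it cheap to sweep a few (near, far) pairs and compare the resulting depth maps.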
Hi, thanks for the quick reply. I can see how the depth range is vital for implementations like this; I also had to experiment with it for other MVS networks. In this case, however, changing the depth range does not seem to help.
Here is a depth map when setting the depth min and max values in each cam.txt to [1.0, 20.0] (note: the depth map is darker, but that is due to my choice of min and max when converting to a PNG, not to the "correctness" of the estimated depth):
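For clarity on why the brightness is unrelated to correctness: a common way to visualize a metric depth map is to clip to a chosen (min, max) window and rescale to 8-bit, so the window only shifts the gray levels. A sketch of that normalization (the function name and defaults are mine, not from the repo):

```python
import numpy as np


def depth_to_png_array(depth, d_min=1.0, d_max=20.0):
    """Normalize a metric depth map to 8-bit grayscale for visualization.

    The chosen (d_min, d_max) window only affects the brightness of the
    resulting image, not the underlying depth estimates.
    """
    d = np.clip(depth, d_min, d_max)
    gray = (d - d_min) / (d_max - d_min) * 255.0
    return gray.astype(np.uint8)
```

The returned array can be saved with any image library; picking a tighter window simply stretches the contrast over the range of interest.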
TL;DR: the depth of everything but the bench remains as bad as with the original depth range.
For completeness, here is the corresponding confidence map:
I think the confidence map illustrates how PatchMatchNet is struggling to find a good depth, except for the bench.
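One way to make that struggle explicit is to invalidate depth estimates whose photometric confidence falls below a threshold before fusion; if mostly the bench survives, the confidence map is telling the same story as the depth map. A minimal sketch (the threshold value is a guess to tune per scene, not a recommended setting):

```python
import numpy as np


def filter_by_confidence(depth, confidence, thresh=0.8):
    """Zero out depth estimates with confidence below `thresh`.

    Invalid pixels are set to 0 so a fusion step can skip them.
    Returns a copy; the input depth map is left untouched.
    """
    assert depth.shape == confidence.shape
    filtered = depth.copy()
    filtered[confidence < thresh] = 0.0
    return filtered
```

Sweeping the threshold and counting surviving pixels per view gives a quick, quantitative picture of where the network is uncertain.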
Btw, I also tried a depth range of [0.5, 10.0], but the depth maps were worse (even further from the "ground truth", with more noise and less geometric consistency).
Hi, I am confused about how masks are generated during the training of a customized dataset. @jlartois @FangjinhuaWang
Thank you very much for providing this code. I was able to get it running quite smoothly with the instructions provided. The fused.ply looks fine, but the depth maps look bad.
For anyone curious about what the PatchMatchNet results look like on an example real-world dataset, see this issue. I took 46 1920x1080 images around a bench. I have experience with MVS, so I took extreme care to fix the camera sensor parameters and minimize motion blur. The 46 images can be seen/downloaded here.
The colmap camera calibration looks fine:
The colmap fused.ply confirms that the calibration is good:
For completeness, instant-ngp (NeRF) is also able to correctly reconstruct the scene using the camera parameters from colmap:
These are the results of PatchMatchNet:
pair.txt looks perfect, so this is not an issue.
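For anyone who wants to rule this out on their own data, here is a small sanity check, assuming the MVSNet-style pair.txt layout (first line: number of views; then, per view, one line with the reference id and one line "N src_0 score_0 ... src_{N-1} score_{N-1}"):

```python
def check_pair_txt(path):
    """Parse and sanity-check a pair.txt in the assumed MVSNet layout.

    Returns {ref_id: [(src_id, score), ...]}; raises AssertionError
    if a source line does not match its declared count.
    """
    with open(path) as f:
        tokens = f.read().split("\n")
    num_views = int(tokens[0])
    pairs = {}
    for i in range(num_views):
        ref_id = int(tokens[1 + 2 * i])
        parts = tokens[2 + 2 * i].split()
        n = int(parts[0])
        assert len(parts) == 1 + 2 * n, f"malformed source line for view {ref_id}"
        pairs[ref_id] = [
            (int(parts[1 + 2 * j]), float(parts[2 + 2 * j])) for j in range(n)
        ]
    assert len(pairs) == num_views
    return pairs
```

Low or zero matching scores for most source views would point at a view-selection problem rather than a network problem.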
fused.ply looks okay in some places, others not so much:
but the depth maps look far from state-of-the-art. Here is an example (some depth maps are better, some worse; I chose a medium-quality result):
For completeness, here are the contents of the cam.txt file:
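(For readers unfamiliar with the format: to my understanding, a cam.txt in the MVSNet-style convention that PatchMatchNet's colmap conversion produces has roughly the layout below, with the depth range on the final line. The symbolic entries are placeholders, not values from this dataset.)

```
extrinsic
R00 R01 R02 t0
R10 R11 R12 t1
R20 R21 R22 t2
0.0 0.0 0.0 1.0

intrinsic
fx 0.0 cx
0.0 fy cy
0.0 0.0 1.0

DEPTH_MIN DEPTH_MAX
```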
Is this what you would expect? What would you recommend to get better depth maps?