jinmaodaomaye2021 opened this issue 1 year ago
Thanks for your reply!
The D(r) and N(r) are the ground-truth depth and normal maps. As described in the "Dataset" paragraph of Sec. 7, we test our method on our synthetic dataset and on some real datasets. For the synthetic dataset, the ground-truth depth and normal maps are readily available from the renderer, while for the real datasets, the ground truths are predicted similarly to MonoSDF.
As for MonoSDF, it uses the same source of depth and normal maps as ours for supervision, to ensure fairness.
We will make our synthetic dataset publicly available soon. See our supplementary material for dataset details.
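For concreteness, here is a rough PyTorch sketch of what such per-ray depth/normal supervision terms can look like (the tensor names are placeholders, and the exact form and weighting of Eqs. (11) and (12) may differ, so please check the paper and code):

```python
import torch
import torch.nn.functional as F

def depth_normal_losses(depth_pred, depth_gt, normal_pred, normal_gt, mask=None):
    """Generic depth/normal supervision over a batch of rays.

    depth_pred, depth_gt:   (N,)   rendered / reference depth per ray (D(r))
    normal_pred, normal_gt: (N, 3) rendered / reference normal per ray (N(r))
    mask:                   (N,)   optional validity mask
    """
    if mask is None:
        mask = torch.ones_like(depth_pred, dtype=torch.bool)

    # Eq.(11)-style term: L1 between the rendered depth and the reference depth D(r)
    l_depth = F.l1_loss(depth_pred[mask], depth_gt[mask])

    # Eq.(12)-style term: L1 plus angular consistency with the reference normal N(r)
    n_pred = F.normalize(normal_pred[mask], dim=-1)
    n_gt = F.normalize(normal_gt[mask], dim=-1)
    l_normal = (n_pred - n_gt).abs().sum(-1).mean() \
             + (1.0 - (n_pred * n_gt).sum(-1)).abs().mean()

    return l_depth, l_normal
```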
Thanks. Additional questions: 1) What about geometry reconstruction performance on ScanNet? The paper only provides comparisons on synthetic data. 2) Does I^2-SDF perform the same procedure as MonoSDF to solve for the scale and shift when using depth from public models as supervision?
Thanks. 1) Any quantitative evaluation on other real datasets? Just curious how false positives in the depth maps impact the bubble loss design.
Empirically, the bubble loss is robust against false positives to some extent, because of the smooth step in training (Sec. 6). In our ablation studies, we add noise to our depth maps to simulate false positives.
In contrast, the bubble loss will indeed be impacted by false negatives. For example, if the depth map provided by the dataset misses a chair leg, our method may struggle to reconstruct that chair leg. We leave this as future work.
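For reference, a toy example of how one might inject false-positive noise into a depth map in the spirit of that ablation (this is not the paper's exact protocol; `frac` and `scale` are hypothetical knobs):

```python
import numpy as np

def add_false_positive_noise(depth, frac=0.01, scale=0.2, rng=None):
    """Corrupt a fraction of pixels with spurious (false-positive) depth values.

    depth: (H, W) float array of metric depths
    frac:  fraction of pixels to corrupt
    scale: relative magnitude of the injected offsets
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = depth.copy()
    h, w = depth.shape
    n = int(frac * h * w)
    ys = rng.integers(0, h, n)
    xs = rng.integers(0, w, n)
    # push the selected pixels off the true surface by random offsets
    noisy[ys, xs] += rng.normal(0.0, scale * depth.mean(), n)
    return np.clip(noisy, a_min=1e-3, a_max=None)
```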
Yes, that's why we want to see the impact of the depth maps on real datasets. Though the ablation study adds synthetic noise to the depth maps, it is hard to compare that with the noise introduced by off-the-shelf depth estimation models. Ground-truth depth is not available in practice.
BTW, how do you generate the reconstruction result in Fig. 1? I just want to learn how to generate textured meshes.
Closing this issue due to inactivity. Re-open it if you have further questions.
Hi there, a follow-up question: for the real data (from [26] and [40]), are the depth maps acquired by rasterizing the provided mesh, or from off-the-shelf depth estimation models (e.g. DPT in MonoSDF)? And do those depth maps have correct absolute scale, or are they ambiguous in scale/shift so that you need a scale/shift-invariant depth loss?
Also, can you explain what the 1+3 real scenes from [26] and [40] are? At the bottom of the website of [40], they only list one scene of their own (Living Room2, which I cannot find a download link for) and two from Free-viewpoint (Living Room1 and Sofa)?
The real data's depths are estimated with MVS tools, so the depth maps have correct absolute scale and do not need scale/shift alignment.
The real scenes from [26] and [40] were all calibrated by the authors of [40] in their experiments (for the living room scene from [26], the re-calibration provides more precise depth and cameras than the original version). They haven't made their dataset public yet, and I've been asking them for permission to release the data I used in this repository.
That's great news! Looking forward to the release of the real scenes with calibrated depth. Also wondering if it is possible to release the tools used to get dense MVS depth for those scenes (and for third-party scenes)?
They used CapturingReality to calibrate their scenes (as reported in their paper), but I am not quite familiar with this field :)
Thanks! Also, just to confirm: to get depth/normal maps on the real scenes, did you rasterize with their provided mesh and poses? The scenes from Free-viewpoint do not include depth or normal maps; they only provide meshes and poses.
Depth maps can be acquired directly from the MVS tools, or rasterization is also OK, I think. I think estimating normal maps with monocular learning-based methods (as MonoSDF or NeuRIS do) is more precise than taking normals from MVS; the latter contain a lot of noise.
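In case it helps, here is a rough sketch of the rasterization/ray-casting option with Open3D's `RaycastingScene` (not the pipeline actually used in the paper; it assumes a world-to-camera extrinsic matrix and a mesh file such as the ones Free-viewpoint provides):

```python
import numpy as np
import open3d as o3d

def render_depth_normal(mesh_path, K, w2c, width, height):
    """Ray-cast a depth map and world-space normals from a mesh for one view.

    K:   (3, 3) camera intrinsics
    w2c: (4, 4) world-to-camera extrinsic matrix
    """
    mesh = o3d.io.read_triangle_mesh(mesh_path)
    scene = o3d.t.geometry.RaycastingScene()
    scene.add_triangles(o3d.t.geometry.TriangleMesh.from_legacy(mesh))

    rays = scene.create_rays_pinhole(
        intrinsic_matrix=o3d.core.Tensor(K, dtype=o3d.core.Dtype.Float64),
        extrinsic_matrix=o3d.core.Tensor(w2c, dtype=o3d.core.Dtype.Float64),
        width_px=width, height_px=height)
    ans = scene.cast_rays(rays)

    rays_np = rays.numpy()
    t_hit = ans['t_hit'].numpy()                 # ray parameter, inf where no hit
    normals = ans['primitive_normals'].numpy()   # world-space face normals

    # convert hit points to z-depth in the camera frame
    pts_world = rays_np[..., :3] + rays_np[..., 3:] * t_hit[..., None]
    pts_cam = pts_world @ w2c[:3, :3].T + w2c[:3, 3]
    depth = np.where(np.isfinite(t_hit), pts_cam[..., 2], 0.0)
    return depth, normals
```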
Yeah, I agree. I just want to confirm which option was used in the real-scene experiments in I^2-SDF: did you use semi-dense MVS depth by feeding the images into an MVS pipeline, or depth/normals rasterized from the provided mesh?
W.r.t. monocular depth (e.g. the DPT depth in MonoSDF), I don't see the current I^2-SDF code supporting a scale/shift-invariant depth loss. I will try out DPT depth/normals with the bubble loss, but I am not sure whether things will just work if I plug in DPT depth/normals, or whether changes need to be made to the losses.
I currently haven't tested I^2-SDF on monocular depths; all the depth maps I used in my experiments are absolute depths. By the way, I will probably release the real data I used in the next couple of days. For monocular depths with the bubble loss, one thing I worry about is that thin structures (e.g. chandeliers) in the monocular depth may only look visually correct rather than being correct in scale. After scale/shift alignment (e.g. via least squares), the point cloud projected from the monocular depth may end up in an erroneous area. But you can certainly try it first.
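For anyone who still wants to try monocular depth, a minimal sketch of the kind of least-squares scale/shift alignment mentioned above (a per-image alignment against some reference depth; MonoSDF's actual implementation aligns against the rendered depth during training, so treat this only as an approximation):

```python
import torch

def align_scale_shift(mono_depth, ref_depth, mask):
    """Solve min_{s,t} || s * mono_depth + t - ref_depth ||^2 over valid pixels."""
    d = mono_depth[mask].reshape(-1, 1)
    r = ref_depth[mask].reshape(-1, 1)
    A = torch.cat([d, torch.ones_like(d)], dim=1)   # (N, 2): [depth, 1]
    x = torch.linalg.lstsq(A, r).solution           # (2, 1): [scale, shift]
    scale, shift = x[0, 0], x[1, 0]
    return scale * mono_depth + shift, scale, shift
```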
Hi, I was confused by the statement that the bubble loss breaks the stable status of the SDF field that has converged so far. Why can't we merge the bubble step and the smooth step into one step?
Hi,
Great work. I have questions regarding Eq. (9) in the paper. What is the depth/normal supervision? Specifically, what are D(r) and N(r) in Eqs. (11) and (12), respectively?
1) MonoSDF uses public models to generate depth maps and normal maps to supervise the model. Just curious how the depth/normal supervision is generated for I^2-SDF. 2) Do I^2-SDF and MonoSDF share the same depth/normal supervision?
Thanks