JUGGHM opened 1 month ago
Hi, thanks for your interest in our work! The direction you mention sounds exciting. We have actually been considering an extension that uses 3D foundation models such as CroCo or Metric3D, though we haven't settled on any concrete ideas yet.
The pretrained model we currently offer works quite well even with fairly noisy depth data, though we haven't measured how much noise it can tolerate. However, the model we've trained so far has learned, to some extent, to simulate a low-level operation like depth-based warping, so if the given condition is badly wrong, it may follow that condition and warp accordingly. I think we can address this by accounting for it at the training stage.
Rather than relying on explicit depth maps, I'm optimistic about leveraging the representation from a 3D foundation model like CroCo to enable geometry-free warping.
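For readers unfamiliar with the term, here is a minimal sketch of depth-based warping as plain pinhole reprojection in NumPy. It is not the model's actual code: `warp_pixel`, the intrinsics `K`, and the relative pose `R`, `t` (assumed to map source-camera coordinates to target-camera coordinates) are illustrative assumptions with made-up values. It shows how an error in the depth value moves the warped pixel:

```python
import numpy as np

def warp_pixel(u, v, depth, K, R, t):
    """Reproject source pixel (u, v) with the given depth into a target camera.

    Hypothetical helper for illustration only; assumes a pinhole camera with
    shared intrinsics K and a pose (R, t) mapping source to target coordinates.
    """
    # Back-project the pixel to a 3D point in the source camera frame.
    p_src = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    # Transform the point into the target camera frame.
    p_tgt = R @ p_src + t
    # Project onto the target image plane and dehomogenize.
    uvw = K @ p_tgt
    return uvw[:2] / uvw[2]

# Made-up intrinsics and a small sideways camera shift (assumed values).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, 0.0, 0.0])

correct = warp_pixel(320, 240, 2.0, K, R, t)  # depth taken as ground truth
wrong = warp_pixel(320, 240, 1.0, K, R, t)    # depth underestimated by half

print(correct, wrong)
```

With these made-up values, halving the depth moves the warped pixel from x = 345 to x = 370, which is the kind of misplacement a warping-based condition would pass on to the model.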
Authors, thank you for this great work! I wonder whether the model still works when the point maps or the metric depth maps are highly incorrect. I ask because the full framework seems to be built on roughly satisfactory warped points. For instance, CroCo v2 relies fully on semantic cues and can benefit downstream applications like DUSt3R.