SysCV / r3d3

BSD 3-Clause "New" or "Revised" License

Test in a new scene #3

Open JOP-Lee opened 11 months ago

JOP-Lee commented 11 months ago

Hello, I want to know whether the pre-trained model can be used to estimate an absolute (metric) depth map in a new scene, e.g. from an input RGB image or a video sequence. If so, how can the scale information of the multiple depth maps estimated by the pre-trained model be recovered? I want to fuse multiple depth maps into a single point cloud, as your video demo shows. Do you have any suggestions? I would appreciate it very much.

AronDiSc commented 11 months ago

Hi @JOP-Lee. All predictions made by our evaluation pipeline are in metric scale. Metric scale is recovered through matched features in the overlapping regions of the calibrated multi-camera system (see Sec. 3.2 in the paper). Furthermore, the prior of the completion network is also in metric scale, since it was trained with metric-scale poses obtained the same way. Thus, you can combine the resulting depth maps into a point cloud as follows:

  1. Run evaluate.py and save the predicted depth maps and poses
  2. Unproject depth maps by using the respective camera intrinsics
  3. Transform points from the camera reference frame to the world reference frame with the estimated poses
  4. (Optional) Filter outliers
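
The unprojection and pose transform in steps 2 and 3 can be sketched as follows. This is a minimal NumPy illustration, not code from the repository: `depth_to_world_points` is a hypothetical helper, and the conventions assumed are a pinhole intrinsics matrix `K` and a camera-to-world pose `T_wc` (if `evaluate.py` saves world-to-camera poses instead, invert them first).

```python
import numpy as np

def depth_to_world_points(depth, K, T_wc):
    """Unproject a metric depth map to world-frame 3D points.

    depth: (H, W) metric depth map
    K:     (3, 3) pinhole camera intrinsics
    T_wc:  (4, 4) camera-to-world pose
    Returns an (H*W, 3) array of world-frame points.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates, one row per pixel
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Back-project pixels to rays in the camera frame (step 2)
    rays = pix @ np.linalg.inv(K).T
    # Scale rays by metric depth to get camera-frame points
    pts_cam = rays * depth.reshape(-1, 1)
    # Transform camera-frame points to the world frame (step 3)
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (T_wc @ pts_h.T).T[:, :3]
```

Accumulating the outputs of this function over all frames and cameras yields the fused point cloud; for the optional outlier filtering in step 4, a statistical or radius-based filter (e.g. from Open3D) is a common choice.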
JOP-Lee commented 11 months ago

@AronDiSc Thank you for your response. Could you provide a script for testing a new scene from a single image or a multi-camera setup? It seems that evaluate.py is intended for evaluating on the DDAD and nuScenes datasets and requires masks and poses. For beginners testing new scenes, it would be very convenient to predict depth maps from single images (similar to monodepth2).