zhyever / PatchFusion

[CVPR 2024] An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation
https://zhyever.github.io/patchfusion/
MIT License

Ground truth depth #9

Closed: mattiasmar closed this issue 3 months ago

mattiasmar commented 7 months ago

How do you suggest that one can create dense ground truth depth for training?

mr-lab commented 7 months ago

Unity or Unreal depth renders of many 3D objects and environments, or a depth camera, or a free dataset from the net.

zhyever commented 7 months ago

Yeah, there are free datasets online. You could also collect data using RGB-D cameras or LiDAR, but you might need to do some synchronization or alignment work. Another way is to use synthetic data, of course.
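
For the alignment part, here is a minimal sketch (not PatchFusion code) of projecting LiDAR points into the camera frame to build a sparse ground-truth depth map. The intrinsics K, the LiDAR-to-camera extrinsics T_cam_lidar, and the function name are illustrative assumptions; they would come from your own calibration and pipeline.

```python
import numpy as np

def lidar_to_sparse_depth(points_lidar, K, T_cam_lidar, height, width):
    """points_lidar: (N, 3) xyz in the LiDAR frame -> (H, W) sparse depth map."""
    # Transform points into the camera frame.
    pts_h = np.hstack([points_lidar, np.ones((points_lidar.shape[0], 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    # Keep only points in front of the camera.
    pts_cam = pts_cam[pts_cam[:, 2] > 0]

    # Project with the pinhole model.
    uv = (K @ pts_cam.T).T
    u = np.round(uv[:, 0] / uv[:, 2]).astype(int)
    v = np.round(uv[:, 1] / uv[:, 2]).astype(int)
    z = pts_cam[:, 2]

    # Discard projections that fall outside the image.
    valid = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[valid], v[valid], z[valid]

    # Resolve collisions by keeping the nearest point per pixel.
    depth = np.zeros((height, width), dtype=np.float32)
    order = np.argsort(-z)  # far to near, so nearer points overwrite farther ones
    depth[v[order], u[order]] = z[order]
    return depth  # 0 marks pixels with no LiDAR return
```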

mattiasmar commented 7 months ago

Many RGB-D cameras and all LiDARs produce sparse depth. With RGB-D cameras there are typically regions, whole surfaces, and object edges with missing or invalid depth. In order to train your system on domain-specific data, how do you suggest coping with these missing data points?

zhyever commented 7 months ago

For me, this is still an open issue. A straightforward way would be to use a depth completion approach to fill the holes first. If we treat depth maps as a kind of image, recent strategies from the field of generative models could also be used to fill the holes.
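
As a minimal sketch of the "fill the holes first" idea, the snippet below uses OpenCV inpainting as a simple stand-in for a dedicated depth-completion model. The function name and the convention that 0 marks invalid pixels are assumptions about your data, not part of PatchFusion.

```python
import cv2
import numpy as np

def fill_depth_holes(depth, max_depth=10.0):
    """depth: (H, W) float32 metric depth with 0 at invalid pixels."""
    hole_mask = (depth <= 0).astype(np.uint8)

    # cv2.inpaint needs an 8-bit image here, so normalize, inpaint, and map back.
    depth_8u = np.clip(depth / max_depth * 255.0, 0, 255).astype(np.uint8)
    filled_8u = cv2.inpaint(depth_8u, hole_mask, 5, cv2.INPAINT_NS)

    filled = filled_8u.astype(np.float32) / 255.0 * max_depth
    # Keep the original measurements where they were valid.
    return np.where(hole_mask > 0, filled, depth)
```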

mattiasmar commented 7 months ago

How about disregarding pixels with unreliable depth in the cost function? Would that be a stable option?
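
A minimal sketch of that idea, masking out unreliable pixels so they contribute no gradient. This uses a scale-invariant log (SILog) style loss as an example, not PatchFusion's actual training loss; the validity mask (e.g. depth > 0) and the lambda value are assumptions about your setup.

```python
import torch

def masked_silog_loss(pred, target, valid_mask, lam=0.85, eps=1e-6):
    """pred, target: (B, 1, H, W) metric depth; valid_mask: same shape, bool."""
    # Select only pixels with reliable ground truth.
    pred = pred[valid_mask].clamp(min=eps)
    target = target[valid_mask].clamp(min=eps)

    log_diff = torch.log(pred) - torch.log(target)
    # Scale-invariant log loss, averaged only over valid pixels.
    return torch.sqrt(((log_diff ** 2).mean() - lam * log_diff.mean() ** 2).clamp(min=eps))
```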

mr-lab commented 7 months ago

How about disregarding pixels with unreliable depth in the cost function? Would that be a stable option?

You can do a grayscale-image check on the depth, for example Mathf.Abs(depth - grayscaleImage) > DifferenceThreshold. Or you can use any open-source depth estimation model to generate depth and use it as ground truth to fix the holes: detect the difference between the generated depth and the RGB-D depth, then replace the missing parts of the RGB-D depth with it.
Finally, you can use PatchFusion to patch your RGB-D depth results by feeding them in as iter_pred=, but you will have to convert your data to the same resolution and array/tensor size. I have never used an RGB-D camera, so excuse my ignorance in advance. Cheers.
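
A minimal sketch of the "fill holes with a predicted depth map" suggestion above. Here `predicted` stands for the output of any depth estimation model resized to the sensor resolution, and the least-squares scale/shift alignment is an assumption added to make the two depth sources comparable before merging; the function name is hypothetical.

```python
import numpy as np

def fill_holes_with_prediction(sensor_depth, predicted):
    """sensor_depth, predicted: (H, W) float32; 0 marks invalid sensor pixels."""
    valid = sensor_depth > 0

    # Align the prediction to the sensor depth on valid pixels (scale + shift).
    A = np.stack([predicted[valid], np.ones(valid.sum())], axis=1)
    scale, shift = np.linalg.lstsq(A, sensor_depth[valid], rcond=None)[0]
    aligned = scale * predicted + shift

    # Keep real measurements, use the aligned prediction only in the holes.
    return np.where(valid, sensor_depth, aligned)
```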