WuJuli opened this issue 3 months ago
The self.model in that snippet will output relative depth, since the depth_anything_vits14.pth checkpoint of DepthAnything is not a metric monocular depth estimator. So the authors use the metric depth predicted by ZoeDepth to rescale the relative depths from DepthAnything, in an attempt to make them metric.
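For intuition, here is a minimal sketch (not the authors' actual code) of how a relative depth map can be aligned to a metric one by fitting a per-frame scale and shift with least squares. The function name and the assumption that alignment happens directly in depth space are mine; since DepthAnything's relative output is disparity-like, the fit may need to be done in inverse-depth space instead.

```python
import numpy as np

def align_relative_to_metric(rel_depth: np.ndarray, metric_depth: np.ndarray) -> np.ndarray:
    """Fit scale and shift so the relative depth map matches a metric one.

    rel_depth:    HxW relative prediction (e.g. from DepthAnything)
    metric_depth: HxW metric prediction (e.g. from ZoeDepth)
    Returns the rescaled relative depth, now in (approximately) metric units.
    """
    rel = rel_depth.ravel()
    met = metric_depth.ravel()
    valid = met > 0  # ignore invalid / zero-depth pixels

    # Solve min ||scale * rel + shift - met||^2 over valid pixels
    A = np.stack([rel[valid], np.ones(valid.sum())], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, met[valid], rcond=None)
    return scale * rel_depth + shift
```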
DepthAnything also has fine-tuned metric checkpoints here, in case you want to try some of those and see if they work better. I don't know what the currently best way to get metric monocular depth for videos is; as far as I know, it is still an open research problem.
Why use the 3D part from zoedepth_nk?