Open ootts opened 3 years ago
Hi - thanks for your interest in the project!
Right yes, so the problem with this is that the depth network's predictions will be in the same scale as the pose network's predictions, which is some unknown, arbitrary scale.
I'm trying to think of a way to use gt pose to scale the depth estimates, but it isn't immediately obvious.
One way you could do it would be to ask the depth + pose networks to make predictions as normal, and afterwards multiply your depths by the ratio of the ground truth translation magnitude to the predicted translation magnitude. I can't guarantee that this will give a good result, but I'd be interested to hear how you get on.
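To make the ratio idea concrete, here is a minimal sketch, not code from this repository. It assumes you already have the predicted and ground truth translation vectors for the same frame pair as plain sequences of three floats. The names `translation_scale` and `rescale_depth` are hypothetical helpers made up for illustration.

```python
import math


def translation_scale(pred_translation, gt_translation, eps=1e-8):
    """Return ||t_gt|| / ||t_pred|| for one frame pair.

    Both arguments are 3-element sequences (x, y, z). The function names
    and arguments are illustrative assumptions, not part of the repo.
    """
    pred_norm = math.sqrt(sum(c * c for c in pred_translation))
    gt_norm = math.sqrt(sum(c * c for c in gt_translation))
    if pred_norm < eps:
        # Near-zero predicted motion makes the ratio meaningless.
        raise ValueError("predicted translation is too small to estimate a scale")
    return gt_norm / pred_norm


def rescale_depth(depth_rows, scale):
    """Multiply every depth value (a list of rows) by the scale factor."""
    return [[d * scale for d in row] for row in depth_rows]


# Example: the network predicts half the true camera motion,
# so its depths are assumed to be half the true depths as well.
scale = translation_scale([0.0, 0.0, 0.5], [0.0, 0.0, 1.0])
metric_depth = rescale_depth([[1.0, 2.0], [3.0, 4.0]], scale)
```

A single frame pair can give a noisy ratio, especially when the camera barely moves, so averaging or taking the median of the ratio over several pairs may be more stable.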
@mdfirman any thoughts?
I tried using gt_pose and dropping the posenet in monodepth, and the output scale is almost correct (about 0.9*gt_depth), so I assume this will work for manydepth too? What I wonder is how I can finetune a pretrained model, whose scale is arbitrary, to get real-world-scale results. In monodepth I scale the ground truth to the pretrained model's scale, and scale back at prediction time. I wonder if there's a better way to do this.
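For reference, the scale-and-scale-back workaround described above could look roughly like the sketch below. The comment does not say how the scale factor is obtained, so the median ratio used here is an assumption, and all function names are hypothetical. Depths are given as flat lists of floats, with non-positive values treated as invalid pixels.

```python
from statistics import median


def estimate_scale(pred_depths, gt_depths):
    """Estimate a scalar s such that s * pred is roughly gt.

    Uses median(gt) / median(pred) over valid pixels, which is an
    assumption made for this sketch.
    """
    valid = [(p, g) for p, g in zip(pred_depths, gt_depths) if p > 0 and g > 0]
    if not valid:
        raise ValueError("no valid depth pairs to estimate a scale from")
    return median(g for _, g in valid) / median(p for p, _ in valid)


def to_network_scale(gt_depths, scale):
    """Bring metric ground truth down to the pretrained model's scale (training)."""
    return [g / scale for g in gt_depths]


def to_metric_scale(pred_depths, scale):
    """Bring network predictions back up to metric scale (prediction)."""
    return [p * scale for p in pred_depths]


# Example: the first pixel has no ground truth and is ignored.
scale = estimate_scale([1.0, 2.0, 3.0], [0.0, 8.0, 12.0])
targets = to_network_scale([8.0, 12.0], scale)
metric = to_metric_scale([2.0, 3.0], scale)
```

The scale would be estimated once from the pretrained model's outputs and then kept fixed, so that training targets and predictions are converted consistently.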
@biggiantpigeon Hi. Have you tried using gt_pose in manydepth? Is it feasible? Thank you!
Hi, I have a question about using ground-truth camera poses instead of predicted camera poses. I tried using camera poses with the correct scale from the KITTI dataset, but the scale of the output is still not correct. Is there anything I missed? I only changed the code as follows.
Thanks a lot!