ClementPinard / SfmLearner-Pytorch

Pytorch version of SfmLearner from Tinghui Zhou et al.
MIT License

Training with ground-truth poses rather than PoseExpNet #69

Closed an-kumar closed 4 years ago

an-kumar commented 5 years ago

I tried modifying the code to take the ground truth pose as input and use that for the warp rather than learning a PoseNet. I expected this to help convergence since the depth network would not be getting trained on bad poses in early stages. However, it ended up being considerably worse than using PoseExpNet.

I'm still looking into it to make sure I didn't do something wrong, but I was wondering if you ever tried this or have any thoughts. Perhaps the "ground-truth" poses are not that accurate, so the learned pose network ends up being more precise? If that's the case, then the odometry benchmarks are somewhat meaningless...
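For anyone trying the same experiment: the ground-truth 4x4 transforms have to be converted into whatever relative-pose format the warp expects. A minimal sketch of one such conversion, assuming the warp takes a [tx, ty, tz, rx, ry, rz] translation-plus-Euler vector (as a learned pose network typically outputs); the helper name and Euler convention are mine, not the repo's:

```python
import numpy as np

def mat_to_6dof(T):
    """Convert a 4x4 relative transform into a [tx, ty, tz, rx, ry, rz]
    vector (translation + x-y-z Euler angles). Hypothetical helper for
    illustration; check the warp's actual expected convention."""
    tx, ty, tz = T[:3, 3]
    R = T[:3, :3]
    # Recover Euler angles from the rotation matrix (x-y-z convention)
    ry = np.arcsin(np.clip(-R[2, 0], -1.0, 1.0))
    rx = np.arctan2(R[2, 1], R[2, 2])
    rz = np.arctan2(R[1, 0], R[0, 0])
    return np.array([tx, ty, tz, rx, ry, rz])
```

A convention mismatch at this step (e.g. feeding a camera-to-world transform where the warp expects world-to-camera, or the wrong Euler order) would silently degrade training, which is one easy way for this experiment to go wrong.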

ClementPinard commented 5 years ago

Amen to that :)

RTK GPS is still sometimes very noisy, and its main interest is long-range odometry evaluation. A good hint of that is the official KITTI odometry benchmark, which dropped evaluation on sequences that are too short:

http://www.cvlibs.net/datasets/kitti/eval_odometry.php

On 03.10.2013 we have changed the evaluated sequence lengths from (5,10,50,100,...,400) to (100,200,...,800) due to the fact that the GPS/OXTS ground truth error for very small sub-sequences was large and hence biased the evaluation results. Now the averages below take into account longer sequences and provide a better indication of the true performance.

However, here are some points that might help you:

https://github.com/ClementPinard/SfmLearner-Pytorch/blob/master/kitti_eval/depth_evaluation_utils.py#L89

an-kumar commented 5 years ago

Ok I’ll try that.

To confirm, though: the poses provided in the poses.txt file when you prepare the data with --with-pose are given as camera-to-world transformations, not world-to-camera, right?

ClementPinard commented 5 years ago

It's the position of the camera with respect to the first position of the camera, just like in the odometry dataset. So it's a camera-to-world transformation, with the world origin set at the first camera position in the sequence.
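With that convention, the relative transform between two frames (what the warp actually consumes) has to be derived from the two absolute poses. A minimal sketch, assuming 4x4 homogeneous camera-to-world matrices as described above; the helper name is mine, not the repo's:

```python
import numpy as np

def relative_pose(T_world_from_a, T_world_from_b):
    """Transform mapping points from camera b's frame into camera a's
    frame, given both cameras' 4x4 camera-to-world matrices (world
    origin = first camera of the sequence, as in poses.txt)."""
    # T_a<-b = inv(T_world<-a) @ T_world<-b
    return np.linalg.inv(T_world_from_a) @ T_world_from_b
```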

sconlyshootery commented 5 years ago

Hi, your code is very nice. I want to figure out how to get the relative pose from the speed detected by the IMU?

ClementPinard commented 4 years ago

See here how I did it : https://github.com/ClementPinard/SfmLearner-Pytorch/blob/master/kitti_eval/depth_evaluation_utils.py#L89
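For a rough idea of what such a conversion involves (the linked code reads the KITTI OXTS files; this is only a simplified sketch under a ground-plane assumption, integrating forward speed and yaw rate over fixed time steps, not the exact computation in depth_evaluation_utils.py):

```python
import numpy as np

def pose_from_speed(speeds, yaw_rates, dt):
    """Integrate forward speed (m/s) and yaw rate (rad/s) samples into
    a planar relative pose, returned as a 3x3 SE(2) matrix."""
    x = y = yaw = 0.0
    for v, w in zip(speeds, yaw_rates):
        x += v * np.cos(yaw) * dt
        y += v * np.sin(yaw) * dt
        yaw += w * dt
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, x],
                     [s,  c, y],
                     [0.0, 0.0, 1.0]])
```

In practice the OXTS records give full 3D orientation and per-sample timestamps, so the real computation is richer, but the idea is the same: dead-reckon the velocities into a displacement between the two frames.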