MichaelGrupp / evo

Python package for the evaluation of odometry and SLAM
https://michaelgrupp.github.io/evo/
GNU General Public License v3.0

Unexpected APE values for trajectory with tracking failure #673

Closed SerseusWasTaken closed 3 weeks ago

SerseusWasTaken commented 3 weeks ago

Hello, thank you for the nice software! I'm using evo to calculate the APE for two estimated trajectories for the KITTI 00 sequence, one from ORB-SLAM3 and the other one from LIFT-SLAM. However, the results are not what I expected. LIFT-SLAM lost track during the majority of the sequence but its APE is smaller than the one for ORB-SLAM3.

Trajectory of LIFT-SLAM: [image: plot_trajectories]
File: LiftSlam.txt APE:

| max | mean | median | min | rmse | sse | std |
|---|---|---|---|---|---|---|
| 4.8021989475719 | 1.1087224511900344 | 0.8095152930675219 | 0.3506396104194617 | 1.3689082221280797 | 511.5773537264919 | 0.8028974074170514 |

Trajectory of ORB-SLAM3: [image: plot_trajectories]
File: OrbSlam3.txt APE:

| max | mean | median | min | rmse | sse | std |
|---|---|---|---|---|---|---|
| 18.231462576346605 | 8.404354694993438 | 7.291069728415576 | 0.8180358209648926 | 9.516925333008539 | 189838.6348963491 | 4.4652760222411185 |

The command I used to calculate the APE: evo_ape tum 00_tum.txt CudaOrbSlam3.txt -r full -va -as -p --plot_mode xyz --save_plot APE_plot --save_results ./APE.zip

I would've expected that the APE for LIFT-SLAM would be higher here, compared to the one by ORB-SLAM3, since its trajectory is completely off. Am I missing something here, or is that expected? Thank you for your help!

Here is the ground truth file I used (it's the KITTI 00 ground truth converted to TUM format using your script): 00_tum.txt

MichaelGrupp commented 3 weeks ago

That trajectory is missing a lot of data, but in the sections where it has data, it seems to fit the ground truth well. The metric calculation does not take into account whether a trajectory has less data; it computes the metric only for the part of the data that can be matched with the ground truth.
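The difference in matched data can be read off the statistics posted in this issue. Assuming rmse is the square root of the mean squared error and sse is the sum of squared errors over the same matched poses, the number of matched poses is n = sse / rmse². The following is a minimal sketch under that assumption; the helper name `matched_pose_count` is made up for illustration and is not part of evo.

```python
# (rmse, sse) pairs copied from the APE statistics posted in this issue.
stats = {
    "LiftSlam.txt": (1.3689082221280797, 511.5773537264919),
    "OrbSlam3.txt": (9.516925333008539, 189838.6348963491),
}


def matched_pose_count(rmse, sse):
    """Hypothetical helper: recover n from rmse = sqrt(sse / n)."""
    return round(sse / rmse ** 2)


counts = {
    name: matched_pose_count(rmse, sse) for name, (rmse, sse) in stats.items()
}
for name, n in counts.items():
    print(f"{name}: {n} matched poses")
```

With the values above this gives 273 matched poses for LiftSlam.txt and 2096 for OrbSlam3.txt, so the LIFT-SLAM statistics are based on far fewer poses.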

So you have to take this into account when comparing two results, e.g. by doing a visual check of the plots etc.

A fairer comparison of the two trajectories here would be to compute the metric only for time ranges in which both have data, by choosing --t_start <start time> --t_end <end time>.
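One way to find such time ranges is to split each trajectory's timestamps into tracked segments and intersect them. The sketch below is only an illustration: the function names, the gap threshold of 0.15 s, and the toy timestamps are all assumptions, not part of evo or of the data in this issue. For real files, the timestamps would come from the first column of each TUM file.

```python
def tracked_segments(stamps, max_gap):
    """Split sorted timestamps into (start, end) runs without gaps above max_gap."""
    segments = []
    start = prev = stamps[0]
    for t in stamps[1:]:
        if t - prev > max_gap:
            # A gap larger than max_gap is treated as lost tracking.
            segments.append((start, prev))
            start = t
        prev = t
    segments.append((start, prev))
    return segments


def intersect(segs_a, segs_b):
    """Intersect two sorted lists of (start, end) intervals."""
    out = []
    i = j = 0
    while i < len(segs_a) and j < len(segs_b):
        lo = max(segs_a[i][0], segs_b[j][0])
        hi = min(segs_a[i][1], segs_b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if segs_a[i][1] < segs_b[j][1]:
            i += 1
        else:
            j += 1
    return out


# Toy timestamps in seconds, made up for illustration.
full = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
lossy = [0.0, 0.1, 0.2, 0.7, 0.8, 0.9]

common = intersect(tracked_segments(full, 0.15), tracked_segments(lossy, 0.15))
for t_start, t_end in common:
    print(f"--t_start {t_start} --t_end {t_end}")
```

Each printed pair is a candidate range to pass to both evo_ape runs, so that the two estimates are evaluated over the same part of the sequence.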

(or simply consider that this high amount of lost tracking is a general failure, independent of the metric)

SerseusWasTaken commented 3 weeks ago

Okay I see, that makes sense. I thought the APE was always calculated with respect to the full ground truth trajectory, which is where my confusion came from.
And yeah, I would definitely consider that high amount of tracking failure a general failure.

Thanks a lot! That cleared up my confusion :)