TwiceMao opened 5 months ago
I normalized the first frame of GT_poses.txt. To be specific, I extracted the pose of the first frame from GT_poses.txt and then, for each frame, multiplied its pose with that of the first frame to obtain GT_poses_setorigin.txt.
If I understand correctly, your goal was to let GT_poses start at the origin. This would mean that the trajectory was transformed rigidly to the origin, without changing its shape.
But when I look at GT_poses vs GT_poses_setorigin, they have completely different shapes. This can be seen for example by plotting, comparing their path lengths (8cm vs 24cm), or by calculating their difference with a metric (as you also noticed, this should be zero but it's not).
So I suspect that your transformation step introduced this error and messed up your "setorigin" groundtruth trajectory. This can happen e.g. if you're using pose matrices and do the matrix multiplication in the wrong order (P*T vs T*P), but here you need to check the math of your code yourself.
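A minimal sketch of why the order matters, with numpy 4x4 matrices, taking P as the rigid transformation and T as one pose of the trajectory (illustrative values, not your data):

```python
import numpy as np

def se3(rot_z_deg, t):
    """Build a 4x4 SE(3) pose from a rotation about the z axis and a translation."""
    a = np.radians(rot_z_deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(a), -np.sin(a)],
                 [np.sin(a),  np.cos(a)]]
    T[:3, 3] = t
    return T

P = se3(90.0, [0.0, 0.0, 0.0])  # rigid transformation to apply
T = se3(0.0, [1.0, 2.0, 0.0])   # one pose of the trajectory

left = P @ T   # P*T: transforms the pose in the world frame
right = T @ P  # T*P: transforms in the pose's local frame, a different result

print(left[:3, 3])   # position rotated by P (approximately [-2, 1, 0])
print(right[:3, 3])  # position unchanged (approximately [1, 2, 0])
```

Applied to a whole trajectory, the wrong order moves each pose by a different amount and therefore deforms the shape, which is exactly the symptom described above.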
> the distances between the estimated poses and the ground truth poses, before and after rigid body transformation, are different
Not sure what the issue is here, this is the expected behavior. The alignment minimizes the distance. So unless you have already optimally aligned data, the distance will change compared to the unaligned case.
Running a metric to compare estimated_poses vs GT_poses gives a normal result (like "1.)" in your examples). They are similar and the alignment looks fine.
GT_poses_setorigin doesn't work well, but that's rather because of the general issues of that trajectory mentioned before, not because of alignment or the metric.
@MichaelGrupp Thanks! Yes, this is consistent with what you mentioned. I made a mistake in obtaining GT_poses_setorigin: I should have performed the matrix multiplication P*T, but I mistakenly did T*P, where T represents a pose from GT_poses and P represents a rigid transformation. After executing the correct matrix multiplication, the comparison between GT_poses and GT_poses_setorigin became reasonable.
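For reference, the set-origin step can be written as one left-multiplication by the inverse of the first pose; a sketch with numpy, assuming 4x4 camera-to-world matrices (helper names are illustrative):

```python
import numpy as np

def set_origin(poses):
    """Rigidly move a trajectory of 4x4 c2w poses so that the first pose
    becomes the identity; all relative poses (the shape) are preserved."""
    T0_inv = np.linalg.inv(poses[0])
    return [T0_inv @ T for T in poses]

# demo with two hand-made poses
T0 = np.eye(4); T0[:3, 3] = [1.0, 2.0, 3.0]
T1 = np.eye(4); T1[:3, 3] = [1.5, 2.0, 3.0]
moved = set_origin([T0, T1])
print(np.allclose(moved[0], np.eye(4)))  # True: trajectory now starts at the origin
print(moved[1][:3, 3])                   # relative offset preserved: [0.5, 0, 0]
```

Because the same transformation is applied to every pose, relative poses inv(T_i) @ T_j, and hence the path length, stay unchanged.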
I also have follow-up questions:
(1) Do you think it's reasonable to calculate the ATE for the rotational part of camera poses?
(2) Do you think it's reasonable to align camera poses using Umeyama's method first and then calculate the ATE for the rotational part of the aligned camera poses?
(3) Is it only reasonable to calculate RPE for the rotational part of camera poses?
(4) How is your --align_origin implemented?
(5) For GT_poses and estimated_poses, their first camera poses are different, so to compute their distance, I first transform the camera pose of the first frame in GT_poses into the identity matrix through a rigid transformation applied to all camera poses in GT_poses. A similar operation is performed on estimated_poses to ensure that its first frame is also the identity matrix. Then, I calculate the ATE for both the translation and rotation parts of GT_poses and estimated_poses after these operations. Do you think this approach is reasonable?
(6) Is there any other alignment method that can align the rotational part of camera poses?
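For context on (4), an origin alignment can be sketched as one rigid transformation that maps the first estimated pose onto the first reference pose and is then applied to the whole trajectory (a hypothetical numpy helper, not evo's actual code):

```python
import numpy as np

def align_origin(est_poses, ref_poses):
    """Rigidly move the estimated trajectory so that its first pose
    coincides with the first pose of the reference trajectory."""
    to_ref = ref_poses[0] @ np.linalg.inv(est_poses[0])
    return [to_ref @ T for T in est_poses]

# demo: estimate starts 1m off; reference starts at the identity
est0 = np.eye(4); est0[:3, 3] = [1.0, 0.0, 0.0]
est1 = np.eye(4); est1[:3, 3] = [2.0, 0.0, 0.0]
aligned = align_origin([est0, est1], [np.eye(4)])
print(np.allclose(aligned[0], np.eye(4)))  # True: first poses now coincide
print(aligned[1][:3, 3])                   # [1, 0, 0]: shape unchanged
```

Note that this is the same idea as the normalization in (5), except that both trajectories are expressed relative to the reference's first pose instead of the identity.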
The reason for these doubts is that evo's -a/-as options adopt Umeyama's method. However, Umeyama's method takes only the translational part of the camera poses as input, not the rotational part. Nevertheless, its output is a similarity transformation, and this transformation is applied to each camera pose, which changes the rotation of each pose.
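This effect can be reproduced directly. Below is a sketch of Umeyama's closed-form solution with the scale fixed to 1, computed from xyz positions only; applying the result left-multiplies every pose's rotation by the alignment rotation R, so the rotational APE changes even though rotations never entered the optimization (illustrative code, not evo's implementation):

```python
import numpy as np

def umeyama_rotation(src, dst):
    """Least-squares rotation and translation (scale fixed to 1) aligning
    src points to dst points, following Umeyama's closed-form solution."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    cov = (dst - mu_d).T @ (src - mu_s) / len(src)
    U, _, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # avoid a reflection
    R = U @ S @ Vt
    t = mu_d - R @ mu_s
    return R, t

# reference positions (non-collinear) and an estimate rotated by 90 deg about z
ref = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
a = np.radians(90.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.],
                   [np.sin(a),  np.cos(a), 0.],
                   [0.,         0.,        1.]])
est = ref @ R_true.T  # rotate every position

R, t = umeyama_rotation(est, ref)
print(np.allclose(R @ R_true, np.eye(3), atol=1e-9))  # True: R undoes the rotation

# aligning a pose left-multiplies it, so its rotation part becomes R @ R_pose:
R_pose = np.eye(3)
print(np.allclose(R @ R_pose, R))  # True: the pose's rotation changed by R
```

So even though only positions go into the optimization, every aligned pose's orientation is rotated by R; whether that helps or hurts the rotational ATE depends on the data.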
And I have an example: there are two sets of poses in TUM format, tum_1.txt and tum_2.txt. After aligning the two sets of poses using Umeyama's method, the value of the ATE for the rotation part actually increased.
(1) Before alignment, the distance for the rotation part was:
evo_ape tum tum_1.txt tum_2.txt -v --pose_relation angle_deg
--------------------------------------------------------------------------------
Compared 30 absolute pose pairs.
Calculating APE for rotation angle in degrees pose relation...
--------------------------------------------------------------------------------
APE w.r.t. rotation angle in degrees (deg)
(not aligned)
max 0.754146
mean 0.664145
median 0.669600
min 0.601968
rmse 0.665402
sse 13.282815
std 0.040897
(2) After alignment, the distance for the rotation part became:
evo_ape tum tum_1.txt tum_2.txt -v -as --pose_relation angle_deg
..with max. time diff.: 0.01 (s) and time offset: 0.0 (s).
--------------------------------------------------------------------------------
Aligning using Umeyama's method...
Rotation of alignment:
[[ 0.51250773 0.66465388 0.54366446]
[-0.48380259 0.74658902 -0.45666168]
[-0.70941588 -0.02898363 0.70419391]]
Translation of alignment:
[-0.00581512 0.00968778 -0.01290893]
Scale correction: 1.0
--------------------------------------------------------------------------------
Compared 30 absolute pose pairs.
Calculating APE for rotation angle in degrees pose relation...
--------------------------------------------------------------------------------
APE w.r.t. rotation angle in degrees (deg)
(with SE(3) Umeyama alignment)
max 61.279638
mean 61.129781
median 61.180745
min 60.900483
rmse 61.129905
sse 112105.959311
std 0.123085
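This result is internally consistent: the "Rotation of alignment" printed above has trace about 1.963, so its rotation angle is arccos((trace - 1)/2), roughly 61.2 deg, which matches the mean rotational APE; the alignment rotation was applied to every pose. A sketch of the standard relative-rotation-angle computation (not evo's code) confirms this:

```python
import numpy as np

def rotation_angle_deg(R_a, R_b):
    """Angle in degrees of the relative rotation between two rotation matrices
    (the quantity behind an APE w.r.t. rotation angle)."""
    cos_theta = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# the alignment rotation from the evo log above
R_align = np.array([[ 0.51250773,  0.66465388,  0.54366446],
                    [-0.48380259,  0.74658902, -0.45666168],
                    [-0.70941588, -0.02898363,  0.70419391]])
print(round(rotation_angle_deg(np.eye(3), R_align), 1))  # 61.2, like the mean APE
```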
@MichaelGrupp Hello, could you help me with the remaining questions? Big thanks!
@MichaelGrupp
Description: According to my understanding of Umeyama's method, it first aligns two sets of poses through a similarity transformation, and then I can calculate their distances. When using your Umeyama's method (with scale correction), I encountered a peculiar issue. Simply put, the distances between the estimated poses and the ground truth poses, before and after a rigid body transformation, are different.

Specifically, all the camera poses I use are transformations from the camera to the world coordinate system, i.e. camera-to-world (c2w) poses. I suspect it may be necessary to use world-to-camera poses instead of camera-to-world ones. I have a set of ground truth poses named GT_poses.txt and another set estimated through some method named estimated_poses.txt. Additionally, I normalized the first frame of GT_poses.txt: I extracted the pose of the first frame from GT_poses.txt and then, for each frame, multiplied its pose with that of the first frame to obtain GT_poses_setorigin.txt.

I observed significant differences in distances, both in translation and rotation, between estimated_poses.txt and both GT_poses.txt and GT_poses_setorigin.txt. Finally, I tested the distance between GT_poses.txt and GT_poses_setorigin.txt, and it was also large. I cannot comprehend this phenomenon, because the normalization is essentially a rigid body transformation, which should be solvable by Umeyama's method (with scale correction). This issue did not occur when I used evo's Umeyama's method (with scale correction) previously, which leaves me puzzled. At least the translation distance between GT_poses.txt and GT_poses_setorigin.txt should be close to 0, right? My commands and outputs are as follows:
Command:
1.
2.
3.
Additional files: Please attach all the files needed to reproduce the error.
Please give also the following information:
evo pkg --version: evo version v1.16.0
evo pkg --pyversion: 3.7.11
evo_config show --brief --no_color:
OS: Ubuntu 20.04
Download links for the input files I used above: tum_GT_poses.txt, tum_estimated_poses.txt, tum_GT_poses_setorigin.txt