alejocb / dpptam

DPPTAM: Dense Piecewise Planar Tracking and Mapping from a Monocular Sequence
GNU General Public License v3.0

Inverse Compositional Image Alignment -> Pose Updating step #12

sunghoon031 closed this issue 8 years ago

sunghoon031 commented 8 years ago

In the "gauss_newton_ic" function, when you update the current camera pose, I think you apply eq. (4) of the paper in the opposite order (T_n = inv(T_hat)*T_n).

Specifically, lines 1681-1688 of SemiDenseTracking.cpp show that R2 and t2 become:

R2 = R1.t()*R2;
t2 = R1.t()*(t2-t1);

which means R2 and t2 represent the world-to-camera transformation, and the update applied to it is T_n = inv(T_hat)*T_n.
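Writing the two code lines as a single rigid transform makes the order explicit. This is a short check, assuming T_1 = (R_1, t_1) is the Gauss-Newton increment (T_hat above) and T_2 = (R_2, t_2) is the current world-to-camera estimate (T_n above):

```latex
T_1^{-1} =
\begin{bmatrix} R_1^\top & -R_1^\top t_1 \\ 0 & 1 \end{bmatrix},
\qquad
T_1^{-1}\, T_2 =
\begin{bmatrix} R_1^\top R_2 & R_1^\top (t_2 - t_1) \\ 0 & 1 \end{bmatrix}
```

The rotation and translation blocks of the product match `R1.t()*R2` and `R1.t()*(t2-t1)`, so the code left-multiplies the pose by the inverse increment.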

When I reverse the order and update with T_n = T_n * inv(T_hat), it also tracks well for small movements, but it loses tracking for large movements.

Did T_n = inv(T_hat) * T_n work better than T_n = T_n * inv(T_hat) when you tested the code, and is that why the code differs from the method described in the paper?

Or did I simply miss something? Please let me know :)

sunghoon031 commented 8 years ago

I finally found an answer to this. @alejocb I figured out you were doing some "illegal" computation when you were updating the camera pose. Although it's close enough to be accepted, it's not mathematically correct.

To explain it simply, refer to the SVO paper. There, they update the relative transformation between the keyframe and the current frame (keyframe -> current) using the inverse compositional approach.

The difference in your code is that you don't compute this relative transformation, but rather the global transformation with respect to the world frame. The R1 & t1 you obtain from the GN optimization can legitimately be used to update T(keyframe to current) with T(keyframe to current)*inv(T1).

This is equivalent to updating T(world to current) with T(keyframe to current)*inv(T1)*T(world to keyframe). And there you made the "illegal step" by swapping the first two factors and equating the result to inv(T1)*T(world to current).
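The chain of substitutions can be written out as follows. The subscripts are shorthand introduced only for this sketch: T_{kc} is keyframe to current, T_{wk} is world to keyframe, T_{wc} is world to current, and a prime marks the updated value.

```latex
\begin{aligned}
T_{wc} &= T_{kc}\, T_{wk} \\
T_{kc}' &= T_{kc}\, T_1^{-1} \\
T_{wc}' &= T_{kc}'\, T_{wk} = T_{kc}\, T_1^{-1}\, T_{wk} \\
\text{code:}\quad T_{wc}' &= T_1^{-1}\, T_{wc} = T_1^{-1}\, T_{kc}\, T_{wk}
\end{aligned}
```

The last two lines agree only when T_{kc} and T_1^{-1} commute, which rigid transforms do not do in general. When T_{kc} is close to the identity (small motion relative to the keyframe) the discrepancy is small, which would be consistent with the update being "close enough".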

To avoid this illegal move, you should update T(world to current) by computing T(world to current old)*inv(T(world to keyframe))*inv(T1)*T(world to keyframe).

where T(world to current old) = R2 & t2, T(world to keyframe) = R_p & t_p, and T1 = R1 & t1 in the code.
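A minimal, dependency-free sketch of the two updates is given here for comparison. It is not the code from the repository or the pull request: `Pose`, `compose`, `inverse`, `updateLeft`, `updateConjugated`, `makePose`, and `maxDiff` are hypothetical names, and plain arrays stand in for the `cv::Mat` pairs used in SemiDenseTracking.cpp.

```cpp
#include <array>
#include <cmath>

// Minimal rigid transform x -> R*x + t (hypothetical stand-in for the cv::Mat pairs).
using Mat3 = std::array<std::array<double, 3>, 3>;
using Vec3 = std::array<double, 3>;
struct Pose { Mat3 R; Vec3 t; };

// Compose two poses: (a * b)(x) = a(b(x)).
Pose compose(const Pose& a, const Pose& b) {
    Pose c{};  // zero-initialised
    for (int i = 0; i < 3; ++i) {
        c.t[i] = a.t[i];
        for (int k = 0; k < 3; ++k) {
            c.t[i] += a.R[i][k] * b.t[k];
            for (int j = 0; j < 3; ++j) c.R[i][j] += a.R[i][k] * b.R[k][j];
        }
    }
    return c;
}

// Inverse of a rigid transform: (R, t)^-1 = (R^T, -R^T t).
Pose inverse(const Pose& p) {
    Pose q{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            q.R[i][j] = p.R[j][i];
            q.t[i] -= p.R[j][i] * p.t[j];
        }
    return q;
}

// Update as described for the current code: inv(T1) * T(world to current).
Pose updateLeft(const Pose& T2, const Pose& T1) {
    return compose(inverse(T1), T2);
}

// Proposed update: T(world to current old) * inv(T(world to keyframe)) * inv(T1) * T(world to keyframe).
Pose updateConjugated(const Pose& T2, const Pose& Tp, const Pose& T1) {
    return compose(compose(compose(T2, inverse(Tp)), inverse(T1)), Tp);
}

// Test helper: rotation by `angle` about axis 0 (x), 1 (y) or 2 (z), plus translation t.
Pose makePose(int axis, double angle, const Vec3& t) {
    Pose p{};
    const double c = std::cos(angle), s = std::sin(angle);
    const int u = (axis + 1) % 3, v = (axis + 2) % 3;
    p.R[axis][axis] = 1.0;
    p.R[u][u] = c;  p.R[u][v] = -s;
    p.R[v][u] = s;  p.R[v][v] = c;
    p.t = t;
    return p;
}

// Largest absolute entry-wise difference between two poses.
double maxDiff(const Pose& a, const Pose& b) {
    double d = 0.0;
    for (int i = 0; i < 3; ++i) {
        d = std::fmax(d, std::fabs(a.t[i] - b.t[i]));
        for (int j = 0; j < 3; ++j) d = std::fmax(d, std::fabs(a.R[i][j] - b.R[i][j]));
    }
    return d;
}
```

In terms of the variables above, `T2` corresponds to (R2, t2), `Tp` to (R_p, t_p), and `T1` to (R1, t1). The two update functions return the same pose only when the relative keyframe-to-current transform commutes with inv(T1).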

I tested with this transformation, and tracking works well (if not better), but I haven't done any extensive testing yet, so I cannot tell how much (or if) it improved tracking.

sunghoon031 commented 8 years ago

I created a pull request. Check it out @alejocb