gaoxiang12 / slambook-en

The English version of 14 lectures on visual SLAM.
GNU General Public License v3.0

ch11 Question regarding the depth symbols used in Section 11.3.4 #57

Closed: SelfStudyM closed this issue 2 years ago

SelfStudyM commented 2 years ago

Dear authors,

I have a question regarding Sec. 11.3.4, Pre-transform the Image:

In Eq. (11.12) - (11.14), is it more appropriate to denote the 'depth' symbols as z_{R} and z_{C} rather than d_{R} and d_{C}?

According to the symbol convention in Sec. 11.2.3, d represents the length of the vector connecting the optical center O1 and the 3D landmark point P (i.e., d is the norm of the vector O1P in Figure 11-4). As stated at the end of Sec. 11.2.3, "Please note that the depth value mentioned here is the length of O1P, which is slightly different from the depth we mentioned in the pinhole camera model. The depth in a pinhole camera refers to the z value of the pixel."

Eq. (11.12) and (11.13) are derived from the pinhole camera model, so it seems more appropriate to use z here, so that both sides of each equation represent the [x, y, z] coordinates of the 3D landmark in the camera frame.
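To make the distinction concrete, here is a sketch of the relation between the two quantities in standard pinhole notation. The symbols K (intrinsic matrix), (u, v) (pixel coordinates), and (x, y, z) (camera-frame coordinates of the landmark) are assumed here for illustration and are not quoted from the book's equations.

```latex
% Pinhole projection of a camera-frame point (x, y, z) to the pixel (u, v):
z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  = \mathbf{K} \begin{bmatrix} x \\ y \\ z \end{bmatrix},
\qquad
% Range along the ray (the length of O1P) versus the pinhole depth z:
d = \left\| \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right\|
  = z \left\| \mathbf{K}^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \right\|
  \ge z .
```

The two values coincide only for a pixel at the principal point, where the ray lies along the optical axis.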

I ask because, when following the advice in Sec. 11.3.4 to improve the monocular dense reconstruction, if the above is correct we would need to compute and use z_{R} and z_{C}, rather than directly using d_{R} (i.e., depth_mu in the code demo), to construct the affine matrix in Eq. (11.15).
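A minimal sketch of that conversion, assuming an ideal pinhole camera with no distortion. The function names `range_to_z` and `z_to_range` and the intrinsic values below are hypothetical, chosen only for illustration; they are not taken from the book's demo code.

```python
import math


def _ray_norm(u, v, fx, fy, cx, cy):
    """Length of the back-projected ray (x_n, y_n, 1) through pixel (u, v)."""
    x_n = (u - cx) / fx  # normalized-plane x coordinate
    y_n = (v - cy) / fy  # normalized-plane y coordinate
    return math.sqrt(x_n * x_n + y_n * y_n + 1.0)


def range_to_z(u, v, d, fx, fy, cx, cy):
    """Convert the range d = |O1P| along the pixel ray to the pinhole depth z."""
    # A point at range d on the ray is d * ray / |ray|; its z component is d / |ray|.
    return d / _ray_norm(u, v, fx, fy, cx, cy)


def z_to_range(u, v, z, fx, fy, cx, cy):
    """Inverse conversion: pinhole depth z back to the range d along the ray."""
    return z * _ray_norm(u, v, fx, fy, cx, cy)


if __name__ == "__main__":
    # Illustrative intrinsics (assumed values, not the dataset's calibration).
    fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
    # At the principal point the ray is the optical axis, so z equals d.
    print(range_to_z(320.0, 240.0, 2.0, fx, fy, cx, cy))
    # Off-axis pixels give z < d.
    print(range_to_z(820.0, 240.0, 2.0, fx, fy, cx, cy))
```

Under this assumption, z_{R} would be obtained by applying `range_to_z` to the reference pixel and its estimated range before building the affine matrix, and z_{C} likewise for the current frame.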

I hope the above explanation is clear, and I would appreciate it if you could advise whether my understanding is correct.

Thanks!

gaoxiang12 commented 2 years ago

Hi @SelfStudyM, thanks for your advice. It's true that a different notation would be better here. I'll consider changing these equations, but it may take some time since the book has now been officially published by Springer.

SelfStudyM commented 2 years ago

Thank you so much for your reply, Dr. Gao. I sincerely appreciate your work. I greatly enjoyed studying the book, and I have finished reading the material.

Some personal thoughts after reading the book: I studied state estimation and computer vision for robotics at university, and I found that this book filled in some of my knowledge gaps, especially the material on SLAM loop closure and graph construction. In my opinion, a one-semester university course on state estimation can often cover only up to Ch. 9, Backend II, without losing the important details of Lie groups, nonlinear optimization, bundle adjustment, KF and EKF, the VO pipeline, etc. This book integrates all the key components (computer vision, state estimation, explanations of commonly used packages, and example code), which makes it convenient for students to start learning SLAM. I will definitely recommend this book to people who would like to study this topic, and I would like to say thank you for the solid work!