shichaoy / pop_up_slam

Pop-up SLAM: Semantic Monocular Plane SLAM for Low-texture Environments

How to compute the plane equation in the camera coordinate? #15

Open Beniko95J opened 5 years ago

Beniko95J commented 5 years ago

Hi, I am reading your paper "Pop-up SLAM: Semantic Monocular Plane SLAM for Low-texture Environments" and I find it hard to understand how you compute the plane equation in the camera coordinate frame. I don't know how to estimate the initial camera pose in the world frame to get T_w_c, which is then used in Eq. (6) to get pi_c. There is a section that starts with "Then we show how to compute the plane equation pi_c", but I think it actually explains how to compute the plane equation of the wall.

Would you please tell me how to understand this?

Best regards

shichaoy commented 4 years ago

Hi. In Section III-C (1), given the camera pose with respect to the ground world frame and a wall-ground line detected in the image, we can back-project two points on the line onto the 3D ground plane using Eq. (7). From the two 3D points, we can compute the 3D vertical wall plane equation. I'm not sure whether you are asking about Eq. (7): you first compute a ray K^(-1)p, then parameterize it so that it hits the ground plane (z=0).

However, this requires an estimated T_w_c. You can estimate it by detecting three vanishing points, take it from SLAM, or just assume the camera is parallel to the ground.
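The back-projection and wall-plane steps described above can be sketched as follows. This is only an illustration, not the repository's code: the function names, the intrinsics `K`, the pose `R_wc`, `t_wc`, and the pixel coordinates are all made-up assumptions. It assumes a world frame with z up and the ground at z=0, a camera frame with x right, y down, z forward, and planes written as (n, d) with n·x + d = 0. The last function uses the standard plane transform pi_c = T_w_c^T pi_w, which follows from x_w = T_w_c x_c.

```python
import numpy as np

def backproject_to_ground(pixel, K, R_wc, t_wc):
    """Intersect the viewing ray of `pixel` with the ground plane z = 0 (world frame)."""
    p_h = np.array([pixel[0], pixel[1], 1.0])
    ray_c = np.linalg.solve(K, p_h)   # K^(-1) p: ray direction in the camera frame
    ray_w = R_wc @ ray_c              # same ray expressed in the world frame
    if abs(ray_w[2]) < 1e-9:
        raise ValueError("ray is parallel to the ground plane")
    s = -t_wc[2] / ray_w[2]           # solve t_z + s * ray_z = 0
    if s <= 0:
        raise ValueError("ground intersection is behind the camera")
    return t_wc + s * ray_w

def vertical_wall_plane(P1, P2):
    """Vertical plane (n, d) through two ground points, with n.x + d = 0."""
    up = np.array([0.0, 0.0, 1.0])
    n = np.cross(P2 - P1, up)         # horizontal normal, orthogonal to the wall-ground line
    n = n / np.linalg.norm(n)
    return np.append(n, -n @ P1)

def plane_world_to_camera(pi_w, R_wc, t_wc):
    """Transform a plane from the world frame to the camera frame: pi_c = T_wc^T pi_w."""
    T_wc = np.eye(4)
    T_wc[:3, :3] = R_wc
    T_wc[:3, 3] = t_wc
    return T_wc.T @ pi_w

# Hypothetical setup: camera 1.5 m above the ground, looking along world +x.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R_wc = np.array([[0.0, 0.0, 1.0], [-1.0, 0.0, 0.0], [0.0, -1.0, 0.0]])
t_wc = np.array([0.0, 0.0, 1.5])

# Two pixels on a detected wall-ground line (made-up values).
P1 = backproject_to_ground((320.0, 490.0), K, R_wc, t_wc)
P2 = backproject_to_ground((420.0, 490.0), K, R_wc, t_wc)
pi_wall_w = vertical_wall_plane(P1, P2)
pi_wall_c = plane_world_to_camera(pi_wall_w, R_wc, t_wc)
print(P1, P2, pi_wall_w, pi_wall_c)
```

In this toy setup the wall is the plane x=3 in the world frame, which becomes z=3 in the camera frame, i.e. a wall 3 m in front of the camera.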

Beniko95J commented 4 years ago

Hi, thank you for the reply.

Yes, my question is about how to get an estimated T_w_c. Your world frame is not arbitrary but is built on the ground plane, so that the ground plane can be represented by (0, 0, 1, 0)^T. So I am wondering how to get the estimated T_w_c in this specific frame. Even in visual-inertial SLAM, we can only initialize the first pose in a coordinate frame parallel to the ground plane (thanks to gravity measured by the accelerometer, the pitch and roll angles are observable), so the ground plane can only be represented by (0, 0, 1, a)^T, where a is an arbitrary value that we don't know. So I am not sure whether I can initialize the first pose in this specific frame in a visual SLAM system. Even if I assume the estimated T_w_c is parallel to the ground, I still cannot determine its height (along the z axis). Would you please explain in more detail how to get the estimated T_w_c in this specific frame? I would really appreciate it.

Thank you very much.
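The point about the unknown offset a can be written out as a short worked derivation. The frame name g (a gravity-aligned frame with arbitrary origin) and the symbol h (camera height) are notation introduced here for illustration and are not taken from the paper.

```latex
% Ground plane in a gravity-aligned frame g with arbitrary origin:
\pi_g = (0,\,0,\,1,\,a)^\top \quad\Longleftrightarrow\quad z_g + a = 0 .
% Shift the origin along z to define the ground world frame w:
z_w = z_g + a \quad\Longrightarrow\quad \pi_w = (0,\,0,\,1,\,0)^\top .
% The camera position shifts by the same amount, so its height above the ground is
h = t_{w,z} = t_{g,z} + a .
```

So fixing the world frame on the ground plane is equivalent to knowing a, which is the same as knowing the camera height h.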

shichaoy commented 4 years ago

Hi, I agree with what you said. It's impossible to get the height from a monocular camera alone unless there is another assumption, such as a known object height or size. So in my case I just estimate it roughly, or take it from prior information, such as the ground-truth initial camera height, the car height on KITTI, etc.

The same problem occurs in my later object SLAM work. There, I sometimes get the scale from known object dimensions.
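The height ambiguity discussed above can be illustrated with a small sketch: the same pixel back-projects to ground points whose distance grows linearly with the assumed camera height, and one known object dimension is enough to fix the scale. Everything here is hypothetical (the names `ground_point` and `scale_from_known_size`, the intrinsics, the pose, and the door heights); it is not how the repository does it.

```python
import numpy as np

def ground_point(pixel, K, R_wc, cam_height):
    """Back-project a pixel onto the ground z = 0 for an assumed camera height."""
    ray_w = R_wc @ np.linalg.solve(K, np.array([pixel[0], pixel[1], 1.0]))
    s = -cam_height / ray_w[2]        # ray parameter where the ray reaches z = 0
    return np.array([0.0, 0.0, cam_height]) + s * ray_w

def scale_from_known_size(estimated_size, true_size):
    """Scale factor that maps the reconstruction to metric units."""
    return true_size / estimated_size

# Hypothetical camera: level, looking along world +x (x right, y down, z forward).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R_wc = np.array([[0.0, 0.0, 1.0], [-1.0, 0.0, 0.0], [0.0, -1.0, 0.0]])
pixel = (320.0, 490.0)

# The same pixel gives a different 3D point for every assumed height.
for h in (1.0, 1.5, 2.0):
    print(h, ground_point(pixel, K, R_wc, h))

# Suppose a map built with an assumed height of 1.0 contains a door that
# measures 1.4 units, while the real door is known to be 2.1 m tall.
scale = scale_from_known_size(1.4, 2.1)
metric_height = 1.0 * scale           # corrected camera height in metres
print(scale, metric_height)
```

Because the whole reconstruction scales with the assumed height, a rough guess still yields a geometrically consistent map; only its metric scale is off until a prior such as a known object size is applied.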