Camera intrinsic and extrinsic matrices

cgxuvector commented 3 years ago

Hi, I am doing research that requires the inverse perspective projection of a front-view depth image. Specifically, given a front-view depth image I, I want to reconstruct the 3D position of each pixel in the image. A common solution is to use the camera's intrinsic and extrinsic matrices.

Could you please suggest where I can find those matrices, or whether there is a simpler way to do the projection?

Best,

maximecb commented 3 years ago

Hello @cgxuvector.

The camera matrices are set up here: https://github.com/maximecb/gym-miniworld/blob/master/gym_miniworld/miniworld.py#L1180

You can potentially ask OpenGL to give you the modelview and projection matrices and invert them using numpy. I don't know exactly how to do that, but this can get you started if you want to go that route: https://stackoverflow.com/questions/9849374/using-glgetfloatv-to-retrieve-the-modelview-matrix-in-pyglet
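A minimal sketch of the inversion step, assuming you already have both matrices as 4x4 numpy arrays in row-major form. The `perspective` helper here is only a stand-in that builds a standard OpenGL-style projection matrix so the example runs without a GL context; `unproject` and the example values are hypothetical names and numbers, not part of MiniWorld.

```python
import numpy as np

def perspective(fov_y_deg, aspect, near, far):
    """Standard OpenGL-style perspective matrix (stand-in for one read back from GL)."""
    f = 1.0 / np.tan(np.radians(fov_y_deg) / 2.0)
    return np.array([
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ])

def unproject(ndc, projection, modelview):
    """Map a point in normalized device coordinates ([-1, 1] on each axis) to world space."""
    inv = np.linalg.inv(projection @ modelview)
    p = inv @ np.append(np.asarray(ndc, dtype=float), 1.0)
    return p[:3] / p[3]  # perspective divide

# Example values only: identity modelview means the camera sits at the origin
# looking down -z.
P = perspective(60.0, 80 / 60, 0.04, 100.0)
M = np.eye(4)

# Round trip: project a known world point, then recover it.
world = np.array([0.5, -0.2, -3.0])
clip = P @ M @ np.append(world, 1.0)
ndc = clip[:3] / clip[3]
recovered = unproject(ndc, P, M)
```

Note that OpenGL stores matrices in column-major order, so a flat array of 16 floats read back from GL would need something like `np.array(m).reshape(4, 4).T` before being used this way.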

Another possibility: I added some methods on the agent object (this is the self.agent object in the env) that give you the position of the agent and the camera's direction vector in 3D space: https://github.com/maximecb/gym-miniworld/blob/master/gym_miniworld/entity.py#L458

There's also the camera vertical field of view angle in degrees: https://github.com/maximecb/gym-miniworld/blob/master/gym_miniworld/entity.py#L448
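From the vertical field of view and the image size you can also build a pinhole intrinsic matrix directly. This is a sketch under the assumptions of square pixels and a principal point at the image center; `intrinsics_from_fov`, `backproject_pixel`, and the 80x60 size and 90 degree angle are illustrative, not values taken from the code.

```python
import numpy as np

def intrinsics_from_fov(fov_y_deg, width, height):
    """Pinhole intrinsic matrix K, assuming square pixels and a centered principal point."""
    fy = height / (2.0 * np.tan(np.radians(fov_y_deg) / 2.0))
    fx = fy  # square pixels (assumption)
    return np.array([
        [fx, 0.0, width / 2.0],
        [0.0, fy, height / 2.0],
        [0.0, 0.0, 1.0],
    ])

def backproject_pixel(u, v, depth, K):
    """3D point in the camera frame (x right, y down, z forward) for pixel (u, v).

    Assumes `depth` is measured along the camera's forward axis.
    """
    return depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))

K = intrinsics_from_fov(90.0, 80, 60)      # example values only
center = backproject_pixel(40.0, 30.0, 2.0, K)
```

The result is in the camera frame using the usual computer-vision axes; moving it into world space still needs the camera position and orientation.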

So with a little bit of trigonometry and vector arithmetic, you can compute the position of the corners of the image some distance away from the camera, or you can compute a unit vector going through any pixel in the image.
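Roughly, the trigonometry could look like the sketch below. It assumes the world up axis is +y, that row 0 is the top of the image, and that the depth values are distances along the camera's forward axis (set `depth_is_ray_length=True` if they are distances along each pixel's ray instead). The function name and its parameters are hypothetical; the camera position and direction would come from the agent methods linked above.

```python
import numpy as np

def depth_to_points(depth, fov_y_deg, cam_pos, cam_dir,
                    world_up=(0.0, 1.0, 0.0), depth_is_ray_length=False):
    """Return an (H, W, 3) array of world-space points, one per depth pixel."""
    h, w = depth.shape

    # Orthonormal camera basis built from the viewing direction.
    forward = np.asarray(cam_dir, dtype=float)
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(world_up, dtype=float))
    right = right / np.linalg.norm(right)
    up = np.cross(right, forward)

    # Half-extent of the image plane at unit distance along `forward`.
    tan_y = np.tan(np.radians(fov_y_deg) / 2.0)
    tan_x = tan_y * w / h  # square pixels (assumption)

    # Pixel centers mapped to [-1, 1]; +1 is the right edge / top edge.
    u = (np.arange(w) + 0.5) / w * 2.0 - 1.0
    v = 1.0 - (np.arange(h) + 0.5) / h * 2.0
    uu, vv = np.meshgrid(u, v)  # both (H, W)

    # One ray per pixel, with a forward component of exactly 1.
    rays = (forward
            + uu[..., None] * tan_x * right
            + vv[..., None] * tan_y * up)
    if depth_is_ray_length:
        rays = rays / np.linalg.norm(rays, axis=-1, keepdims=True)

    return np.asarray(cam_pos, dtype=float) + depth[..., None] * rays

# Example: a flat wall 2 units in front of a camera looking down -z.
depth = np.full((4, 6), 2.0)
points = depth_to_points(depth, 90.0, cam_pos=(1.0, 2.0, 3.0), cam_dir=(0.0, 0.0, -1.0))
```

This breaks down if the camera looks straight up or down, since the cross product with the up axis then vanishes.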

cgxuvector commented 3 years ago

Thanks, it really helps. I will close the question.
