nianticlabs / monodepth2

[ICCV 2019] Monocular depth estimation from a single image

PoseCNN: Replacing axisangle with Quaternion representation #209

Closed apollospace closed 4 years ago

apollospace commented 4 years ago

Hello,

I am building a custom dataset using the ZED2 camera. I would also like to replace the PoseCNN with the pose that the ZED API provides. The API provides a quaternion [X, Y, Z, W] for orientation and a vector [X, Y, Z] for translation/position with respect to the previous camera frame.

From the past discussions #17 #204 #128 and after reading the codebase, I can see that the PoseCNN outputs an axisangle for rotation and a translation.

If I were to replace the PoseCNN, my questions are:

1) Will it be enough to convert the quaternion representation to axisangle and use it as a drop-in replacement?

2) If I understand the inversion in #17, will I still have to perform the inversion for T_1 -> T_0?

Many thanks

mrharicot commented 4 years ago

Hi, if I understand correctly, you want to remove the PoseCNN altogether and use the poses from the ZED camera to perform the reprojection? If so, you can simply compute the proper rigid transformation matrices in the dataloader and treat them like inputs.

  1. It should be. You need to check in which direction the transform is given, and which quaternion convention they are using.
  2. If you treat these transforms as inputs, you simply need to compute the correct one in the dataloader; there is no need to invert in the trainer.
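As a rough sketch of what "compute the rigid transformation matrices in the dataloader" could look like: the helpers below build a 4x4 homogeneous transform from a quaternion and a translation, and invert it if the pose turns out to be given in the opposite direction. The function names (`quat_to_rotation_matrix`, `pose_to_matrix`, `invert_rigid`) are hypothetical and not part of monodepth2 or the ZED API. The quaternion is assumed to be in [x, y, z, w] order as described above; the direction of the transform and the coordinate convention still have to be checked against the ZED documentation.

```python
import numpy as np


def quat_to_rotation_matrix(q_xyzw):
    """Convert a quaternion in [x, y, z, w] order to a 3x3 rotation matrix."""
    q = np.asarray(q_xyzw, dtype=np.float64)
    q = q / np.linalg.norm(q)  # guard against slightly non-unit input
    x, y, z, w = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ])


def pose_to_matrix(q_xyzw, t_xyz):
    """Assemble a 4x4 rigid transform from a quaternion and a translation."""
    T = np.eye(4)
    T[:3, :3] = quat_to_rotation_matrix(q_xyzw)
    T[:3, 3] = np.asarray(t_xyz, dtype=np.float64)
    return T


def invert_rigid(T):
    """Invert a rigid transform without a general matrix inverse."""
    R = T[:3, :3]
    t = T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv
```

If the API reports the pose of the current frame relative to the previous one and the reprojection needs the opposite direction, `invert_rigid(pose_to_matrix(q, t))` gives the inverse. Doing this once per sample in the dataloader keeps the trainer unchanged apart from reading the matrix as an input.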

I hope this helps!

apollospace commented 4 years ago

Okay, thanks for confirming that. If I may ask further, would this be the right way to convert from a quaternion to axisangle? https://www.euclideanspace.com/maths/geometry/rotations/conversions/quaternionToAngle/index.htm

Also, the API uses a right-handed Y-down coordinate system by default. Other available systems are:

My intuition says that I must instead use right-handed Y-up. Would this be the right coordinate system to use in place of the PoseCNN?

Many thanks

mrharicot commented 4 years ago

We use the standard computer vision coordinate system for cameras:

For handling rotations I would recommend using scipy (although I have never tried it myself): https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.transform.Rotation.html. You can create a Rotation using from_quat and get the axis angle with as_rotvec.
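A minimal sketch of that conversion, assuming the quaternion arrives in [x, y, z, w] order (scalar last), which is also the default order `Rotation.from_quat` expects. The wrapper name `quat_to_axisangle` is hypothetical. `as_rotvec` returns a rotation vector whose direction is the rotation axis and whose norm is the angle in radians.

```python
import numpy as np
from scipy.spatial.transform import Rotation


def quat_to_axisangle(q_xyzw):
    """Convert a scalar-last quaternion [x, y, z, w] to a rotation vector.

    The returned 3-vector points along the rotation axis and its norm is
    the rotation angle in radians.
    """
    return Rotation.from_quat(q_xyzw).as_rotvec()


# Example: a 90 degree rotation about the z axis.
q = [0.0, 0.0, np.sin(np.pi / 4), np.cos(np.pi / 4)]
axisangle = quat_to_axisangle(q)
```

Here `axisangle` comes out as approximately `[0, 0, pi/2]`. If the source delivers quaternions as [w, x, y, z] instead, the components need reordering before calling `from_quat`.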

I hope this helps!

apollospace commented 4 years ago

Thank you very much for the info and recommendations. I seem to be getting the information in the correct format. Cheers!

apollospace commented 4 years ago

Hello again @mrharicot

I tried training a network using the IMU information from the ZED, but am not getting the right depth predictions.

Thanks for your help