fengju514 / Face-Pose-Net

Estimate 3D face pose (6DoF) or 11 parameters of 3x4 projection matrix by a Convolutional Neural Network

Query about bounding box specifications and conventions #19

Closed ssundar6087 closed 6 years ago

ssundar6087 commented 6 years ago

Hello,

Does the result returned by the pre-trained model depend on how tightly the bounding box fits the face? Does it also depend on the aspect ratio of the box? I'm seeing an almost frontal face with very little tilt register a large roll of about 60 degrees.

As for the results generated by the pre-trained model, what are the sign conventions of the pitch, yaw and roll angles? For example, if a person's face is turned to the right, is that considered positive or negative? Similarly, what are the conventions for looking up, looking down, tilted right, and tilted left?

Thanks!

fengju514 commented 6 years ago

The bounding box size and type affect the estimated 3D head pose to some extent. We use the face detector from Z. Yang and R. Nevatia, "A multi-scale cascade fully convolutional network face detector," ICPR, pages 633–638, 2016, and expand its bounding box by 25%.
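For readers who want to try a similar expansion on their own detector output, here is a minimal sketch in Python. It is not the repository's code: the function name `expand_box`, the `(x, y, w, h)` box layout, and the reading of "expand by 25%" as scaling width and height by 1.25 about the box center are all assumptions made for illustration.

```python
def expand_box(x, y, w, h, img_w, img_h, ratio=0.25):
    """Grow an (x, y, w, h) box about its center by `ratio`, clipped to the image.

    Assumption: "expand by 25%" means width and height are each multiplied
    by (1 + ratio). The actual preprocessing may differ.
    """
    # Center of the original box.
    cx = x + w / 2.0
    cy = y + h / 2.0

    # Enlarged width and height.
    new_w = w * (1.0 + ratio)
    new_h = h * (1.0 + ratio)

    # Corners of the enlarged box, clipped to the image bounds.
    x0 = max(0.0, cx - new_w / 2.0)
    y0 = max(0.0, cy - new_h / 2.0)
    x1 = min(float(img_w), cx + new_w / 2.0)
    y1 = min(float(img_h), cy + new_h / 2.0)

    return x0, y0, x1 - x0, y1 - y0


if __name__ == "__main__":
    # Hypothetical 80x80 detection inside a 640x480 image.
    print(expand_box(100, 100, 80, 80, 640, 480))
```

Clipping at the image border changes the aspect ratio of boxes near the edge, which may itself influence the predicted pose.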

Regarding the sign conventions of pitch, yaw and roll, the general rule is that counterclockwise rotations are positive. So, according to this rule and from your view:

turn left (negative), turn right (positive)
looking up (negative), looking down (positive)
tilt right (negative), tilt left (positive)
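The convention listed above can be written down as a small lookup, sketched here in Python. The helper name `describe_pose` and its argument order are hypothetical and not part of the repository; the code only encodes the signs stated in this thread and makes no claim about which output index of the network corresponds to which angle.

```python
def describe_pose(pitch, yaw, roll, tol=1e-6):
    """Map angle signs to viewer-side descriptions per the convention above.

    Only the sign of each angle is used, so degrees or radians both work.
    """
    def pick(value, negative, positive):
        # Values within `tol` of zero are treated as no rotation.
        if value > tol:
            return positive
        if value < -tol:
            return negative
        return "neutral"

    return {
        "yaw": pick(yaw, "turned left", "turned right"),
        "pitch": pick(pitch, "looking up", "looking down"),
        "roll": pick(roll, "tilted right", "tilted left"),
    }


if __name__ == "__main__":
    # Hypothetical angles: negative pitch, positive yaw, zero roll.
    print(describe_pose(pitch=-10.0, yaw=20.0, roll=0.0))
```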