vladmandic / face-api

FaceAPI: AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using TensorFlow/JS
https://vladmandic.github.io/face-api/demo/webcam.html
MIT License

Implement head position angle of (yaw, roll, pitch) #40

Closed tianyingchun closed 3 years ago

tianyingchun commented 3 years ago

Hi, do you have a plan to implement head position angles (yaw, roll, pitch)?

pose: {
  pitch_angle: {value: 11.102898}
  roll_angle: {value: -20.291693}
  yaw_angle: {value: 14.172521}
}

like the face-api discussion here: https://github.com/justadudewhohacks/face-api.js/issues/107

vladmandic commented 3 years ago

good suggestion, i'll add it

btw, it's much simpler in my newer library https://github.com/vladmandic/human since the face model there is much more detailed and also provides full 3D information, so calculating angles is relatively simple math

vladmandic commented 3 years ago

I've just added 3D face angle calculation to the Human library via https://github.com/vladmandic/human/issues/83. Results are in radians and quite precise (they can also be easily converted to degrees)

"face": [
  {
    ...
    "angle": {
        "roll": -0.04158832296882946,
        "yaw": -0.030124894638440182,
        "pitch": 0.07136356749896162
    }
  }
],

I'll do something for FaceAPI as well, but since the Z-axis is missing, the results will be approximations and not precise
(except for roll, as that can be done with X,Y only)
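
Since the results are in radians, converting them for display is a one-line calculation. The following is a minimal sketch: `toDegrees` and `angleToDegrees` are hypothetical helper names, not functions from Human or FaceAPI, and the input values are copied from the sample result above.

```javascript
// Hypothetical helpers (not part of Human or FaceAPI).
// Convert a single angle from radians to degrees.
const toDegrees = (rad) => (rad * 180) / Math.PI;

// Convert a { roll, yaw, pitch } object, as shown in the sample result, to degrees.
function angleToDegrees(angle) {
  return {
    roll: toDegrees(angle.roll),
    yaw: toDegrees(angle.yaw),
    pitch: toDegrees(angle.pitch),
  };
}

// Values copied from the sample result above
const sample = {
  roll: -0.04158832296882946,
  yaw: -0.030124894638440182,
  pitch: 0.07136356749896162,
};
const degrees = angleToDegrees(sample);
console.log(degrees);
```

For the sample above, all three values come out at only a few degrees, which corresponds to a nearly frontal face.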

tianyingchun commented 3 years ago

:) I was very excited to hear the news, but when will it be integrated into FaceAPI?

vladmandic commented 3 years ago

i've added function to my port of face-api https://github.com/vladmandic/face-api
as requested in https://github.com/vladmandic/face-api/issues/40
that extends .withFaceLandmarks() method to calculate angle values
and returns additional property in the resultset:

angle: {
  pitch: 0.05355965542754132
  roll: 0.045894150362739826
  yaw: -0.19656298447653597
}

the issue is that face-api landmarks are 2d,
so there is quite a lot of information missing to properly calculate angles,
this is more of a best-guess than anything:

function calculateFaceAngle(mesh) {
  const radians = (a1, a2, b1, b2) => Math.atan2(b2 - a2, b1 - a1);

  const angle = { roll: <number | undefined>undefined, pitch: <number | undefined>undefined, yaw: <number | undefined>undefined };

  if (!mesh || !mesh._positions || mesh._positions.length !== 68) return angle;
  const pt = mesh._positions;

  // roll is face lean left/right
  // comparing x,y of outside corners of leftEye and rightEye
  angle.roll = radians(pt[36]._x, pt[36]._y, pt[45]._x, pt[45]._y);

  // face turn left/right (conventionally yaw; this version stores it in `pitch`)
  // comparing x distance of bottom of nose to left and right edge of face
  //       and y distance of top    of nose to left and right edge of face
  // precision is lacking since coordinates are not precise enough
  angle.pitch = radians(pt[30]._x - pt[0]._x, pt[27]._y - pt[0]._y, pt[16]._x - pt[30]._x, pt[27]._y - pt[16]._y);

  // face move up/down (conventionally pitch; this version stores it in `yaw`)
  // comparing size of the box around the face with top and bottom of detected landmarks
  // silly hack, but this gives us face compression on y-axis
  // e.g., tilting head up hides the forehead that doesn't have any landmarks so ratio drops
  // value is normalized to range, but is not in actual radians
  const bottom = pt.reduce((prev, cur) => (prev < cur._y ? prev : cur._y), +Infinity);
  const top = pt.reduce((prev, cur) => (prev > cur._y ? prev : cur._y), -Infinity);
  angle.yaw = 10 * (mesh._imgDims._height / (top - bottom) / 1.45 - 1);

  return angle;
}

if anyone has better suggestions, please tell me! :)
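
Roll is the one angle that can be recovered from 2D landmarks alone, so it is the easiest part to check by hand. The sketch below reuses the `radians` helper from the function above on made-up points; `rollFromEyes` and the coordinates are hypothetical and only illustrate the `Math.atan2` step, not the full FaceAPI result.

```javascript
// Same helper as in calculateFaceAngle: angle of the line from (a1, a2) to (b1, b2)
const radians = (a1, a2, b1, b2) => Math.atan2(b2 - a2, b1 - a1);

// Hypothetical wrapper: roll from the outer corners of the two eyes,
// each given as { x, y } in image pixels (y grows downward).
function rollFromEyes(leftOuter, rightOuter) {
  return radians(leftOuter.x, leftOuter.y, rightOuter.x, rightOuter.y);
}

// Made-up coordinates: eyes on a horizontal line, so there is no roll
const levelRoll = rollFromEyes({ x: 100, y: 100 }, { x: 200, y: 100 });

// Made-up coordinates: second eye 100px lower, so the eye line is at 45 degrees
const tiltedRoll = rollFromEyes({ x: 100, y: 100 }, { x: 200, y: 200 });

console.log(levelRoll, tiltedRoll);
```

Because only the x,y positions of two points are involved, this part does not suffer from the missing z-axis.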

and i don't see how replacing solvePnP would improve things since model itself returns 2D landmarks,
so there is nothing to solve for to start with

now, i've also added angle calculations in my library: Human https://github.com/vladmandic/human
and since the facemesh there is far more detailed (478 points vs 68 points)
and results are in 3D instead of 2D,
it calculates actual angles with decent precision:

const calculateFaceAngle = (mesh) => {
  if (!mesh) return {};
  const radians = (a1, a2, b1, b2) => Math.atan2(b2 - a2, b1 - a1);
  const angle = {
    // roll is face lean left/right
    // looking at x,y of outside corners of leftEye and rightEye
    roll: radians(mesh[33][0], mesh[33][1], mesh[263][0], mesh[263][1]),
    // yaw is face turn left/right
    // looking at x,z of outside corners of leftEye and rightEye
    yaw: radians(mesh[33][0], mesh[33][2], mesh[263][0], mesh[263][2]),
    // pitch is face move up/down
    // looking at y,z of top and bottom points of the face
    pitch: radians(mesh[10][1], mesh[10][2], mesh[152][1], mesh[152][2]),
  };
  return angle;
};

vladmandic commented 3 years ago

new version has been published, closing this issue.

tianyingchun commented 3 years ago

Is the angle in radians? Also, can you add TS typings for the angle property on withFaceLandmarks<TSource>? It seems that right now only landmarks, unshiftedLandmarks, and alignedRect are exposed; maybe angle should be added in WithFaceLandmarks.d.ts.

vladmandic commented 3 years ago

roll & pitch are in radians
yaw is a generic normalized value as entire yaw calculation is an approximation at best since there is no z-axis in the model

i've just added property definition.

tianyingchun commented 3 years ago

You mean I can use the yaw value as an angle?

tianyingchun commented 3 years ago

And BTW, can you help clarify the direction for yaw, roll, and pitch? Is this right?
// Roll: -x to x (0 is frontal, positive is clockwise, negative is anti-clockwise)
// Yaw: -x to x (0 is frontal, positive is looking right, negative is looking left)
// Pitch: 0 to 4 (0 is looking upward, 1 is looking straight, >1 is looking downward)

vladmandic commented 3 years ago

i've updated the algorithm to return consistent values and published new version

all values are in radians in range of -pi/2 to pi/2 which is -90 to +90 degrees
value of 0 means center in all cases
note that values outside of +/- pi/6 range are not that reliable as mesh detection does not work reliably with angles higher than 30 degrees

accuracy wise, roll is most accurate, pitch less so and yaw is really just an approximation
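
A caller may want to flag values that fall outside the range described above. This is a minimal sketch under that assumption: `RELIABLE_LIMIT` and `reliability` are hypothetical names, not part of the FaceAPI results, and the first input is the sample `angle` object shown earlier in this thread.

```javascript
// Hypothetical threshold taken from the note above: +/- pi/6 radians (30 degrees)
const RELIABLE_LIMIT = Math.PI / 6;

// Hypothetical helper: for each axis, report whether the value is a finite
// number within the range where mesh detection is described as reliable.
function reliability(angle) {
  const result = {};
  for (const axis of ['roll', 'pitch', 'yaw']) {
    const value = angle[axis];
    result[axis] = Number.isFinite(value) && Math.abs(value) <= RELIABLE_LIMIT;
  }
  return result;
}

// Sample values from earlier in the thread: all well inside the range
const frontal = reliability({ pitch: 0.05355965542754132, roll: 0.045894150362739826, yaw: -0.19656298447653597 });

// Made-up values: yaw beyond 30 degrees, pitch missing
const turned = reliability({ pitch: undefined, roll: 0.1, yaw: 0.8 });

console.log(frontal, turned);
```

Missing values are reported as not reliable rather than throwing, since the 2D function can return `undefined` for every axis when no landmarks are available.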

no-1ne commented 3 years ago

The distance between 2 points vertically and 2 points horizontally, can help improve the approximation

vladmandic commented 3 years ago

@startupgurukul can you elaborate? i'm already relying on distance.

ButzYung commented 3 years ago

@vladmandic The method used in calculateFaceAngle to calculate the rotations is simple, but probably not accurate enough in some cases. It works when the rotation is subtle, but consider an extreme case, like when you turn your head 90 degrees left or right (yaw): you will be unable to get the pitch rotation because the z coordinates for those points are always 0.

My app needs to do that kind of calculation, so I have some experience. I have tried various methods, and in the end I chose the following one, which seems to be the simplest and works for all kinds of rotations (I even use it to calculate the joint rotation for hand pose).

http://renderdan.blogspot.com/2006/05/rotation-matrix-from-axis-vectors.html

The concept is simple. You use the mesh coordinates to construct the x and y axes of the face (basically equivalent to the coordinates needed for roll/yaw/pitch). Then you can get the z axis by crossing the x and y axes. With all 3 axes (normalized), you can construct the rotation matrix.
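
The idea described above can be sketched as follows. This is an illustration under stated assumptions, not code from Human or from the linked article: `rotationFromFacePoints` is a hypothetical name, the four input points (left, right, top, bottom of the face outline) are made up, and the extra step that recomputes the y axis from z and x is an added safeguard so that the three axes are exactly perpendicular.

```javascript
// Small 3D vector helpers on [x, y, z] arrays
const sub = (a, b) => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const cross = (a, b) => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];
const normalize = (v) => {
  const len = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / len, v[1] / len, v[2] / len];
};

// Hypothetical function: build a 3x3 rotation matrix (rows of numbers)
// whose columns are the face's x, y and z axes.
function rotationFromFacePoints(left, right, top, bottom) {
  const xAxis = normalize(sub(right, left)); // across the face
  const yGuess = normalize(sub(bottom, top)); // down the face (image y grows downward)
  const zAxis = normalize(cross(xAxis, yGuess)); // perpendicular to both
  const yAxis = cross(zAxis, xAxis); // assumption: re-derive y so the axes are orthogonal
  return [
    [xAxis[0], yAxis[0], zAxis[0]],
    [xAxis[1], yAxis[1], zAxis[1]],
    [xAxis[2], yAxis[2], zAxis[2]],
  ];
}

// Made-up frontal face: axes line up with the image axes
const frontalMatrix = rotationFromFacePoints([-1, 0, 0], [1, 0, 0], [0, -1, 0], [0, 1, 0]);

// Made-up face turned 90 degrees around the vertical axis:
// the left/right points now differ only in z
const turnedMatrix = rotationFromFacePoints([0, 0, 1], [0, 0, -1], [0, -1, 0], [0, 1, 0]);

console.log(frontalMatrix, turnedMatrix);
```

The frontal case yields the identity matrix, and the turned case still yields a valid rotation even though the x axis has no x or y extent. Extracting roll/yaw/pitch from the matrix is left out here because it depends on the chosen Euler angle convention.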

vladmandic commented 3 years ago

@ButzYung interesting approach - i'll give it a try. although not sure on benefits if mesh coordinates in face-api are not very precise to start with. and in human, we already have a z-axis. i could expand human solution to use this approach if z-axis is not known because face rotation causes points to be hidden?

ButzYung commented 3 years ago

@vladmandic This new method probably won't improve much in face-api, but for human the benefit can be obvious, since we have xy and z coordinates. You don't really need the z-axis directly (the face mesh does not provide enough points to get it directly anyways), but since you can always get the x and y axis (basically just the top/bottom/left/right points of the face outline), you can cross x and y axis to get the z one.

vladmandic commented 3 years ago

@ButzYung will try

vladmandic commented 3 years ago

@ButzYung just an idea - perhaps you can contribute a PR?

ButzYung commented 3 years ago

@vladmandic OK I will give it a try. I think I should contribute more since I am using your library lol

vladmandic commented 3 years ago

it's always welcome! :) take a look at contributing guide (veeeery short, but primarily to run linting to enforce code style): https://github.com/vladmandic/human/blob/main/CONTRIBUTING