mpatacchiola / deepgaze

Computer Vision library for human-computer interaction. It implements Head Pose and Gaze Direction Estimation Using Convolutional Neural Networks, Skin Detection through Backprojection, Motion Detection and Tracking, Saliency Map.
MIT License

About head-pose-estimation by Python implementation #40

Closed jerryhouuu closed 6 years ago

jerryhouuu commented 6 years ago

Hello, I followed your code (ex_dlib_pnp_head_pose_estimation_video.py) to implement pose estimation. I want to get the face pose in a range from +90 to -90 degrees, like in the following picture (image001). I use the six landmarks and their world coordinates to get the pose:

image_points = np.array([
                            (landmarks[4], landmarks[5]),     # Nose tip
                            (landmarks[10], landmarks[11]),   # Chin
                            (landmarks[0], landmarks[1]),     # Left eye left corner
                            (landmarks[2], landmarks[3]),     # Right eye right corner
                            (landmarks[6], landmarks[7]),     # Left mouth corner
                            (landmarks[8], landmarks[9])      # Right mouth corner
                        ], dtype="double")

# 3D model points.
model_points = np.array([
                            (0.0, 0.0, 0.0),             # Nose tip
                            (0.0, -330.0, -65.0),        # Chin
                            (-165.0, 170.0, -135.0),     # Left eye left corner
                            (165.0, 170.0, -135.0),      # Right eye right corner
                            (-150.0, -150.0, -125.0),    # Left mouth corner
                            (150.0, -150.0, -125.0)      # Right mouth corner
                        ])

# camera_matrix and dist_coeffs are the camera intrinsics and distortion coefficients.
# cv2.SOLVEPNP_ITERATIVE is the OpenCV 3 name of the flag (cv2.CV_ITERATIVE in OpenCV 2.4).
(success, rotation_vector, translation_vector) = cv2.solvePnP(model_points, image_points,
                                                               camera_matrix, dist_coeffs,
                                                               flags=cv2.SOLVEPNP_ITERATIVE)
rvec_matrix = cv2.Rodrigues(rotation_vector)[0]
proj_matrix = np.hstack((rvec_matrix, translation_vector))
eulerAngles = -cv2.decomposeProjectionMatrix(proj_matrix)[6]
yaw   = eulerAngles[1]
pitch = eulerAngles[0]
roll  = eulerAngles[2]
# Wrap pitch into the [-90, +90] range
if pitch > 0:
    pitch = 180 - pitch
elif pitch < 0:
    pitch = -180 - pitch
yaw = -yaw

But I have a problem: each single case (yaw, pitch, roll) is correct on its own. For example, in a pure roll movement the roll angle is right, but the pitch is wrong; in a combined movement, one of the three angles is wrong. Could you give me some advice? Thanks.

mpatacchiola commented 6 years ago

Hi @jerryhouuu

Without a printout of the returned values it is hard to find the issue. To debug the code you should pass known parameters and check that the returned values are the expected ones.
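For instance, a minimal sanity check (my sketch, not code from the repository) could feed a known rotation vector through the same Rodrigues / decomposeProjectionMatrix pipeline and look at the output:

import numpy as np
import cv2

# Known test rotation: 30 degrees around the y axis (pure yaw), no translation.
rotation_vector = np.array([[0.0], [np.radians(30.0)], [0.0]])
translation_vector = np.zeros((3, 1))

# Same pipeline as in your snippet: Rodrigues -> projection matrix -> decomposition.
rvec_matrix = cv2.Rodrigues(rotation_vector)[0]
proj_matrix = np.hstack((rvec_matrix, translation_vector))
euler_angles = cv2.decomposeProjectionMatrix(proj_matrix)[6]

# The yaw component should have magnitude ~30 degrees and the other two ~0;
# the sign tells you which convention decomposeProjectionMatrix is using.
print(euler_angles.flatten())

If a single-axis test like this already disagrees with what you expect, the problem is in the angle convention and wrapping, not in solvePnP.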

Moreover you can create your own method for the decomposition and pass the rotation matrix to it. A nice example is given here; the method is the following:

import math
import numpy as np

def rotationMatrixToEulerAngles(R):
    # R is a 3x3 rotation matrix (e.g. the rvec_matrix returned by cv2.Rodrigues).
    # Returns the Euler angles [x, y, z] in radians.
    sy = math.sqrt(R[0, 0] * R[0, 0] + R[1, 0] * R[1, 0])
    singular = sy < 1e-6  # close to gimbal lock
    if not singular:
        x = math.atan2(R[2, 1], R[2, 2])
        y = math.atan2(-R[2, 0], sy)
        z = math.atan2(R[1, 0], R[0, 0])
    else:
        x = math.atan2(-R[1, 2], R[1, 1])
        y = math.atan2(-R[2, 0], sy)
        z = 0
    return np.array([x, y, z])
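It could be plugged into your pipeline roughly like this (a sketch, assuming rvec_matrix is the rotation matrix you already compute with cv2.Rodrigues; the angles come back in radians):

# rvec_matrix is the 3x3 rotation matrix obtained from cv2.Rodrigues above.
pitch, yaw, roll = np.degrees(rotationMatrixToEulerAngles(rvec_matrix))
print("pitch: %.2f  yaw: %.2f  roll: %.2f" % (pitch, yaw, roll))

The mapping x -> pitch, y -> yaw, z -> roll follows the same convention you used with decomposeProjectionMatrix, but the sign and the wrapping still need to be checked against your camera setup.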