EnoxSoftware / OpenCVForUnity

OpenCV for Unity (Unity Asset Plugin)
https://assetstore.unity.com/packages/tools/integration/opencv-for-unity-21088

Problems with solvePnP in OpenCVForUnity #169

Closed mattycorbett closed 9 months ago

mattycorbett commented 9 months ago

Using the code below, I am trying to estimate head pose from the face landmarks output by Google's MediaPipe, using OpenCV's solvePnP. The code compiles and runs, but the output rotation quaternions are 90% zeros, with occasional nonzero rotation values that are never the same and always inaccurate. There is jitter in the landmarks, but it is normally about a pixel, so I do not believe the error is in the MediaPipe results. I believe the error is in the solvePnP implementation. I am truly at a loss: even with the jitter, I wouldn't expect zero quaternions or so much deviation in the results.

Also, I have tested the example code here in Python and it works like a charm. I'm using the same landmarks, functions, and conversions. Any ideas?

`Mat rvec = new Mat(1, 3, CvType.CV_64FC1);
        Mat tvec = new Mat(1, 3, CvType.CV_64FC1);
        Mat rmat = new Mat(3, 3, CvType.CV_64FC1);
        //camera_matrix.put(0, 0, 800, 0, 400, 0, 600, 300, 0, 0, 1);
        camera_matrix.put(0, 0, new double[] { 800, 0, 400, 0, 600, 300, 0, 0, 1.0f });

        dist_coeffs.put(0, 0, 0, 0, 0, 0);

        OpenCVForUnity.CoreModule.Point[] imagePoints = new OpenCVForUnity.CoreModule.Point[6];
        imagePoints[0] = new OpenCVForUnity.CoreModule.Point(landmarks.Landmark[1].X * 800, landmarks.Landmark[1].Y * 600);
        imagePoints[1] = new OpenCVForUnity.CoreModule.Point(landmarks.Landmark[33].X * 800, landmarks.Landmark[33].Y * 600);
        imagePoints[2] = new OpenCVForUnity.CoreModule.Point(landmarks.Landmark[61].X * 800, landmarks.Landmark[61].Y * 600);
        imagePoints[3] = new OpenCVForUnity.CoreModule.Point(landmarks.Landmark[199].X * 800, landmarks.Landmark[199].Y * 600);
        imagePoints[4] = new OpenCVForUnity.CoreModule.Point(landmarks.Landmark[263].X * 800, landmarks.Landmark[263].Y * 600);
        imagePoints[5] = new OpenCVForUnity.CoreModule.Point(landmarks.Landmark[391].X * 800, landmarks.Landmark[391].Y * 600);
        var image_points = new MatOfPoint2f(imagePoints);

        OpenCVForUnity.CoreModule.Point3[] objectPoints = new OpenCVForUnity.CoreModule.Point3[6];
        objectPoints[0] = new OpenCVForUnity.CoreModule.Point3(landmarks.Landmark[1].X * 800, landmarks.Landmark[1].Y * 600, landmarks.Landmark[1].Z * 8000);
        objectPoints[1] = new OpenCVForUnity.CoreModule.Point3(landmarks.Landmark[33].X * 800, landmarks.Landmark[33].Y * 600, landmarks.Landmark[33].Z * 8000);
        objectPoints[2] = new OpenCVForUnity.CoreModule.Point3(landmarks.Landmark[61].X * 800, landmarks.Landmark[61].Y * 600, landmarks.Landmark[61].Z * 8000);
        objectPoints[3] = new OpenCVForUnity.CoreModule.Point3(landmarks.Landmark[199].X * 800, landmarks.Landmark[199].Y * 600, landmarks.Landmark[199].Z * 8000);
        objectPoints[4] = new OpenCVForUnity.CoreModule.Point3(landmarks.Landmark[263].X * 800, landmarks.Landmark[263].Y * 600, landmarks.Landmark[263].Z * 8000);
        objectPoints[5] = new OpenCVForUnity.CoreModule.Point3(landmarks.Landmark[391].X * 800, landmarks.Landmark[391].Y * 600, landmarks.Landmark[391].Z * 8000);
        MatOfPoint3f object_points = new MatOfPoint3f(objectPoints);

        Calib3d.solvePnPRefineVVS(object_points, image_points, camera_matrix, dist_coeffs, rvec, tvec);
        //Debug.Log(rvec.ToString());
        // Convert to unity pose data.
        double[] rvecArr = new double[3];
        rvec.get(0, 0, rvecArr);
        double[] tvecArr = new double[3];
        tvec.get(0, 0, tvecArr);
        PoseData poseData = ARUtils.ConvertRvecTvecToPoseData(rvecArr, tvecArr);
        //Debug.Log(poseData.rot);
        var outQuat = poseData.rot;

        Mat mtxR = new Mat(3, 3, CvType.CV_64FC1);
        Mat mtxQ = new Mat(3, 3, CvType.CV_64FC1);
        Calib3d.Rodrigues(rvec, rmat);

        var angles = Calib3d.RQDecomp3x3(rmat, mtxR, mtxQ);`

An example of the output angles is below. [image]

EnoxSoftware commented 9 months ago

It is unlikely that the OpenCV solvePnP function is broken. First, create a minimal test that processes the exact same values (object_points, image_points...) and compare the results between Python and Unity. Also, the asset "DlibFaceLandmarkDetector" that we sell comes with a sample scene that uses the solvePnP function to estimate head angles from facial landmarks. The code is publicly available and may help in solving the problem. https://github.com/EnoxSoftware/DlibFaceLandmarkDetector/blob/master/Assets/DlibFaceLandmarkDetectorWithOpenCVExample/ARHeadExample/ARHeadWebCamTextureExample.cs
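
One way to set up the minimal test suggested above is to start from a fixed, synthetic data set with a known answer, so that the Python and Unity sides receive exactly the same inputs. The sketch below is only an illustration, not code from this thread or from the asset. It uses the Python standard library only. The six 3D model points and the ground-truth pose (`RVEC`, `TVEC`) are hypothetical values chosen for the example, while the intrinsics match the camera matrix in the question (fx = 800, fy = 600, cx = 400, cy = 300, no distortion). It projects the model points through a pinhole camera to produce the matching image points.

```python
import math

# Camera intrinsics from the question: fx, fy, cx, cy (no lens distortion).
FX, FY, CX, CY = 800.0, 600.0, 400.0, 300.0

# Hypothetical rigid 3D model points (arbitrary units); NOT taken from the thread.
OBJECT_POINTS = [
    (0.0, 0.0, 0.0),
    (0.0, -330.0, -65.0),
    (-225.0, 170.0, -135.0),
    (225.0, 170.0, -135.0),
    (-150.0, -150.0, -125.0),
    (150.0, -150.0, -125.0),
]

# Hypothetical ground-truth pose: axis-angle rotation vector and translation.
RVEC = (0.1, -0.2, 0.05)
TVEC = (10.0, -20.0, 1500.0)


def rodrigues(rvec):
    """Convert an axis-angle rotation vector to a 3x3 rotation matrix."""
    theta = math.sqrt(sum(c * c for c in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in rvec)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def project(point, rvec, tvec):
    """Project one 3D point to pixel coordinates with the pinhole model."""
    r = rodrigues(rvec)
    # Transform the point into the camera frame: R * p + t.
    xc, yc, zc = (
        sum(r[i][j] * point[j] for j in range(3)) + tvec[i] for i in range(3)
    )
    return (FX * xc / zc + CX, FY * yc / zc + CY)


# Synthetic image points that are exactly consistent with RVEC and TVEC.
IMAGE_POINTS = [project(p, RVEC, TVEC) for p in OBJECT_POINTS]

if __name__ == "__main__":
    for obj, img in zip(OBJECT_POINTS, IMAGE_POINTS):
        print(obj, "->", (round(img[0], 3), round(img[1], 3)))
```

The printed pairs can be hard-coded as object_points and image_points on both sides. Because they were generated from a known pose, a solvePnP call that is set up correctly should recover values close to `RVEC` and `TVEC` in either environment. A large disagreement would then point at how the inputs or the call are prepared rather than at the landmark detector.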

mattycorbett commented 9 months ago

Extremely helpful. Thank you!