asus4 / tf-lite-unity-sample

TensorFlow Lite Samples on Unity
face blendshape and geometry support #310

Closed se7enXF closed 1 year ago

se7enXF commented 1 year ago

What I have done

Following the FaceMesh sample, I added a new model, face_blendshapes.tflite, that runs after FaceMesh detection. It works, but the results are not good. The model can be found here. I think this is because the FaceMesh model is not accurate enough. Here is my code:

using System.Collections.Generic;
using System.Linq;
using UnityEngine;

namespace TensorFlowLite
{
    /// <summary>
    /// Predicts face blendshapes from landmarks. Reference:
    /// https://github.com/google/mediapipe/blob/ef4a8cde428a0c25fea37e8466d27703e43e0a50/mediapipe/tasks/cc/vision/face_landmarker/face_blendshapes_graph.cc
    /// </summary>
    public class FaceBlendShape : System.IDisposable
    {
        /// <summary>
        /// landmark subset index, 0~467 is base face MeshLandmark, 468~472 is LeftIrisLandmark, 473~477 is RightIrisLandmark
        /// </summary>
        readonly List<int> landmarkSubsetIndex = new()
        {
            0,   1,   4,   5,   6,   7,   8,   10,  13,  14,  17,  21,  33,  37,  39,
            40,  46,  52,  53,  54,  55,  58,  61,  63,  65,  66,  67,  70,  78,  80,
            81,  82,  84,  87,  88,  91,  93,  95,  103, 105, 107, 109, 127, 132, 133,
            136, 144, 145, 146, 148, 149, 150, 152, 153, 154, 155, 157, 158, 159, 160,
            161, 162, 163, 168, 172, 173, 176, 178, 181, 185, 191, 195, 197, 234, 246,
            249, 251, 263, 267, 269, 270, 276, 282, 283, 284, 285, 288, 291, 293, 295,
            296, 297, 300, 308, 310, 311, 312, 314, 317, 318, 321, 323, 324, 332, 334,
            336, 338, 356, 361, 362, 365, 373, 374, 375, 377, 378, 379, 380, 381, 382,
            384, 385, 386, 387, 388, 389, 390, 397, 398, 400, 402, 405, 409, 415, 454,
            466, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477
        };

        protected readonly Interpreter interpreter;
        protected readonly int width;
        protected readonly int channels;
        protected readonly float[,] inputTensor;

        public const int BS_COUNT = 52;
        private float[] output0 = new float[BS_COUNT];

        public FaceBlendShape(string modelPath) 
        {
            var options = new InterpreterOptions();
            options.AddGpuDelegate();
            try
            {
                interpreter = new Interpreter(FileUtil.LoadFile(modelPath), options);
            }
            catch (System.Exception e)
            {
                interpreter?.Dispose();
                throw; // rethrow without resetting the stack trace
            }

            interpreter.LogIOInfo();
            // Initialize inputs
            {
                var inputShape0 = interpreter.GetInputTensorInfo(0).shape;
                width = inputShape0[1];
                channels = inputShape0[2];
                inputTensor = new float[width, channels];

                int inputCount = interpreter.GetInputTensorCount();
                for (int i = 0; i < inputCount; i++)
                {
                    int[] shape = interpreter.GetInputTensorInfo(i).shape;
                    interpreter.ResizeInputTensor(i, shape);
                }
                interpreter.AllocateTensors();
            }
        }

        public virtual void Dispose()
        {
            interpreter?.Dispose();
        }

        public void Invoke(Vector3[] normalizedLandmarks)
        {

            GetLandmarksSubsetAsInput(normalizedLandmarks);

            interpreter.SetInputTensorData(0, inputTensor);
            interpreter.Invoke();
            interpreter.GetOutputTensorData(0, output0);
        }

        public float[] GetResult()
        {
            return output0;
        }

        /// <summary>
        /// Convert 468 3D landmark points to the 146-point 2D landmark subset
        /// </summary>
        /// <param name="normalizedLandmarks">The 468 normalized face mesh landmarks</param>
        void GetLandmarksSubsetAsInput(Vector3[] normalizedLandmarks)
        {
            // Extend normalizedLandmarks to 478 points
            // Use left eye center as LeftIrisLandmark
            // Use right eye center as RightIrisLandmark
            var center = (normalizedLandmarks[159] + normalizedLandmarks[145]) / 2;
            var LeftEye = Enumerable.Repeat(center, 5).ToArray();
            center = (normalizedLandmarks[386] + normalizedLandmarks[374]) / 2;
            var RightEye = Enumerable.Repeat(center, 5).ToArray();

            List<Vector3> list = new();
            list.AddRange(normalizedLandmarks);
            list.AddRange(LeftEye);
            list.AddRange(RightEye);
            normalizedLandmarks = list.ToArray();

            // Get landmark subset as model input
            for (int i = 0; i < landmarkSubsetIndex.Count; i++)
            {
                var ldmk = normalizedLandmarks[landmarkSubsetIndex[i]];
                inputTensor[i, 0] = ldmk.x;
                inputTensor[i, 1] = ldmk.y;
            }
        }
    }
}
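The subset step in the class above can also be sketched without Unity, which makes it easier to test in isolation. This is only a minimal sketch under a few assumptions: `LandmarkSubset` and `ToInput` are hypothetical names, `System.Numerics.Vector3` stands in for `UnityEngine.Vector3`, and the input is assumed to be either 468 points (mesh only) or 478 points (mesh plus iris). With 468 points it falls back to the same eyelid-midpoint approximation used above. With 478 points it reads the iris landmarks directly instead of appending approximations.

```csharp
using System;
using System.Numerics;

// Hypothetical helper, not part of tf-lite-unity-sample.
public static class LandmarkSubset
{
    const int MeshCount = 468;                        // base face mesh points
    const int IrisCount = 5;                          // points per iris
    const int FullCount = MeshCount + 2 * IrisCount;  // 478

    // Builds the [N, 2] input (x, y per subset point) from 468 or 478 landmarks.
    public static float[,] ToInput(Vector3[] landmarks, int[] subsetIndex)
    {
        if (landmarks.Length != MeshCount && landmarks.Length != FullCount)
        {
            throw new ArgumentException(
                $"Expected {MeshCount} or {FullCount} landmarks, got {landmarks.Length}.");
        }

        // Fallback iris positions: midpoint of upper and lower eyelid,
        // the same approximation as in the class above.
        Vector3 leftCenter = (landmarks[159] + landmarks[145]) / 2f;
        Vector3 rightCenter = (landmarks[386] + landmarks[374]) / 2f;

        var input = new float[subsetIndex.Length, 2];
        for (int i = 0; i < subsetIndex.Length; i++)
        {
            int idx = subsetIndex[i];
            if (idx < 0 || idx >= FullCount)
            {
                throw new ArgumentOutOfRangeException(nameof(subsetIndex));
            }

            // Use the real landmark when it exists, otherwise the approximation.
            Vector3 p = idx < landmarks.Length ? landmarks[idx]
                      : idx < MeshCount + IrisCount ? leftCenter
                      : rightCenter;

            input[i, 0] = p.X;
            input[i, 1] = p.Y;
        }
        return input;
    }
}
```

In a Unity project the same logic would use `UnityEngine.Vector3` and its lowercase `x` and `y` fields. Allocating the output once and reusing it, as the class above does with `inputTensor`, avoids per-frame garbage.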

Face blendshape and geometry support

Recently, MediaPipe released a new face landmark solution. It includes new models such as "FaceDetector: 192 x 192, FaceMesh-V2: 256 x 256, Blendshape: 1 x 146 x 2". I tested its web demo, and the face mesh is better than before. However, if you check the source code of the new solution, you will find that it uses face_landmarker_v2_with_blendshapes.task as the backend model, which I cannot use as a .tflite file.

I tried compiling the face_landmarker code but failed. It also has a solution for face_geometry.

Any good ideas to solve the problem?

asus4 commented 1 year ago

Hi @se7enXF,

The face_landmarker_v2_with_blendshapes.task is just a zip file. You will find 3 tflite files after unzipping it 👍
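For anyone who prefers to do the unzip step in code, here is a minimal sketch using the standard `System.IO.Compression` API. `TaskBundle` and `ExtractTfliteModels` are hypothetical names, and the sketch assumes only what is stated above, namely that the .task file is an ordinary zip archive containing .tflite entries. It makes no assumption about the entry names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of tf-lite-unity-sample or MediaPipe.
public static class TaskBundle
{
    // Extracts every *.tflite entry of a .task archive into outputDir
    // and returns the paths of the files written.
    public static string[] ExtractTfliteModels(string taskPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        var written = new List<string>();

        using (ZipArchive archive = ZipFile.OpenRead(taskPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                if (!entry.Name.EndsWith(".tflite", StringComparison.OrdinalIgnoreCase))
                {
                    continue; // skip anything that is not a model file
                }

                // entry.Name is the file name only, so the output cannot
                // land outside outputDir even if the entry path is unusual.
                string destination = Path.Combine(outputDir, entry.Name);
                entry.ExtractToFile(destination, overwrite: true);
                written.Add(destination);
            }
        }
        return written.ToArray();
    }
}
```

The extracted files can then be loaded like any other model in the samples, for example by passing the path of face_blendshapes.tflite to the `FaceBlendShape` constructor from the first post.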

se7enXF commented 1 year ago

Thanks for the reply, @asus4. I found out that it could be unzipped after opening this issue, but I still have not figured out how to calculate face geometry accurately. The face_blendshapes.tflite can be used with my code above after a few small changes.

asus4 commented 1 year ago

@se7enXF Okay, can you please provide more context about what is wrong with the accuracy?

As I mentioned in Issue #306, I've confirmed that the blend shape with 3D landmarks worked, but the 3D depth is not projected correctly with the Unity camera.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.