iris tracking using mediapipe

lghasemzadeh commented 3 years ago

Hello,

I recently found this library: https://blog.tensorflow.org/2020/11/iris-landmark-tracking-in-browser-with-MediaPipe-and-TensorFlowJS.html and I would like to use its updated version, which does iris detection (the second GIF on that page compares the older and newer versions). I want to use the one on the right because:

1) the model is more robust in this version
2) the face mesh does not shake
3) the face mesh does not disappear during extreme head poses
4) it does iris detection
5) it shows whether the eyes are open or closed via the eye region landmarks: when the eyes are closed, the eyelid landmarks come very close to each other.
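
To make point 5 concrete, here is a minimal sketch of how eye openness could be measured from landmarks. It assumes four (x, y) points per eye (upper lid, lower lid, and the two corners). The function name `eye_openness` and the sample coordinates are made up for illustration and do not come from the library.

```python
import math


def eye_openness(upper_lid, lower_lid, inner_corner, outer_corner):
    """Return the lid gap divided by the eye width.

    Values near 0 suggest a closed eye. Dividing by the eye width makes the
    number roughly independent of how large the face appears in the image.
    """
    lid_gap = math.hypot(upper_lid[0] - lower_lid[0], upper_lid[1] - lower_lid[1])
    eye_width = math.hypot(
        inner_corner[0] - outer_corner[0], inner_corner[1] - outer_corner[1]
    )
    if eye_width == 0:
        raise ValueError("eye corners coincide; cannot normalise")
    return lid_gap / eye_width


# Illustrative pixel coordinates only (not real model output).
open_eye = eye_openness((50, 40), (50, 52), (30, 46), (70, 46))
closed_eye = eye_openness((50, 45), (50, 46), (30, 46), (70, 46))
```

Which landmark indices correspond to the lids and corners depends on the model, so that mapping is left out here.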

But this library is for JavaScript, a language I don't know, and of course I can't install and use it in my Python scripts. I am looking for a Python version of it, or a converter from JS to Python.

Would you please help/guide me through this issue? Does your model give me what I am looking for?

Thank you

yinguobing commented 3 years ago

Hi @lghasemzadeh

I think the iris model is an updated version of the previous landmark model. I need some time to figure it out.

Meanwhile, it seems Google has provided a Python API for Mediapipe: https://google.github.io/mediapipe/solutions/face_mesh#python-solution-api

I think the official python API is a better solution. Would you like to try that first?
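
For reference, a rough sketch of what calling that Python API could look like. It assumes the `mediapipe` package is installed and that the image is an RGB `uint8` array. The helper names `landmarks_to_pixels` and `detect_face_landmarks` are made up for this example.

```python
def landmarks_to_pixels(landmarks, width, height):
    """Convert normalised (x, y) pairs in [0, 1] to integer pixel coordinates."""
    return [(int(x * width), int(y * height)) for x, y in landmarks]


def detect_face_landmarks(image_rgb):
    """Run MediaPipe Face Mesh on a single RGB image (H x W x 3, uint8).

    Returns normalised (x, y) pairs for the first detected face,
    or an empty list when no face is found.
    """
    # Imported lazily so the pure helper above works without mediapipe.
    import mediapipe as mp

    with mp.solutions.face_mesh.FaceMesh(
        static_image_mode=True, max_num_faces=1
    ) as face_mesh:
        results = face_mesh.process(image_rgb)

    if not results.multi_face_landmarks:
        return []
    return [(p.x, p.y) for p in results.multi_face_landmarks[0].landmark]
```

If the image is loaded with OpenCV, it would need converting from BGR to RGB before being passed in.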

lghasemzadeh commented 3 years ago

@yinguobing Thank you for your response. The link you sent is the older version (as far as I understand), which doesn't provide iris tracking. I recorded 3 videos from 3 different demos; I will send them to you through your Gmail to show exactly what I am looking for. Please check them out. Thanks

yinguobing commented 3 years ago

Hi @lghasemzadeh

Do not worry, you described the issue quite well.

I watched the videos in the mail. After a short investigation, I found something interesting. But before that, we need to clarify some concepts that will come up later.

First, mediapipe. This is a collection of ML solutions provided by Google, which supports multiple runtime environments and various programming languages. Mediapipe is not a model, but it does host many deep learning models that fulfill different tasks. Iris detection is one of them. Mediapipe also provides some other algorithms besides deep learning models.

Second, TensorFlow.js. From the official description, "TensorFlow.js is a library for machine learning in JavaScript". This is not a model either. You can think of it as a bridge to let the web browser run the deep learning model directly.

Finally, the model. Most of the time we use 'model' to refer to deep neural network architectures, like "ResNet", "MobileNet", etc. When you hear people talking about "training" models, they are very likely talking about these architectures.

In the blog post from Google you linked above, the authors wrote:

Today, we’re excited to add iris tracking to this package through the TensorFlow.js face landmarks detection model. This work is made possible by the MediaPipe Iris model. We have deprecated the original facemesh model, and future updates will be made to the face landmarks detection model.

There are three "models": face landmarks detection model, Iris model and facemesh model. This could be very confusing as they are all named "models". But following the hyperlinks, it's not hard to tell that:

face landmarks detection model is an NPM package.
Iris model is a solution of Mediapipe.
facemesh model is another NPM package.

So what the authors wanted to say is something like "TensorFlow.js users, we are providing a new NPM package named face landmarks detection, which also has iris detection features. Do not use the old facemesh package any more."

Does this make any sense to you?

If the answer is yes, let's move on to the actual models.

Mediapipe is a collection of solutions. Iris detection is one of them. It is called a solution rather than a model because one solution could rely on multiple deep learning models and some other dependencies. From the official documentation you can find that there are actually three models involved in iris detection: Face Detection Model, Face Landmark Model and Iris Landmark Model.
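
As a rough illustration of why it is a solution and not a single model, the three stages could be chained like this. This is only a sketch under the assumption that each stage works on a crop produced from the previous stage's output. All function names are hypothetical, the lambdas stand in for real models, and the image is a plain nested list to keep the example dependency-free.

```python
def crop(image, box):
    """Crop a nested-list image to box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def run_iris_pipeline(
    image,
    detect_face,
    find_face_landmarks,
    eye_box_from_landmarks,
    find_iris_landmarks,
):
    """Chain the stages: face detection -> face landmarks -> iris landmarks."""
    face_box = detect_face(image)
    if face_box is None:
        return None  # no face found, so the later stages have nothing to work on

    face = crop(image, face_box)
    face_landmarks = find_face_landmarks(face)

    # The iris stage only looks at a small region around one eye.
    eye = crop(face, eye_box_from_landmarks(face_landmarks))
    return {
        "face_box": face_box,
        "face_landmarks": face_landmarks,
        "iris_landmarks": find_iris_landmarks(eye),
    }


# Stand-in "models" returning fixed values, just to show the data flow.
image = [[r * 10 + c for c in range(6)] for r in range(6)]
result = run_iris_pipeline(
    image,
    detect_face=lambda img: (1, 1, 5, 5),
    find_face_landmarks=lambda face: [(1, 1), (3, 2)],
    eye_box_from_landmarks=lambda points: (1, 1, 3, 2),
    find_iris_landmarks=lambda eye: [(len(eye[0]) / 2, len(eye) / 2)],
)
```

The real graph also handles things like coordinate mapping back to the full image, which this sketch leaves out.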

At first Mediapipe did not provide Python APIs, and that's the reason why I open-sourced this repo: to let people run face mesh detection with Python. However, it seems that Google is providing Python APIs for face landmarks now. I don't know when this happened, but it is good news for us. We should stick to the official Python API, especially for application development.

Unfortunately, face detection and iris detection currently do not have Python APIs. It could take a while before we figure out how to run them with Python. The good news is that they all support C++. If you are really in a hurry and don't mind writing some C++ code, this could be a good starting point.

Best,

lghasemzadeh commented 3 years ago

@yinguobing thank you very much for the very detailed explanation :) I learned a lot and got answers to many of my questions.

Yes, I am in a hurry and really need facemesh together with iris detection. But I don't know C++; the languages I know are Python, Matlab and R. Does any other solution come to mind?

I talked to the developer of the third link that I sent you (Human), and he said I need to convert TensorFlow.js to Python (tfjs to tf) or find a saved model. But I don't know exactly what these are or how to do it. Would the converted or saved model give me what I am looking for? What are their outputs? Do they work the same as the original ones?

yinguobing commented 3 years ago

SavedModel is a file format that TensorFlow uses to store a neural network. But if you pay close attention you will find that mediapipe does not provide this kind of model, and I don't think there is any chance we could get one. Instead, they only track the TFLite models on GitHub.

Inference with a TFLite model is possible with TensorFlow Lite. This is exactly what I've done in this repository. But it could also be hard because the model is only part of the solution. You need to figure out:

Is there any preprocessing that needs to be done, like normalization?
What is the input data, and what format does it have (dimensions, data type, channels first or last)?
How are the three models chained together?

This information is hidden in the C++ code of mediapipe. If you translate that code into Python, you should get an identical result.
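
Some of the input questions can be answered from the model file itself. Below is a minimal sketch, assuming TensorFlow is installed and you have a `.tflite` file on disk. The helper names are made up for this example, and the channel-layout guess is only a heuristic. Preprocessing such as normalization cannot be read from the file this way and still has to come from the C++ code.

```python
def describe_inputs(input_details):
    """Summarise the dicts returned by Interpreter.get_input_details()."""
    summary = []
    for detail in input_details:
        shape = [int(d) for d in detail["shape"]]
        # Heuristic (an assumption): a last dimension of 1 or 3 suggests
        # a channels-last image tensor such as [batch, height, width, 3].
        layout = "channels_last" if shape and shape[-1] in (1, 3) else "unknown"
        dtype = detail["dtype"]
        summary.append(
            {
                "name": detail["name"],
                "shape": shape,
                "dtype": getattr(dtype, "__name__", str(dtype)),
                "layout": layout,
            }
        )
    return summary


def inspect_tflite_model(model_path):
    """Load a TFLite file and report what its inputs look like."""
    # Imported lazily so describe_inputs() can be used without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return describe_inputs(interpreter.get_input_details())


# Hand-written stand-in for real input details (name and shape are illustrative).
example = describe_inputs(
    [{"name": "input_1", "shape": [1, 192, 192, 3], "dtype": float}]
)
```

The same idea works for outputs via `get_output_details()`.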

lghasemzadeh commented 3 years ago

Hello @yinguobing thank you for the answer. I really enjoy learning from and discussing with people who are working in this field :)

Regarding the issue we discussed above, I am asking about it in different forums to find my way and learn. Here is one of them: https://github.com/tensorflow/tfjs/issues/4395 Would you please check it and let me know what you think about the converter that is suggested there?

yinguobing commented 3 years ago

That's interesting! I don't know JS, but it seems that TensorFlow.js uses a different format for saving models?

lghasemzadeh commented 3 years ago

Yes, the problem is that I don't know JS either :) If I find something I will let you know as well. At least we can discuss things and I will learn from you.

Thank you very much for your help

lghasemzadeh commented 3 years ago

Hello @yinguobing I got a bit confused!

I decided to learn JS, but when I went to install iris tracking from the mediapipe solutions, I saw in the table that iris detection is not provided for JS!

Do these two links refer to the same package? Are these two libraries the same? First Second If yes, why are their names different? And, more importantly, why is one of them provided for JS but not the other? If no, is the first link the main library, and the second one just its JS version?

The first link clearly shows that the library is for JS, but at the second link, under solutions, there is a table which shows that iris detection is not provided for JS!

Screenshot from 2021-02-17 09-00-07

Would you please help me figure out what is going on?

yinguobing commented 3 years ago

I think the absence of the IRIS checkmark for JS means "Iris detection is not offered as a standalone solution in JS. It is part of the face landmark solution".

lghasemzadeh commented 3 years ago

Thank you :)

yinguobing commented 3 years ago

Just curious, have you solved this issue?

lghasemzadeh commented 1 year ago

Hello,

I stopped the investigation for a while. Today I checked the mediapipe website and saw that they have changed the website and repos, and launched a new version of mediapipe which provides the iris solution for Python as well. Now the problem is that I don't want to install the new version of mediapipe :D I want to stay with mediapipe 0.8.3, which doesn't have the iris solution for Python (referring to the table above). I want to access the iris solution of 0.8.3 using C++. I only want to run a C++ example of the iris solution, nothing more. But since the website has changed completely, I don't know how to do so. Do you have any idea?

Thank you

yinguobing commented 1 year ago

Try these:

Dig out the model file mediapipe used. Find out in what format it was exported (Maybe TFLite?).
Convert the exported model into ONNX format. https://github.com/onnx/onnxmltools
Run model inference with ONNXRuntime. https://onnxruntime.ai/

This way, you no longer need mediapipe.
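
A minimal sketch of the last step, assuming the `onnxruntime` package is installed and you already have a converted `.onnx` file (the conversion itself is not shown). The helper names `build_feed` and `run_onnx_model` are made up for this example.

```python
def build_feed(input_names, arrays):
    """Pair model input names with input arrays, checking that the counts match."""
    if len(input_names) != len(arrays):
        raise ValueError(
            f"model expects {len(input_names)} inputs, got {len(arrays)}"
        )
    return dict(zip(input_names, arrays))


def run_onnx_model(model_path, arrays):
    """Run one inference pass on CPU and return the list of output arrays."""
    # Imported lazily so build_feed() can be used without onnxruntime.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_names = [model_input.name for model_input in session.get_inputs()]
    # Passing None as the first argument asks for all model outputs.
    return session.run(None, build_feed(input_names, arrays))
```

The arrays still need the same preprocessing the original pipeline applied, so the outputs are only comparable once that part is reproduced.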

yinguobing / face-mesh-generator

iris tracking using mediapipe #4