google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

How to detect Emotions using Face Mesh Landmarks Mediapipe #3135

Closed msreevani060 closed 2 years ago

msreevani060 commented 2 years ago

Hi,

I have reproduced the following tutorial on my system:

https://www.analyticsvidhya.com/blog/2021/07/facial-landmark-detection-simplified-with-opencv/

With this I could only get the facial landmarks. I am not sure how to go from the landmarks to Euclidean distances across all 468 landmarks for emotion detection.

Can anyone please give me some guidance?

sureshdagooglecom commented 2 years ago

Hi @msreevani060, you can calculate the Euclidean distance between two landmarks using this method, passing the landmark points as arguments:

    import math

    def euclaideanDistance(point, point1):
        # Each point is an (x, y) pair
        x, y = point
        x1, y1 = point1
        distance = math.sqrt((x1 - x)**2 + (y1 - y)**2)
        return distance
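As a usage sketch for the distance calculation above: MediaPipe Face Mesh reports landmark coordinates normalized to [0, 1] by image width and height, so it usually makes sense to scale them to pixels before measuring distances. The helper names `to_pixels` and `pixel_distance`, the image size, and the sample points below are hypothetical and only for illustration.

```python
import math

def to_pixels(norm_point, width, height):
    """Scale a normalized (x, y) landmark to pixel coordinates."""
    nx, ny = norm_point
    return (nx * width, ny * height)

def pixel_distance(norm_a, norm_b, width, height):
    """Euclidean distance in pixels between two normalized landmarks."""
    ax, ay = to_pixels(norm_a, width, height)
    bx, by = to_pixels(norm_b, width, height)
    return math.hypot(bx - ax, by - ay)

# Two made-up landmarks on a 640x480 frame, half the image width apart.
d = pixel_distance((0.25, 0.5), (0.75, 0.5), 640, 480)
print(d)
```

Without this scaling, horizontal and vertical offsets are measured in different units whenever the image is not square.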

msreevani060 commented 2 years ago

Yes, that I can do. But how do I pass these distance scores into a neural network or any other classification model?

Do you have any reference that follows this methodology?

sureshdagooglecom commented 2 years ago

Hi @msreevani060, please share your code changes so we can investigate this further.

google-ml-butler[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 2 years ago

Closing as stale. Please reopen if you'd like to work on this further.

Oliver2552 commented 8 months ago

Hi,

Once you build a pairwise distance (or graph adjacency) matrix from your 468 landmarks, you should have a 468x468 matrix. You can then use that as input to a CNN, specifying an input shape of (468, 468, 1) (a single channel). Alternatively, you can explore graph neural networks (GNN) or graph convolutional neural networks (GCNN).
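A minimal sketch of building that matrix, using only the standard library (`math.dist` needs Python 3.8+). The function names `distance_matrix` and `normalize` are hypothetical, and the three sample points stand in for the 468 real landmarks. Dividing by the largest distance is my own assumption, added so the values do not depend on how large the face appears in the frame.

```python
import math

def distance_matrix(points):
    """Return a symmetric n x n matrix of pairwise Euclidean distances."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            matrix[i][j] = d
            matrix[j][i] = d  # distance is symmetric
    return matrix

def normalize(matrix):
    """Scale all entries by the largest distance (assumed normalization)."""
    largest = max(max(row) for row in matrix)
    if largest == 0:
        return matrix
    return [[value / largest for value in row] for row in matrix]

# Three made-up points; with real data this list would hold 468 landmarks.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
raw = distance_matrix(points)
scaled = normalize(raw)
print(raw[0][1], scaled[0][2])
```

With 468 landmarks the result is 468x468. To match an input shape of (468, 468, 1) you would still need to add a trailing channel axis in whichever framework you use.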

That being said, while MediaPipe offers 468 facial landmarks, using all of them may not be the best approach computationally: every pairwise distance has to be computed, which is expensive. Consider narrowing down the landmarks you choose and checking which of the 468 most effectively capture emotion (e.g. those around the eyes, nose, mouth, and eyebrows).
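A rough sketch of that narrowing-down step, assuming you have already picked a subset of landmark indices. The values in `SELECTED_INDICES` are placeholders, not real eye or mouth indices from the Face Mesh topology, and `distance_features` is a hypothetical helper name. The output is a flat feature vector, which is a shape that a generic classifier can take as one row per face.

```python
import math
from itertools import combinations

# Placeholder indices only: replace with the landmark indices you find
# most informative (these are NOT actual MediaPipe eye/mouth indices).
SELECTED_INDICES = [0, 2, 5, 7]

def distance_features(landmarks, indices):
    """Flat list of pairwise distances between the selected landmarks."""
    chosen = [landmarks[i] for i in indices]
    return [math.dist(a, b) for a, b in combinations(chosen, 2)]

# Made-up landmarks on a line, so the distances are easy to check by hand.
landmarks = [(float(i), 0.0) for i in range(10)]
features = distance_features(landmarks, SELECTED_INDICES)
print(len(features), features)
```

The saving is large: n landmarks give n(n-1)/2 distinct pairs, so all 468 landmarks produce 109,278 distances, while a subset of 4 produces only 6.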