Open Bersaelor opened 1 year ago
This problem is also reproducible using the default demo from the MediaPipe homepage.
Basic 2d-Mesh of face: (My head is in the first 25% of the webcam image, the screen is just cut since the rest is black)
Now, choosing the avatar while keeping my head perfectly still:
You can see the avatar pointing to the left, even though my face was parallel to the image plane.
To clarify: I'm really impressed with the performance of MediaPipe, inferring those 478 landmarks at 60 fps in the browser with no problem.
Many models and setups I tried before were much less performant, so I was really happy when I discovered MediaPipe's latest advancements.
What's curious to me is that pose estimation based on only a few key points is something other frameworks achieved with rather low effort and good results.
For example, I have seen pose estimation using just 6 points (left_eye, right_eye, nose_tip, subnose, l_ear, r_ear) and solvePnP from the OpenCV calib3d module, and the result predicted the face pose really well.
So that's why I'm surprised that the pose estimation of mediapipe/face_landmarks has the issues described above.
Hi, I'm facing the same issue, and it seems like MediaPipe calculates face landmarks in relation to the camera, while in Three.js, it's aligned with the camera's side rather than the camera itself. Here's a visual to help you understand:
Let me know if you find anything.
Here's a visual to help you understand:
Yes, that is a good illustration. But I would like to point out that the issue also happens with the MediaPipe demo project, which uses a vanilla WebGL 3D graphics renderer, if I understand correctly. So this issue isn't tied to Three.js directly.
@schmidt-sebastian @kuaashish @yichunk any update or progress on this issue now that it's been a month?
I fear that while this error exists, the MediaPipe solution for face detection can only be used for toy examples, as even the demo on the website doesn't work when the user's face isn't centered.
Will get back to you when this is fixed, but unfortunately cannot promise a timeline yet.
@Bersaelor Hello, I encountered the exact same issue when using facial_transformation_matrixes to transform a model to the camera coordinate system. I noticed that the rendering result is misaligned (even though the 2D landmark results seem correct). Do you have any solutions for this problem?
Dear experts, sorry to bother you. I am working on a problem that might be similar to yours. Could you please help?
I currently have NormalizedLandmark_A (which can be drawn on the image once multiplied by image_A's width and height) and facial_transformation_matrixes_A (a 4x4 ndarray). I need to transfer this expression onto another reference image_B using its facial_transformation_matrixes_B. Do you have any suggestions for how to do this? Thank you.
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)
Yes
OS Platform and Distribution
Web/Chrome/JS
Programming Language and version
JS/TS
MediaPipe version
0.10.4
Bazel version
No response
Solution
n.A.
Android Studio, NDK, SDK versions (if issue is related to building in Android environment)
No response
Xcode & Tulsi version (if issue is related to building for iOS)
No response
Describe the actual behavior
The 2D landmarks in results.faceLandmarks look correct, as the 2D canvas overlay drawn using drawingUtils.drawConnectors appears in the right spot.
But when putting 3D objects on the face, using facialTransformationMatrixes in a Face Landmarker application, the 3D content seems to be skewed when the face is not centered. I.e.:
Center: The forward (z) vector points straight out of the screen.
Left of image: The forward (z) vector points left, even though the face is still parallel to the image.
Right of image: The forward (z) vector points right, even though the face is still parallel to the image.
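One way to put a number on the skew described above is to read the forward (z) axis directly out of the matrix and convert it to a yaw angle. The sketch below assumes the matrix arrives as 16 numbers in column-major order (the layout that `THREE.Matrix4.fromArray` expects). `forwardAxis` and `yawDegrees` are hypothetical helper names for illustration, not part of MediaPipe.

```typescript
type Vec3 = [number, number, number];

// Returns the normalized forward (z) axis of a 4x4 transform.
// Assumption: `data` holds 16 numbers in column-major order, so the
// third column (indices 8, 9, 10) is the transformed z axis.
function forwardAxis(data: ReadonlyArray<number>): Vec3 {
  if (data.length !== 16) {
    throw new Error("expected a 4x4 matrix with 16 elements");
  }
  const x = data[8];
  const y = data[9];
  const z = data[10];
  const len = Math.hypot(x, y, z);
  if (len === 0) {
    throw new Error("forward axis has zero length");
  }
  return [x / len, y / len, z / len];
}

// Yaw in degrees: rotation of the forward axis about the vertical (y) axis.
// 0 means the forward axis lies along +z; the sign shows which side it leans to.
function yawDegrees(data: ReadonlyArray<number>): number {
  const [x, , z] = forwardAxis(data);
  return (Math.atan2(x, z) * 180) / Math.PI;
}
```

Logging `yawDegrees(matrix.data)` while holding the face parallel to the image plane at the left, center, and right of the frame would show whether the reported yaw drifts with horizontal position.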
Describe the expected behaviour
Regardless of where the face is in the frame, the facialTransformationMatrixes should be a transformation from the canonical face to the predicted metric face in the 3D space.
Standalone code/steps you may have used to try to get what you need
My scene is created in THREE.js. The camera is set up as: (with the position at 0,0,0 looking into the negative z direction, as described in the docs)
The point of the face position is (and has a little axis helper in it for visualization).
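A possible consistency check between the 2D and 3D results is to project the matrix translation (elements 12 to 14 of a column-major array) through the same pinhole camera used for rendering and compare the result with a 2D landmark near the center of the face. The sketch below is a minimal version under stated assumptions: the camera sits at the origin looking down negative z as described above, the image is not mirrored, and the 63 degree vertical field of view in the usage note is a placeholder that must match the camera actually used. `PinholeCamera` and `projectToImage` are hypothetical names, not part of MediaPipe or Three.js.

```typescript
interface PinholeCamera {
  verticalFovDegrees: number; // full vertical field of view
  aspect: number;             // image width divided by image height
}

// Projects a camera-space point (camera at the origin, looking down -z,
// y up) to normalized image coordinates with the origin at the top-left,
// the same convention used by normalized 2D landmarks.
function projectToImage(
  point: [number, number, number],
  camera: PinholeCamera
): [number, number] {
  const [x, y, z] = point;
  if (z >= 0) {
    throw new Error("point must be in front of the camera (z < 0)");
  }
  const tanHalfFov = Math.tan((camera.verticalFovDegrees * Math.PI) / 360);
  // Normalized device coordinates in [-1, 1] for points inside the frustum.
  const ndcX = x / (-z * tanHalfFov * camera.aspect);
  const ndcY = y / (-z * tanHalfFov);
  // Flip y because image coordinates grow downwards.
  return [(ndcX + 1) / 2, (1 - ndcY) / 2];
}
```

For example, `projectToImage([m[12], m[13], m[14]], { verticalFovDegrees: 63, aspect: video.videoWidth / video.videoHeight })`, with `m` being the matrix data and `video` the webcam element, gives a point that can be compared against a landmark from results.faceLandmarks. A large gap between the two would suggest that the render camera and the camera model behind the matrix disagree.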