google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://mediapipe.dev
Apache License 2.0

Help in finding the right projection matrix used at runtime #4977

Open cideck opened 7 months ago

cideck commented 7 months ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

Web / Javascript

MediaPipe Tasks SDK version

v0.10.8 (latest)

Task name (e.g. Image classification, Gesture recognition etc.)

Face Landmarker

Programming Language and version (e.g. C++, Python, Java)

Javascript

Describe the actual behavior

Hello @kostyaby,

Thank you so much for providing such an extensive and detailed explanation of how to combine the "transformation matrix" and the "projection matrix" to transform the canonical face mesh model into actual screen coordinates. I found this post very informative: https://github.com/google/mediapipe/issues/1642#issuecomment-794572140
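For reference, here is a minimal sketch of how I understand that combination: a canonical-mesh vertex is multiplied by the face transformation matrix, then by the projection matrix, then divided by `w` and mapped to pixels. The helper names `transformPoint` and `projectToPixels` are my own, and the sketch assumes both matrices are flat 16-element arrays in column-major order and that the screen origin is the top-left corner. I have not confirmed either assumption against MediaPipe.

```javascript
// Multiply a column-major 4x4 matrix (flat array of 16 numbers)
// by a column vector [x, y, z, w].
function transformPoint(m, [x, y, z, w]) {
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12] * w,
    m[1] * x + m[5] * y + m[9] * z + m[13] * w,
    m[2] * x + m[6] * y + m[10] * z + m[14] * w,
    m[3] * x + m[7] * y + m[11] * z + m[15] * w,
  ];
}

// Hypothetical helper: canonical vertex -> camera space -> clip space -> pixels.
// Assumes a top-left screen origin, hence the Y flip at the end.
function projectToPixels(projection, faceTransform, vertex, width, height) {
  const camera = transformPoint(faceTransform, [vertex[0], vertex[1], vertex[2], 1]);
  const clip = transformPoint(projection, camera);
  const ndcX = clip[0] / clip[3]; // perspective divide
  const ndcY = clip[1] / clip[3];
  return [
    (ndcX * 0.5 + 0.5) * width,
    (1 - (ndcY * 0.5 + 0.5)) * height,
  ];
}
```

With this layout, the translation part of the face transformation sits at indices 12 to 14, which is an easy way to check whether a given matrix really is column-major.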

`FaceLandmarker.detectForVideo` provides the `facialTransformationMatrixes`, but I'm struggling to find the projection matrix that should be used with them. I found the code that creates a projection matrix, and I know the aspect ratio of my webcam, but that code requires three additional parameters (near, far and fov): https://github.com/google/mediapipe/blob/master/mediapipe/modules/face_geometry/libs/effect_renderer.cc#L573-L599

I searched MediaPipe's code but could not find values for near, far and fov. Is there an API that exposes them, or a recommended way to calculate them?
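To make the question concrete, this is a sketch of the standard OpenGL-style perspective matrix in JavaScript, which is what I believe the linked C++ computes. The function name `createPerspectiveMatrix` is my own, and the values passed in the example call (63 degrees, near 1, far 10000, a 1280x720 frame) are placeholders that I assumed for illustration. They are exactly the values I do not know how to obtain for the Tasks API.

```javascript
// Standard OpenGL-style perspective projection, stored column-major
// (the layout WebGL expects). Not confirmed to match FaceLandmarker's runtime.
function createPerspectiveMatrix(verticalFovDegrees, aspectRatio, near, far) {
  // Cotangent of half the vertical field of view.
  const f = 1 / Math.tan((verticalFovDegrees * Math.PI) / 360);
  const denom = 1 / (near - far);

  const m = new Float32Array(16); // zero-initialized
  m[0] = f / aspectRatio;
  m[5] = f;
  m[10] = (near + far) * denom;
  m[11] = -1;
  m[14] = 2 * far * near * denom;
  return m;
}

// Placeholder inputs (assumptions, not values read from MediaPipe).
const projection = createPerspectiveMatrix(63, 1280 / 720, 1, 10000);
```

Depending on whether the renderer treats the top-left or bottom-left corner as the origin, `m[5]` may need to be negated to flip the Y axis.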

Describe the expected behaviour

More detailed documentation, or a JavaScript API that returns the appropriate projection matrix.

Standalone code/steps you may have used to try to get what you need

Other info / Complete Logs

No response

kuaashish commented 7 months ago

Hi @schmidt-sebastian,

Could you kindly review this issue? Thank you.