Open mvazquezgts opened 8 months ago
Hi @mvazquezgts,
Could you please provide additional information about the problem. Include the following details:
Providing this information will help us better understand and address the issue.
Thank you!!
OS: Ubuntu
Programming Language: Python
MediaPipe version: 0.10.11
Solution: Holistic
Given an input image/frame the output of the model is:
HolisticLandmarkerResult(face_landmarks=[ NormalizedLandmark(x=0.4745168089866638, y=0.36261075735092163, z=-0.0224269051104784, visibility=0.0, presence=0.0), .... NormalizedLandmark(x=0.5119104385375977, y=0.2810891270637512, z=0.005499151535332203, visibility=0.0, presence=0.0)],
pose_landmarks=[ NormalizedLandmark(x=0.47517403960227966, y=0.3143022358417511, z=-0.9151485562324524, visibility=0.9999208450317383, presence=0.9995543360710144), .... NormalizedLandmark(x=0.41832780838012695, y=1.8102238178253174, z=0.12485508620738983, visibility=0.005907772108912468, presence=0.001108874916099012)],
pose_world_landmarks=[
Landmark(x=-0.046613965183496475, y=-0.5604096055030823, z=-0.3200050890445709, visibility=0.9999208450317383, presence=0.9995543360710144), ... Landmark(x=-0.12133946269750595, y=0.5424543023109436, z=0.04660561680793762, visibility=0.005907772108912468, presence=0.001108874916099012)],
left_hand_landmarks=[ NormalizedLandmark(x=0.5576450228691101, y=0.7599831819534302, z=4.721105995031394e-07, visibility=0.0, presence=0.0), .... NormalizedLandmark(x=0.6063085794448853, y=0.5707101821899414, z=-0.08902209997177124, visibility=0.0, presence=0.0)],
left_hand_world_landmarks=[ Landmark(x=0.019411759451031685, y=-0.2692203223705292, z=-0.36530426144599915, visibility=0.0, presence=0.0), ..... Landmark(x=0.017838725820183754, y=-0.3276180922985077, z=-0.42397379875183105, visibility=0.0, presence=0.0)],
right_hand_landmarks=[ NormalizedLandmark(x=0.3994499146938324, y=0.7287973761558533, z=3.09612943283355e-07, visibility=0.0, presence=0.0), ..... NormalizedLandmark(x=0.3777098059654236, y=0.6123549938201904, z=-0.02483273483812809, visibility=0.0, presence=0.0)],
right_hand_world_landmarks=[ Landmark(x=-0.1434965282678604, y=-0.22600455582141876, z=-0.3554910123348236, visibility=0.0, presence=0.0), ..... Landmark(x=-0.14976395666599274, y=-0.30362075567245483, z=-0.39388307929039, visibility=0.0, presence=0.0)],
face_blendshapes=None, segmentation_mask=None)
The available data/fields are:
Visibility and presence for both hand and face landmarks are always 0, regardless of how the model is configured or initialised. Only pose_landmarks and pose_world_landmarks carry meaningful values.
So, my question is whether this is a bug or if it is possible to get the confidence/visibility of the hand points in another way. In the hands model (documentation: https://developers.google.com/mediapipe/solutions/vision/hand_landmarker/python) I see that there is a 'Handedness' field that contains this information.
HandLandmarkerResult:
  Handedness:
    Categories #0:
      index : 0
      score : 0.98396
      categoryName : Left
  Landmarks:
    Landmark #0:
      x : 0.638852
      y : 0.671197
      z : -3.41E-7
    Landmark #1:
      x : 0.634599
      y : 0.536441
      z : -0.06984
    ... (21 landmarks for a hand)
  WorldLandmarks:
    Landmark #0:
      x : 0.067485
      y : 0.031084
      z : 0.055223
    Landmark #1:
      x : 0.063209
      y : -0.00382
      z : 0.020920
But it seems that in this version of Holistic there is no way to get a score for hand points.
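As a possible stopgap, one could run the standalone Hand Landmarker next to Holistic and read the handedness score from its result. The sketch below only shows the extraction step. It assumes a result object shaped like the HandLandmarkerResult quoted above (in Python, a `handedness` list holding one list of categories per detected hand, each with `score` and `category_name`). The `Category` and `FakeHandResult` classes and the `hand_scores` helper are hypothetical stand-ins so the snippet runs without MediaPipe or a model file. Note that this score is the confidence of the left/right classification for the whole hand, not a per-landmark visibility.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical stand-ins mirroring the quoted HandLandmarkerResult layout,
# so the sketch runs without MediaPipe installed.
@dataclass
class Category:
    index: int
    score: float
    category_name: str


@dataclass
class FakeHandResult:
    handedness: List[List[Category]]


def hand_scores(result) -> List[Tuple[str, float]]:
    """Return (label, score) of the top handedness category per detected hand."""
    scores = []
    for categories in result.handedness:
        if categories:  # skip hands without any category entry
            top = max(categories, key=lambda c: c.score)
            scores.append((top.category_name, top.score))
    return scores


# Values taken from the HandLandmarkerResult example quoted above.
example = FakeHandResult(handedness=[[Category(0, 0.98396, "Left")]])
print(hand_scores(example))
```

An empty list means no hand was detected, which is at least a coarse per-hand signal that the Holistic output in this issue does not provide.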
Thank you for raising this. As this is our newest Task, we likely need to invest a bit more time here.
It's an old issue:
I am experiencing the same issue (holistic task on web), any updates on this?
wow this is really unprofessional from you guys... there are multiple issues about this. Can someone give a proper explanation of the state of visibility/confidence for pose, face and hand landmark detection? @schmidt-sebastian @kuaashish
Can you at least please confirm that every landmark will be predicted even if it is not present in the image? If so, can it ever be out of bounds of the base image? @schmidt-sebastian @kuaashish https://github.com/google-ai-edge/mediapipe/issues/3159
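For what it's worth, the pose output quoted at the top of this thread already contains a normalized landmark with y=1.8102 (beyond the bottom edge of the image) and a visibility of about 0.006, so at least pose landmarks can be reported outside the frame. A minimal sketch of a client-side filter follows, assuming landmarks expose `x`, `y` and `visibility` as in that output. The `Point` class, the `usable_indices` helper and the 0.5 threshold are hypothetical choices for illustration, not part of MediaPipe.

```python
from dataclasses import dataclass
from typing import List, Sequence


# Hypothetical stand-in for NormalizedLandmark (only the fields used here).
@dataclass
class Point:
    x: float
    y: float
    visibility: float


def usable_indices(landmarks: Sequence[Point], min_visibility: float = 0.5) -> List[int]:
    """Indices of landmarks inside the image with visibility >= min_visibility."""
    return [
        i
        for i, lm in enumerate(landmarks)
        if 0.0 <= lm.x <= 1.0
        and 0.0 <= lm.y <= 1.0
        and lm.visibility >= min_visibility
    ]


# First and last pose landmarks from the output quoted in this issue.
pose = [
    Point(0.47517403960227966, 0.3143022358417511, 0.9999208450317383),
    Point(0.41832780838012695, 1.8102238178253174, 0.005907772108912468),
]
print(usable_indices(pose))  # [0]
```

For the hand and face landmarks in this issue, visibility is always 0.0, so only the bounds check is informative there; passing `min_visibility=0.0` keeps the filter from discarding every point.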
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)
No
OS Platform and Distribution
Ubuntu
MediaPipe Tasks SDK version
0.10.11
Task name (e.g. Image classification, Gesture recognition etc.)
Holistic
Programming Language and version (e.g. C++, Python, Java)
Python
Describe the actual behavior
In the current version, information about the visibility/confidence of the hand keypoints is not available.
Describe the expected behaviour
Provide confidence information for the extracted hand keypoints.
Standalone code/steps you may have used to try to get what you need
Other info / Complete Logs