google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://mediapipe.dev
Apache License 2.0
26.76k stars · 5.08k forks

Holistic Avatar demo #3491

Closed sronen71 closed 1 year ago

sronen71 commented 2 years ago

In Grishchenko et al. (2022), "BlazePose GHUM Holistic: Real-time 3D Human Landmarks and Pose Estimation", the authors write: "To showcase BlazePose GHUM Holistic we created an open-source avatar demo (see Fig. 4) in MediaPipe (https://mediapipe.dev). The demo is available for web browsers and allows to control body and hands of a standard Mixamo avatar at 15 FPS". The link was to MediaPipe's home page and not to the demo itself. Can you point us to the demo and to the demo's open-sourced code? Thank you!

kuaashish commented 2 years ago

Hi @sronen71, have you looked at the 3D Pose Detection with MediaPipe BlazePose GHUM and TensorFlow.js blog post? It includes a demo and clearly documented steps for implementing it in JavaScript. Thank you!

sronen71 commented 2 years ago

Hi @kuaashish, thanks for the pointer. I found a link to a demo (https://3d.kalidoface.com/) within your link. However, this is not the same as the Mixamo avatar demo referred to in the paper and said to be open source. My question is about that specific demo:

[screenshots from the paper attached]

MTdunno commented 2 years ago

In addition to being unable to find the source, I am unable to find the demo at all. There does not seem to be any mention of Mixamo at all, as demonstrated by this search.

sronen71 commented 2 years ago

I've contacted the authors and they say this is not available yet. Planned to be released later this year.

MTdunno commented 2 years ago

I appreciate the update @sronen71. Do you have a way you're solving this in the meantime?

sronen71 commented 2 years ago

@MTdunno for now I've modified the MediaPipe Holistic graph to expose the 3D hand world landmarks. Then, in the Python solution, I merge the pose and hand world landmarks. Since the pose world landmarks are hip-centered and the hand world landmarks are hand-centered, I ended up translating the hands so that each wrist is at the same 3D position as the corresponding body pose wrist.
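A minimal sketch of that merge (illustrative only; it assumes the pose and hand world landmarks have already been stacked into NumPy arrays, and uses the BlazePose wrist indices 15/16 and the hand model's wrist index 0):

import numpy as np

# BlazePose wrist indices in the 33-point pose world landmarks, and the wrist
# index in the 21-point hand model.
POSE_LEFT_WRIST, POSE_RIGHT_WRIST = 15, 16
HAND_WRIST = 0

def align_hand_to_pose(pose_world, hand_world, pose_wrist_idx):
    """Rigidly translate hand-centered hand world landmarks so the hand's wrist
    lands on the corresponding wrist of the hip-centered pose world landmarks.
    pose_world: (33, 3) array, hand_world: (21, 3) array, both in meters."""
    offset = pose_world[pose_wrist_idx] - hand_world[HAND_WRIST]
    return hand_world + offset

# Hypothetical usage, once the landmark lists are converted to arrays:
# left_aligned = align_hand_to_pose(pose_world, left_hand_world, POSE_LEFT_WRIST)
# right_aligned = align_hand_to_pose(pose_world, right_hand_world, POSE_RIGHT_WRIST)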

kuaashish commented 2 years ago

Hi @sronen71, thank you for pointing out this documentation bug. It seems mediapipe.dev does not include the demo that was promised. We are forwarding this issue to the appropriate team and hope it will be resolved soon. Thank you!

seastar105 commented 1 year ago

@kuaashish Any updates?

UtsaChattopadhyay commented 1 year ago

Hi @sronen71. I am trying to translate the hands such that their wrist is at the same 3D position as the body pose wrist. May I know how you did that, or do you have a script for it? Thanks a lot.

MTdunno commented 1 year ago

@chuoling Are there any updates on this? According to sronen71's comment, this demo was planned to be released by the end of the year (19 days away), and I still do not see a place for it on https://mediapipe.dev.

xiang-zhe commented 1 year ago

Hi, @sronen71, @kuaashish

For now I've modified the MediaPipe Holistic graph to expose the 3D hand world landmarks.

It's odd that Holistic only outputs pose world landmarks and not left/right hand world landmarks, and I didn't find any formula or function that can convert hand_landmarks into hand_world_landmarks. @sronen71, could you share your code for this? Thank you very much. Otherwise, I need to run Holistic and then run Hands separately to get both pose_world_landmarks and hand_world_landmarks.
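For reference, that two-solution workaround would look roughly like this (untested sketch using the legacy Python solutions API; it costs a second inference pass and the two solutions may disagree on detections):

import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic
mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic() as holistic, mp_hands.Hands(max_num_hands=2) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Holistic gives hip-centered pose world landmarks plus face/hand image landmarks.
        holistic_results = holistic.process(rgb)
        pose_world = holistic_results.pose_world_landmarks

        # Hands gives hand-centered world landmarks (one entry per detected hand).
        hands_results = hands.process(rgb)
        hand_worlds = hands_results.multi_hand_world_landmarks
        handedness = hands_results.multi_handedness  # to tell left from right

        # ...merge here, e.g. translate each hand so its wrist matches the pose wrist.
cap.release()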

xiang-zhe commented 1 year ago

Hi @sronen71, I see that your code (https://github.com/sronen71/pose/blob/master/holistic.py) has a function to_array() that handles both pose_world_landmarks and hand_world_landmarks, but it seems you modified the C++ sources and built MediaPipe yourself. I am not familiar with C++; is there a pure-Python way to get both pose_world_landmarks and hand_world_landmarks? Otherwise I need to run the Hands solution and the Holistic solution separately to get those world landmarks plus the face mesh. Any help would be greatly appreciated!

xiang-zhe commented 1 year ago

And finally, I got the hand_world_landmarks from Holistic (Python, desktop, Ubuntu 20.04), and I'm recording the steps here, following @sronen71 (https://github.com/sronen71/mediapipe-holistic).

1. Modify holistic_landmark_cpu.pbtxt, hand_landmarks_left_and_right_cpu.pbtxt, and hand_landmarks_from_pose_cpu.pbtxt (in fact, I'm not familiar with the .pbtxt format).

1.1 holistic_landmark_cpu.pbtxt

# 21 left hand landmarks. (NormalizedLandmarkList)
output_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
+ output_stream: "LEFT_HAND_WORLD_LANDMARKS:left_hand_world_landmarks"
# 21 right hand landmarks. (NormalizedLandmarkList)
output_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
+ output_stream: "RIGHT_HAND_WORLD_LANDMARKS:right_hand_world_landmarks"
...
# Predicts left and right hand landmarks based on the initial pose landmarks.
node {
  calculator: "HandLandmarksLeftAndRightCpu"
  input_stream: "IMAGE:image"
  input_stream: "POSE_LANDMARKS:pose_landmarks"
  output_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
  output_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
  + output_stream: "LEFT_HAND_WORLD_LANDMARKS:left_hand_world_landmarks"
  + output_stream: "RIGHT_HAND_WORLD_LANDMARKS:right_hand_world_landmarks"
}

1.2 hand_landmarks_left_and_right_cpu.pbtxt

# Left hand landmarks. (NormalizedLandmarkList)
output_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
+ output_stream: "LEFT_HAND_WORLD_LANDMARKS:left_hand_world_landmarks"
# Right hand landmarks. (NormalizedLandmarkList)
output_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
+ output_stream: "RIGHT_HAND_WORLD_LANDMARKS:right_hand_world_landmarks"
...
# Predicts left hand landmarks.
node {
  calculator: "HandLandmarksFromPoseCpu"
  input_stream: "IMAGE:input_video"
  input_stream: "HAND_LANDMARKS_FROM_POSE:left_hand_landmarks_from_pose"
  output_stream: "HAND_LANDMARKS:left_hand_landmarks"
  + output_stream: "HAND_WORLD_LANDMARKS:left_hand_world_landmarks"
  # Debug outputs.
  output_stream: "HAND_ROI_FROM_POSE:left_hand_roi_from_pose"
  output_stream: "HAND_ROI_FROM_RECROP:left_hand_roi_from_recrop"
  output_stream: "HAND_TRACKING_ROI:left_hand_tracking_roi"
}
...
# Extracts right-hand-related landmarks from the pose landmarks.
node {
  calculator: "HandLandmarksFromPoseCpu"
  input_stream: "IMAGE:input_video"
  input_stream: "HAND_LANDMARKS_FROM_POSE:right_hand_landmarks_from_pose"
  output_stream: "HAND_LANDMARKS:right_hand_landmarks"
  + output_stream: "HAND_WORLD_LANDMARKS:right_hand_world_landmarks"
  # Debug outputs.
  output_stream: "HAND_ROI_FROM_POSE:right_hand_roi_from_pose"
  output_stream: "HAND_ROI_FROM_RECROP:right_hand_roi_from_recrop"
  output_stream: "HAND_TRACKING_ROI:right_hand_tracking_roi"
}

1.3 hand_landmarks_from_pose_cpu.pbtxt

# Hand landmarks. (NormalizedLandmarkList)
output_stream: "HAND_LANDMARKS:hand_landmarks"
+ output_stream: "HAND_WORLD_LANDMARKS:hand_world_landmarks"
...
# Predicts hand landmarks from the tracking rectangle.
node {
  calculator: "HandLandmarkCpu"
  input_stream: "IMAGE:input_video"
  input_stream: "ROI:hand_tracking_roi"
  output_stream: "LANDMARKS:hand_landmarks"
  + output_stream: "WORLD_LANDMARKS:hand_world_landmarks"
}

2. Then build the Python package (https://google.github.io/mediapipe/getting_started/python.html#building-mediapipe-python-package).

2.1 Bazel installation: I also ran into a problem here and solved it (https://github.com/bazelbuild/bazelisk/issues/99#issuecomment-1352641435).

2.2 Then build the package.

3. After the build:

3.1 I got an error (ERROR: import mediapipe.tasks.python as tasks) when running import mediapipe as mp. The __init__.py (./python_env/mediapipe/lib/python3.8/site-packages/mediapipe/__init__.py) contained code like the following (note the same block repeated several times):

# Copyright 2019 - 2022 The MediaPipe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from mediapipe.python import *
import mediapipe.python.solutions as solutions 
import mediapipe.tasks.python as tasks

del framework
del gpu
del modules
del python
del mediapipe
del util
__version__ = 'dev'

from mediapipe.python import *
import mediapipe.python.solutions as solutions 
import mediapipe.tasks.python as tasks

del framework
del gpu
del modules
del python
del mediapipe
del util
__version__ = 'dev'

from mediapipe.python import *
import mediapipe.python.solutions as solutions 
import mediapipe.tasks.python as tasks

del framework
del gpu
del modules
del python
del mediapipe
del util
__version__ = 'dev'

from mediapipe.python import *
import mediapipe.python.solutions as solutions 
import mediapipe.tasks.python as tasks

del framework
del gpu
del modules
del python
del mediapipe
del util
__version__ = 'dev'

from mediapipe.python import *
import mediapipe.python.solutions as solutions 
import mediapipe.tasks.python as tasks

del framework
del gpu
del modules
del python
del mediapipe
del util
__version__ = 'dev'

from mediapipe.python import *
import mediapipe.python.solutions as solutions 
import mediapipe.tasks.python as tasks

del framework
del gpu
del modules
del python
del mediapipe
del util
__version__ = 'dev'

After comparing with the official __init__.py, I directly deleted these lines:

import mediapipe.tasks.python as tasks

del framework
del gpu
del modules
del python
del mediapipe
del util
__version__ = 'dev'

It works fine after that.

3.2 We can see that the size of holistic_landmark_cpu.binarypb (./python_env/mediapipe/lib/python3.8/site-packages/mediapipe/modules/holistic_landmark) grew from 1370 bytes to 1586 bytes (in fact, I'm not familiar with the .binarypb format either). Then I modified lines 131-135 of ./python_env/mediapipe/lib/python3.8/site-packages/mediapipe/python/solutions/holistic.py, adding the hand world landmarks to the outputs list:

        outputs=[
            'pose_landmarks', 'pose_world_landmarks', 'left_hand_landmarks',
            'right_hand_landmarks', 'left_hand_world_landmarks',
            'right_hand_world_landmarks', 'face_landmarks', 'segmentation_mask'
        ])

Then Holistic can output the hand world landmarks.
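A rough usage sketch with the rebuilt wheel installed (the attribute names simply mirror the outputs list above; the rest of the API is the standard legacy Holistic solution, and person.jpg is just a placeholder path):

import cv2
import mediapipe as mp

image = cv2.cvtColor(cv2.imread("person.jpg"), cv2.COLOR_BGR2RGB)

with mp.solutions.holistic.Holistic(static_image_mode=True) as holistic:
    results = holistic.process(image)

# Standard output: hip-centered pose world landmarks.
pose_world = results.pose_world_landmarks

# New outputs exposed by the modified graph (hand-centered, in meters).
left_hand_world = results.left_hand_world_landmarks
right_hand_world = results.right_hand_world_landmarks

if left_hand_world:
    wrist = left_hand_world.landmark[0]
    print("left wrist (hand frame, meters):", wrist.x, wrist.y, wrist.z)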

JpEncausse commented 1 year ago

Hello, I gave Kalidokit / Holistic pose a try with MediaPipe, Three.js, and a Ready Player Me avatar, but the avatar comes out all messed up even though I follow the script.js code.

Demo: https://jsfiddle.net/nxg5bp2h/1/

Stack Overflow: https://stackoverflow.com/questions/74852177/avatar-pupettering-with-threejs-readyplayerme-kalidokit-and-mediapipe

I also tried some other code/libraries, but I fail at setting the correct values on the avatar. Can someone point me in the right direction?

JpEncausse commented 1 year ago

According to Kalidokit, there is some work directly in MediaPipe for VRM, Mixamo, and Ready Player Me models. Any chance to see that demo and the JS code? The most difficult part is getting/understanding the correct rotations/positions for Ready Player Me (it would help me a lot to move forward).

sronen71 commented 1 year ago

I uploaded my solution to GitHub some time ago, as @xiang-zhe found and described above. I modified MediaPipe to expose the hand world coordinates and modified the Holistic Python solution to use them. https://github.com/sronen71/mediapipe-holistic https://github.com/sronen71/pose/blob/master/holistic.py

asifshaha commented 1 year ago

Hi, I am trying to map Holistic MediaPipe pose landmarks to a Ready Player Me avatar but am facing issues with angle mapping. I followed the Kalidokit approach for finding 2D and 3D angles and mapping them to the avatar mesh. Can anybody please provide input on this?

lucasjinreal commented 1 year ago

I have the same issue. Is any demo for driving a Mixamo avatar available now? It's 2023.

kuaashish commented 1 year ago

This is a legacy solution and we will not be able to fix this issue.

github-actions[bot] commented 1 year ago

This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 1 year ago

This issue was closed due to lack of activity after being marked stale for the past 7 days.

google-ml-butler[bot] commented 1 year ago

Are you satisfied with the resolution of your issue?