geaxgx / depthai_handface

Running Google Mediapipe Face Mesh and Hand Tracking models on Luxonis DepthAI devices

Output skeleton to 3D format #1

Open felipemeres opened 2 years ago

felipemeres commented 2 years ago

Thanks for all of the amazing work on DepthAI applications! I'm trying to use my OAK-D Pro to make animated skeletons that I can use to animate characters, but I haven't been able to figure out a way to write the captured location data from your scripts into a format that I could use in a 3D program.

The closest I got was with your Blender implementation, but when the capture process is running it doesn't seem to update the joint transformations in real time, which prevents me from recording the animation as keyframes.

I would really appreciate any advice you may have on how I could get the location data into a 3D format, or even into a CSV that I could then translate into an animated skeleton in either a .bvh or .fbx file.
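To give an idea of how simple the CSV route could be, this is roughly what I have in mind; the iter_frame_landmarks generator is only a placeholder for however your tracker exposes the per-frame landmarks, since I haven't figured out the right attribute names yet:

```python
import csv

# Placeholder generator: each iteration should yield one frame's world landmarks
# as a list of (x, y, z) tuples. The real attribute/method names in
# depthai_handface are likely different; this is only a guess at the shape.
def iter_frame_landmarks():
    yield from []  # replace with the actual tracker loop

with open("landmarks.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["frame", "landmark_id", "x", "y", "z"])
    for frame_idx, landmarks in enumerate(iter_frame_landmarks()):
        for lm_id, (x, y, z) in enumerate(landmarks):
            writer.writerow([frame_idx, lm_id, x, y, z])
```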

I found this Blender plugin that does a great job at recording mocap data from a webcam but I'm sure your scripts that make use of the OAK cameras would produce much better results. https://github.com/cgtinker/BlendArMocap

geaxgx commented 2 years ago

Unfortunately, I don't know Blender or other 3D programs well enough to help you. I guess you need a one-to-one relationship between the bones of the Mediapipe skeleton and the bones of the Blender rig. But on one side we have the locations of the bone extremities (the landmarks), and we need to translate these into relative bone rotations, which does not seem like an easy task to me. The BlendArMocap plugin looks really great. If I were working on a project like yours, I would study how this plugin works. As it also relies on Mediapipe, the results should be as good or even better (body pose estimation gives more accurate results on the CPU than on the MyriadX).
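To make the difficulty a bit more concrete, a naive version of that translation, using Blender's mathutils and completely ignoring the bone hierarchy, the rest roll and the parent rotations, would look something like the sketch below. It is precisely everything this sketch ignores that makes the task hard:

```python
from mathutils import Vector

def bone_rotation_from_landmarks(parent_pos, child_pos, rest_direction):
    """Quaternion that rotates the bone's rest direction onto the measured
    landmark direction. parent_pos and child_pos are the two landmark
    positions (the bone extremities); rest_direction is the bone's direction
    in rest pose. Parent rotations, roll, constraints, etc. are ignored."""
    measured_dir = (Vector(child_pos) - Vector(parent_pos)).normalized()
    return Vector(rest_direction).normalized().rotation_difference(measured_dir)

# e.g. an elbow-to-wrist bone whose rest pose points along +Y:
q = bone_rotation_from_landmarks((0.0, 0.0, 0.0), (0.1, 0.5, 0.2), (0.0, 1.0, 0.0))
```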

felipemeres commented 2 years ago

Thank you for the response. The BlendArMocap plugin is really great!

I think your Blender script is very close to being able to write a skeleton file, since the only missing piece is the ability to record the data that is already being applied to the Vincent rig as keyframes. The OpenCV script from the video you mention in the script's README enables recording the data being piped into the rig (you can see it demonstrated in this part of the video). I've spent the past couple of days comparing that script to yours to identify what tweak is needed to enable that, but so far I haven't had much success. Once that is figured out, it would be very easy to export the animated skeleton from Blender and use it as mocap data in other programs and for different character rigs.

geaxgx commented 2 years ago

Sorry, it is not clear to me. The OpenCV script you mentioned calls keyframe_insert to record the animation: https://github.com/jkirsons/FacialMotionCapture/blob/6ead5e3eede4981b2a019833aec7dd8c04231461/OpenCVAnimOperator.py#L131

I can add similar calls in my code to do the same, but I am not sure that would record the information you want. For instance, to make Vincent smile, we rotate the proxy bone "mouth_ctrl". By acting on this one virtual bone, we indirectly act on the many bones that make up the lips. The "mouth_ctrl" bone is specific to Vincent, so if you record its keyframes, I don't see how you could apply them to other character rigs.
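If it helps, the kind of call I would add is along these lines (the object and bone names are just taken from the Vincent example, and the rotation value would come from the tracker):

```python
import bpy

# Names are placeholders from the Vincent example; adapt to the actual rig.
rig = bpy.data.objects["RIG-Vincent"]      # the armature object
bone = rig.pose.bones["mouth_ctrl"]        # the proxy bone discussed above
bone.rotation_mode = 'XYZ'                 # pose bones default to quaternions

bone.rotation_euler = (0.0, 0.0, 0.1)      # value computed from the tracker
bone.keyframe_insert(data_path="rotation_euler",
                     frame=bpy.context.scene.frame_current)
```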

felipemeres commented 2 years ago

Thank you so much for looking into this and pointing out that line of code.

Once we have the keyframes for the drivers/proxy controls, we can bake the influence they have over the entire rig into keyframes on each bone (Pose > Animation > Bake Action). With the animation of each bone baked, we can retarget it to any other skeleton by setting up relationships between the two skeletons' bones. I usually retarget in Houdini (here is a good explanation of the process if you are interested), but Blender also has some great tools for this, such as the Rokoko plugin.
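For reference, the same bake step can be done from a script with the standard operator; the frame range below is just an example, and the armature should be selected in pose mode first:

```python
import bpy

# Bake the drivers/constraints influence into plain per-bone keyframes
# (equivalent to Pose > Animation > Bake Action in the UI).
bpy.ops.nla.bake(
    frame_start=1,
    frame_end=250,            # example range, use the captured range
    only_selected=False,
    visual_keying=True,       # bake the result of constraints/drivers
    clear_constraints=True,
    use_current_action=True,
    bake_types={'POSE'},
)
```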