pkhungurn / talking-head-anime-3-demo

Demo Programs for the "Talking Head(?) Anime from a Single Image 3: Now the Body Too" Project
http://pkhungurn.github.io/talking-head-anime-3/
MIT License

Blendshapes Interface #14

Closed. skyler14 closed this issue 1 year ago

skyler14 commented 1 year ago

This is a pretty cool project. I was just wondering if you could describe the interface for the motion capture a bit, for people who want to build off of it. I'm playing around with other ways of capturing 3D data, such as MediaPipe via https://github.com/yeemachine/kalidokit, so I can do the parsing on desktop.

Since the render pipeline is taking blendshapes, is it operating on Live2D-like standards? I've looked at ifacialmocap_puppeteer.py and ifacialmocap_poser_converter_25 a bit, and just wasn't quite understanding what the pose data you get from iFacialMocap ends up looking like. At what method can I swap in raw blendshape data from my own local service so that everything operates normally? Are there any other auxiliary things from iFacialMocap that I also need to include for the GUI to work?

Also where in the code are the various generated neural network outputs layered on top of each other for avatar creation?

pkhungurn commented 1 year ago

Thank you for your interest in the project.

Q: Since the render pipeline is taking blendshapes, is it operating on Live2D-like standards? A: No. It takes the blendshape parameters produced by Apple's ARKit blendShapes API. https://developer.apple.com/documentation/arkit/arfaceanchor/2928251-blendshapes
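
For concreteness, a minimal sketch (Python, illustrative only) of what that data amounts to: a dictionary of ARKit blendshape coefficients, each a float in [0, 1]. The keys below are a small subset of the roughly 52 coefficients ARKit defines; the values are made up.

```python
# Illustrative only: a handful of Apple ARKit blendshape coefficients of the
# kind iFacialMocap streams to the demo. Each value is a float in [0, 1].
example_blendshapes = {
    "eyeBlinkLeft": 0.05,
    "eyeBlinkRight": 0.04,
    "browInnerUp": 0.30,
    "jawOpen": 0.62,
    "mouthSmileLeft": 0.10,
    "mouthSmileRight": 0.12,
}
```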

Q: "just wasn't quite understanding how what the pose data you get from ifacial ends up looking like." A: See https://github.com/pkhungurn/talking-head-anime-3-demo/blob/main/tha3/mocap/ifacialmocap_v2.py.

Q: At what method can I swap in raw blendshape data from my own local service so that everything operates normally? A: The UI programs rely on an instance of the GeneralPoser02 class (https://github.com/pkhungurn/talking-head-anime-3-demo/blob/main/tha3/poser/general_poser_02.py#L11), which has a method called "pose" (https://github.com/pkhungurn/talking-head-anime-3-demo/blob/main/tha3/poser/general_poser_02.py#L58) that takes in an image, a pose vector, and an output index. This is the main method that invokes the neural networks.
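
As a rough sketch of what calling it might look like, assuming you build the pose vector yourself instead of going through iFacialMocap. Only pose(image, pose, output_index) is confirmed above; the load_poser helper, its module path, the get_num_parameters method name, and the 512x512 RGBA input shape are assumptions, so check the repo before relying on them.

```python
import torch

from tha3.poser.modes.load_poser import load_poser  # module path and helper name assumed

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
poser = load_poser("separable_float", device)  # model name assumed

# The character image: a batched 4-channel (RGBA) float tensor, prepared the
# same way the demo programs prepare it (512x512 assumed here).
image = torch.zeros(1, 4, 512, 512, device=device)

# The pose vector: one float per pose parameter. Fill this from your own
# blendshape source; backtrack from manual_poser.py to see what each slot
# means. get_num_parameters() is an assumed method name.
pose = torch.zeros(1, poser.get_num_parameters(), device=device)

# The call confirmed above: image + pose vector + output index -> posed image.
output_image = poser.pose(image, pose, 0)
```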

This is where it is used in ifacialmocap_puppeteer: https://github.com/pkhungurn/talking-head-anime-3-demo/blob/main/tha3/app/ifacialmocap_puppeteer.py#L327, and this is where it is used in the manual_poser: https://github.com/pkhungurn/talking-head-anime-3-demo/blob/main/tha3/app/manual_poser.py#L387

From the manual_poser, you can backtrack and see what the pose vector looks like.

From the ifacialmocap_puppeteer, you can backtrack to see how a pose vector is derived from inputs from iFacialMocap. This of course uses IFacialMocapPoserConverter25 (https://github.com/pkhungurn/talking-head-anime-3-demo/blob/main/tha3/mocap/ifacialmocap_poser_converter_25.py#L82) at some point.

Q: Also where in the code are the various generated neural network outputs layered on top of each other for avatar creation? A: See https://github.com/pkhungurn/talking-head-anime-3-demo/blob/main/tha3/poser/modes/separable_float.py#L51

skyler14 commented 1 year ago

Thanks for the help. If you'd enable Discussions for this repo and move this thread there, I think useful material will accumulate as people come across the project and contribute, and I'll probably continue to post questions and results down the line.