Is the inference input audio?

JeremyCJM / DiffSHEG

[CVPR'24] DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation

https://jeremycjm.github.io/proj/DiffSHEG/

BSD 3-Clause "New" or "Revised" License


Is the inference input audio? #9

Closed. beibidesr closed this issue 3 months ago

beibidesr commented 3 months ago

Can this task only run inference from speech? If so, why does the demo show a person talking? Shouldn't an image also be required as input?

JeremyCJM commented 3 months ago

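Yes, the task takes speech as input and outputs the corresponding expression and gesture. Here, expression and gesture are only motion data (blendshape weights and joint rotations). To get a character animation, you need to use the generated motion to drive or render the digital avatar of your choice. See our paper for details :)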
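To illustrate what "driving an avatar with motion data" can look like, here is a minimal sketch. It is not taken from the DiffSHEG code. The frame layout, the names `apply_blendshapes` and `rotate_axis_angle`, the toy two-vertex face, and the axis-angle encoding of joint rotations are all assumptions made for illustration; the actual output format may differ.

```python
import math

def apply_blendshapes(neutral, deltas, weights):
    """Linear blendshape model: vertex = neutral + sum_i weights[i] * deltas[i]."""
    out = []
    for v_idx, vertex in enumerate(neutral):
        moved = list(vertex)
        for w, delta in zip(weights, deltas):
            for k in range(3):
                moved[k] += w * delta[v_idx][k]
        out.append(tuple(moved))
    return out

def rotate_axis_angle(point, axis, angle):
    """Rotate a 3D point about a unit axis by `angle` radians (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * p for a, p in zip(axis, point))
    cross = (
        axis[1] * point[2] - axis[2] * point[1],
        axis[2] * point[0] - axis[0] * point[2],
        axis[0] * point[1] - axis[1] * point[0],
    )
    return tuple(p * c + cr * s + a * dot * (1 - c)
                 for p, cr, a in zip(point, cross, axis))

# Hypothetical single frame of generated motion (layout is an assumption).
frame = {
    "blendshape_weights": [0.5, 0.0],  # e.g. "jaw open", "smile"
    "joint_rotations": {"elbow": ((0.0, 0.0, 1.0), math.pi / 2)},
}

# Toy avatar: a two-vertex "face" and one per-vertex offset list per blendshape.
neutral_face = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deltas = [
    [(0.0, -0.2, 0.0), (0.0, 0.0, 0.0)],  # "jaw open" moves vertex 0 down
    [(0.0, 0.0, 0.0), (0.1, 0.1, 0.0)],   # "smile" moves vertex 1
]

# Expression: deform the face mesh with the frame's blendshape weights.
face = apply_blendshapes(neutral_face, deltas, frame["blendshape_weights"])

# Gesture: rotate a wrist position around the elbow joint at the origin.
axis, angle = frame["joint_rotations"]["elbow"]
wrist = rotate_axis_angle((1.0, 0.0, 0.0), axis, angle)
```

The point is that the same motion frames can be applied to any avatar that exposes matching blendshapes and joints, which is why no image of a person is needed as input.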

beibidesr commented 3 months ago

Thanks a lot, got it.