What are the input formats for the three conditions of the denoising module?

XingliangJin / MCM-LDM

[CVPR 2024] Arbitrary Motion Style Transfer with Multi-condition Motion Latent Diffusion Model


What are the input formats for the three conditions of the denoising module? #7

Open mengW6 opened 1 month ago

mengW6 commented 1 month ago

Hello, this is very good work. I have a question that I hope you can answer. What are the input formats for the three conditions (content, style, trajectory) of the denoising module? Are they entered in text format?

XingliangJin commented 1 month ago

Thank you for asking.

The inputs for these three conditions are encoded features (content: (bs, 6, 256), style: (bs, 1, 256), trajectory: (bs, 1, 256)) extracted from the content motion and the style motion. We use this diagram to represent the main information contained in these features. We apologize for the misunderstanding.

Please feel free to ask if you have any other questions.
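For anyone preparing their own data, the shapes quoted in the reply above can be checked with a small helper before calling the denoiser. This is only a sketch: `expected_shapes`, `check_conditions`, and the constant names are hypothetical and not part of the MCM-LDM code base. The only facts taken from the thread are the three shapes, with `bs` as the batch size.

```python
from typing import Any, Dict, Tuple

# Shapes quoted in the reply above; these names are illustrative only.
LATENT_DIM = 256
TOKENS_PER_CONDITION = {"content": 6, "style": 1, "trajectory": 1}


def expected_shapes(bs: int) -> Dict[str, Tuple[int, int, int]]:
    """Return the (bs, tokens, dim) shape expected for each condition."""
    return {name: (bs, n, LATENT_DIM) for name, n in TOKENS_PER_CONDITION.items()}


def check_conditions(conditions: Dict[str, Any], bs: int) -> None:
    """Raise ValueError if a condition is missing or has the wrong shape.

    Each value only needs a `.shape` attribute, so NumPy arrays and
    PyTorch tensors both work.
    """
    for name, want in expected_shapes(bs).items():
        if name not in conditions:
            raise ValueError(f"missing condition: {name}")
        got = tuple(conditions[name].shape)
        if got != want:
            raise ValueError(f"{name}: expected shape {want}, got {got}")
```

Calling `check_conditions({"content": c, "style": s, "trajectory": t}, bs)` on the encoded features would then fail early with a readable message if, for example, the style feature has more than one token.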

mengW6 commented 1 month ago

Thank you for your reply! If the content is replaced with a dance sequence and the style is changed to various dance styles such as ethnic dance, hip-hop, etc., would that meet the input requirements of the denoising module?

XingliangJin commented 1 month ago

I haven't done similar experiments on dance motions. I think you can try some examples : )