ChenFengYe / motion-latent-diffusion

[CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space, a fast and high-quality motion diffusion model
https://chenxin.tech/mld/
MIT License
565 stars 50 forks

Visualization and understanding of latent space #12

Open ChenFengYe opened 1 year ago

ChenFengYe commented 1 year ago

Related to #7. From mmdrahmani:

I also have another basic question. I would like to understand the latent space of the VAE and what the model has learned. I assume that if we could visualize the latent space, different actions would cluster in different regions. For example, the attached figure shows my analysis of the latent space of a simple VAE trained on MNIST, where the 10 digits are clearly clustered. I hope this kind of analysis is possible with the MLD VAE. (Maybe I should open a new issue?)

image

ChenFengYe commented 1 year ago

We think the motion VAE can extract the principal features of motion, much as PCA does. We therefore tried to visualize the motion latent space, as shown below. We will add this part to our arXiv paper and the camera-ready version soon.

From left to right, the figure shows how the latent codes evolve during the inference of the diffusion model. However, this is only an illustrative visualization of latent diffusion, using just 30 motions. We also tried visualizing many more motion results (300 or 3000). Unfortunately, with that many motions the clusters are not nearly as clearly separated.

image
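For readers who want to try this kind of plot before looking at the released script, here is a minimal sketch of the general idea. It is not the authors' visualization code. It assumes you already have the latent codes as a NumPy array of shape (num_motions, latent_dim) at several denoising steps, plus one action label per motion. The data below is synthetic, the latent size of 16 is arbitrary, and the helper names (fit_pca, project, separation) are made up for this example. PCA is used as a simple stand-in for whatever 2D projection you prefer.

```python
import numpy as np


def fit_pca(x, n_components=2):
    """Fit PCA on the rows of x; return the mean and the top principal directions."""
    mean = x.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(x, mean, components):
    """Project the rows of x onto the fitted principal directions."""
    return (x - mean) @ components.T


def separation(points, labels):
    """Mean distance between class centroids divided by mean within-class spread."""
    classes = np.unique(labels)
    centroids = np.stack([points[labels == c].mean(axis=0) for c in classes])
    within = np.mean([
        np.linalg.norm(points[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])
    pairwise = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    between = pairwise[np.triu_indices(len(classes), k=1)].mean()
    return between / within


# Synthetic stand-in data (assumption): 3 action classes, 10 motions each.
rng = np.random.default_rng(0)
n_classes, per_class, latent_dim = 3, 10, 16
labels = np.repeat(np.arange(n_classes), per_class)
centers = rng.normal(scale=5.0, size=(n_classes, latent_dim))
final_latents = centers[labels] + rng.normal(size=(len(labels), latent_dim))
noise = rng.normal(size=final_latents.shape)

# Toy "denoising trajectory": a linear blend from pure noise to the final latents.
# In practice, replace this list with the latents saved at each sampling step.
blend = [0.0, 0.5, 1.0]
trajectory = [(1.0 - a) * noise + a * final_latents for a in blend]

# Fit the projection once on the final latents so every step shares the same axes.
mean, components = fit_pca(final_latents)
projected = [project(z, mean, components) for z in trajectory]

for a, pts in zip(blend, projected):
    print(f"blend={a:.1f}  separation={separation(pts, labels):.2f}")
```

Each array in projected has shape (num_motions, 2) and can be drawn as a scatter plot colored by label, one panel per step. The separation score is only a rough way to check whether the clusters stay distinct as more motions are added.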

mmdrahmani commented 1 year ago

This is amazing! Thank you for sharing such valuable results here! If you can get me started on this (for example, how you technically extracted these features), I would be happy to contribute!

ChenFengYe commented 1 year ago

Of course, we will upload the source code for this visualization soon.

mmdrahmani commented 1 year ago

Thanks a lot!

ChenFengYe commented 1 year ago

Hi mmdrahmani, we have uploaded the script for this latent visualization. You can find the details in the FAQ on the GitHub page.