ubisoft / ubisoft-laforge-ZeroEGGS


Cannot reproduce style embedding finetuning using PCA method #46

Closed Miru302 closed 1 year ago

Miru302 commented 1 year ago

In the paper, it says it's possible to achieve a level of meaningful control over the final animation style by:

  1. finding the Principal Components of the set of style embeddings that the style encoder produces, and
  2. manipulating a style embedding vector along a Principal Component axis.

I've computed style encodings for each animation in the ZEGGS dataset (134 .bvh files in the ./data/clean folder, using the original code with seed 1234). I then computed PCA with n_components=2 on the resulting dataset using sklearn.decomposition.PCA and pca.fit_transform. While plotting the results I noticed that the clustering of points, while still noticeable, was not as pronounced as in the paper. (I also noticed that PC1 and PC2 captured only 0.08 of the variance combined, hinting that the encoder spreads information fairly evenly across dimensions.)
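For reference, a minimal sketch of the PCA step as I did it (the path, shapes, and variable names below are just illustrative; the embeddings themselves come from the style encoder beforehand):

```python
import numpy as np
from sklearn.decomposition import PCA

# One style-encoder output per .bvh file, stacked into a (num_files, embedding_dim)
# array -- loaded here from a hypothetical file just to keep the sketch self-contained.
style_embeddings = np.load("style_embeddings.npy")  # e.g. shape (134, 64)

pca = PCA(n_components=2)
points_2d = pca.fit_transform(style_embeddings)

# Variance captured by the first two components (~0.08 in my run)
print(pca.explained_variance_ratio_.sum())
```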

By converting an arbitrary point from PCA space back into a style embedding I've been getting mixed results. Sometimes it did resemble nearby points, but sometimes it was noticeably different, even for a point taken directly from the dataset. I'm guessing the loss of information during dimensionality reduction was too big to reliably reconstruct the embedding arrays.
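The reconstruction step was simply the PCA inverse transform; a sketch reusing the pca object fitted above (the 2D point is arbitrary):

```python
# Map an arbitrary point in PCA space back into the embedding space.
# With only 2 components, most of the original variance is discarded,
# which is presumably why the reconstructions are inconsistent.
point_2d = np.array([[1.5, -0.5]])                         # hypothetical coordinates
reconstructed_embedding = pca.inverse_transform(point_2d)  # shape (1, embedding_dim)
```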

I am wondering whether my method is wrong. Is it possible to get more details on how control via PCA was achieved in the original paper?

Miru302 commented 1 year ago

I'm closing this issue.

My problem was that I didn't reset seeds for numpy and torch after each iteration. After fixing that I could see the distinction between styles in PCA space more clearly.

Also, using 64 principal components helped stabilize style changes when modifying a single principal component.
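Roughly what that looks like (a sketch; the component index and offset are arbitrary, and style_embeddings is the same stacked array as before):

```python
from sklearn.decomposition import PCA

# Keep (nearly) all of the variance, then nudge a single principal component.
pca = PCA(n_components=64)
coords = pca.fit_transform(style_embeddings)

modified = coords[0].copy()   # start from one of the existing styles (index 0 as an example)
modified[0] += 2.0            # move along PC1 only (arbitrary offset)
new_style_embedding = pca.inverse_transform(modified[None, :])
```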

Interestingly, PC1 indeed correlated with hip sway but also added more hand movements, while PC2 only added more short hand gestures (I was using Still_0 and Happy_0 to see the difference). In any case, it is in fact possible to modify style embeddings; the changes are not very controllable, but it's something, which makes me rather happy.

Thank you for sharing your great work!

MengHao666 commented 1 year ago

> Which makes me rather happy.

Hi, I am also learning this work now. What do you think about fixing the seed when training the VAE network? Does it help to reproduce the results? Does it hurt or benefit the training and inference process?

Miru302 commented 1 year ago

Hi @MengHao666, I did not train the networks; I've been running inference with the provided models so I could do an apples-to-apples comparison with the results described in the paper.

I've been using the v1 style_encoder model to get style vectors from the animations, with the default parameters as in generate.py. When I ran the encoder in a loop I noticed it gave me different results than if I ran it one file at a time (except for the very first result). I concluded that the random seed stream was advancing after each iteration. Resetting it back to 1234 after each iteration reproduced vectors identical to the one-by-one method. I think it also helped to somewhat isolate randomness from the results, which is why the PCA graph was more clearly clustered.
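In code, the fix was just reseeding at the top of every loop iteration; a sketch (bvh_files, style_encoder, and encode_style are placeholder names, not the actual API from generate.py):

```python
import random
import numpy as np
import torch

all_style_vectors = []
for bvh_file in bvh_files:
    # Reset every RNG stream so each file sees the same random state
    # it would see when processed on its own.
    random.seed(1234)
    np.random.seed(1234)
    torch.manual_seed(1234)

    style_vec = encode_style(style_encoder, bvh_file)  # placeholder for the real encoder call
    all_style_vectors.append(style_vec)
```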

Hope that helps :)

MengHao666 commented 1 year ago

Does it still produce promising (though different) results when the random seed is not fixed?

Miru302 commented 1 year ago

It produces slightly different results if the seed is not fixed, but the style was generally correct (tested by generating animations from those style embedding arrays; the vectors were also in close proximity to each other).