Miru302 closed this issue 1 year ago.
I'm closing this issue.
My problem was that I didn't reset the seeds for numpy and torch after each iteration. After fixing that, I could see the distinction between styles in PCA space more clearly.
Also, computing 64 principal components helped stabilize style changes when modifying a single principal component.
Interestingly, PC1 indeed correlated with hip sway, but it also added more hand movements, while PC2 only added more short hand gestures (I was using Still_0 and Happy_0 to see the difference). In any case, it is in fact possible to modify style embeddings; the changes are not very controllable, but it's something, which makes me rather happy.
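For anyone wanting to try it, a minimal sketch of that manipulation (`style_embeddings.npy` is a hypothetical dump of the encoder outputs, one row per `.bvh` file; the offset along PC1 is arbitrary):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical dump of the style encoder outputs, shape (134, D).
style_embeddings = np.load("style_embeddings.npy")

pca = PCA(n_components=64)
coords = pca.fit_transform(style_embeddings)

# Nudge a single principal component for one style and map the
# point back into embedding space.
edited = coords[0].copy()
edited[0] += 2.0  # arbitrary offset along PC1 only
new_embedding = pca.inverse_transform(edited[None, :])
```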
Thank you for sharing your great work!
Hi, I am also studying this work now. What do you think about fixing the seed when training the VAE network? Does it help to reproduce the results? Does it hurt or benefit the training and inference process?
Hi @MengHao666, I did not train any networks; I've been running inference with the provided models so I could do an apples-to-apples comparison with the results described in the paper.
I've been using the v1 `style_encoder` model to get style vectors from animations, with the default parameters as in `generate.py`.
When I ran the encoder in a loop, I noticed it gave me different results than when I ran it one by one (except for the very first result). I concluded that the random seed stream was advancing after each iteration. Resetting it back to 1234 after each iteration let me reproduce vectors identical to the one-by-one method. I think it also helped to somewhat isolate randomness from the results, which is why the PCA graph was more clustered.
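Roughly, the fix looked like this (`bvh_files` and `encode_style` are stand-ins for the actual file list and encoder call):

```python
import numpy as np
import torch

SEED = 1234

results = []
for bvh_file in bvh_files:  # stand-in for the list of animation paths
    # Reset both RNG streams so every iteration starts from the same
    # random state as a standalone run.
    np.random.seed(SEED)
    torch.manual_seed(SEED)
    results.append(encode_style(bvh_file))  # stand-in for the encoder call

# Sanity check against a vector computed one at a time:
np.random.seed(SEED)
torch.manual_seed(SEED)
assert np.allclose(results[0], encode_style(bvh_files[0]))
```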
Hope this helps :)
Does it always produce promising but different results when the random seed is not fixed?
It produces slightly different results if the seed is not fixed, but the style was generally correct (I tested by generating animations from those style embedding arrays, and the vectors were in proximity to each other).
In the paper, it says it's possible to achieve a level of meaningful control over the final animation style via PCA of the style embeddings. Here is what I tried:
I've computed style encodings for each animation in the ZEGGS dataset (134 `.bvh` files in the `./data/clean` folder), using the original code with seed 1234. I then computed a PCA with `n_components=2` on the resulting dataset using `sklearn.decomposition.PCA` and `pca.fit_transform`.

While plotting the results I noticed that the clustering of points, while still noticeable, was not as pronounced as in the paper. (I also noticed that PC1 and PC2 captured merely 0.08 of the variance combined, hinting that the encoder is adept at spreading information evenly.)

Converting an arbitrary point from PCA space back into a style embedding gave me mixed results. Sometimes it did resemble nearby points, but sometimes it was noticeably different, even for the same point from the dataset. I'm guessing that the loss of information during dimensionality reduction was too big to reliably reconstruct the embedding arrays.
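For reference, a minimal sketch of that pipeline (assuming the encodings were dumped to a hypothetical `style_embeddings.npy`; the file name and the point coordinates are placeholders):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical dump of the 134 encoder outputs, one row per .bvh file.
style_embeddings = np.load("style_embeddings.npy")

pca = PCA(n_components=2)
coords = pca.fit_transform(style_embeddings)

# How much variance the 2D projection keeps (I saw ~0.08 combined).
print(pca.explained_variance_ratio_.sum())

# Map an arbitrary PCA-space point back into embedding space.
point = np.array([[1.5, -0.5]])  # placeholder coordinates
reconstructed = pca.inverse_transform(point)
```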
I am wondering if my method is wrong. Would it be possible to get more details on how control via PCA was achieved in the original paper?