YuanxunLu / LiveSpeechPortraits

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)
MIT License

Two questions about `smooth_loss` in `audio2headpose_model` #54

Closed · DreamtaleCore closed this 2 years ago

DreamtaleCore commented 2 years ago

Hi, I'm trying to reproduce the training of the audio2headpose module. I have two questions about the implementation.

  1. Does `mu_gen=Sample_GMM ...` (line 103) in `audio2headpose_model` benefit performance? I also found "We also tried with a Gaussian Mixture Model but found no obvious improvement" in the paper, but I am a little confused: are these the same thing? It seems `Sample_GMM` is the implementation of Eq. (8) (please correct me if I am wrong).
  2. The computational efficiency of `Sample_GMM` is rather low. When it is enabled (by setting `smooth_loss` > 0), one epoch takes ~2 hours. I see that it contains many for-loops (line 99) and CPU operations. Are there any alternatives?
YuanxunLu commented 2 years ago

Training the audio2headpose module with the smooth loss was one of my earlier experiments. In the end I didn't use it; I only use the probabilistic loss described in the paper. As I remember, the smooth loss brought no obvious improvement, so it was deprecated.

Sorry that I didn't fully clean up the training-related code; I can see how it confused your training.

The Gaussian Mixture Model loss is just a multi-Gaussian version of the loss. I use only one Gaussian, so it degrades to a single Gaussian distribution; I describe this in the code comments of the GMM loss function.
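
For reference, here is a minimal sketch (not the repository's actual code) of how a GMM negative log-likelihood collapses to a plain Gaussian NLL when only one component is used; the names `pi`, `mu`, and `sigma` are illustrative assumptions for the network outputs:

```python
import torch

def gmm_nll(pi, mu, sigma, target):
    """Negative log-likelihood of `target` under a diagonal GMM.

    pi:     (B, K)    mixture weights (softmax-normalized)
    mu:     (B, K, D) component means
    sigma:  (B, K, D) component standard deviations (positive)
    target: (B, D)    ground-truth values
    """
    dist = torch.distributions.Normal(mu, sigma)            # per-dimension Gaussians
    log_prob = dist.log_prob(target.unsqueeze(1)).sum(-1)   # (B, K) joint log-density
    log_mix = torch.logsumexp(torch.log(pi) + log_prob, dim=1)
    return -log_mix.mean()

# With K = 1, pi is all ones and log(pi) = 0, so the loss reduces to the
# single-Gaussian probabilistic loss described in the paper.
```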

I don't know of any alternative to this loss; this is simply my implementation. You can look around online or write your own version to speed it up.
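
If it helps, one common way to avoid per-timestep Python loops is to sample all frames in a single batched call on the GPU. A hedged sketch under the same assumed `pi`/`mu`/`sigma` outputs, not the repository's code:

```python
import torch

def sample_gmm_batched(pi, mu, sigma):
    """Draw one sample per row from a diagonal GMM, fully vectorized.

    pi:    (N, K)    mixture weights
    mu:    (N, K, D) component means
    sigma: (N, K, D) component standard deviations
    returns: (N, D) samples
    """
    # Pick a mixture component per row in one call instead of a Python loop.
    comp = torch.multinomial(pi, num_samples=1)             # (N, 1)
    idx = comp.unsqueeze(-1).expand(-1, -1, mu.size(-1))    # (N, 1, D)
    mu_sel = torch.gather(mu, 1, idx).squeeze(1)            # (N, D)
    sigma_sel = torch.gather(sigma, 1, idx).squeeze(1)      # (N, D)
    # Reparameterized draw: mu + sigma * eps, with eps ~ N(0, I).
    return mu_sel + sigma_sel * torch.randn_like(sigma_sel)
```

With K = 1 the `multinomial`/`gather` steps are trivial, so sampling is just `mu + sigma * torch.randn_like(sigma)` over the whole sequence at once, which removes the per-frame loop entirely.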

DreamtaleCore commented 2 years ago

Thanks!