choijeongsoo / lip2speech-unit

[Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units

Loss and Loss weight #3

Open blackbird-fish opened 2 months ago

blackbird-fish commented 2 months ago

Thanks for your great work. I am confused about how the cross-entropy loss is reduced: sum or mean? Could you also tell me how you chose the weights for the mel loss and the unit cross-entropy loss?

choijeongsoo commented 2 months ago

If I remember correctly, the method is basically 'sum'.

We experimented with several weights for the mel loss and found that the weight did not significantly affect performance. We set the weight to 10 for the mel loss and 1 for the unit loss to balance the scale of the two losses.
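For readers following along, here is a minimal sketch of how the 'sum' reduction and the 10/1 weighting described above could fit together. It is not the repository's training code: the helper names (`cross_entropy`, `l1_loss`, `total_loss`) are hypothetical, and the use of an L1 distance for the mel loss is an assumption, since the thread only says "mel loss".

```python
import math

MEL_WEIGHT = 10.0   # weight quoted in the thread for the mel loss
UNIT_WEIGHT = 1.0   # weight quoted in the thread for the unit loss


def cross_entropy(logits, targets, reduction="sum"):
    """Cross entropy over frames; logits is a list of per-frame score lists."""
    losses = []
    for frame_logits, target in zip(logits, targets):
        # Numerically stable log-sum-exp for the softmax normaliser.
        m = max(frame_logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in frame_logits))
        losses.append(log_z - frame_logits[target])
    total = sum(losses)
    return total if reduction == "sum" else total / len(losses)


def l1_loss(pred, target, reduction="sum"):
    """L1 distance between two mel spectrograms (assumed form of the mel loss)."""
    diffs = [abs(p - t)
             for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    total = sum(diffs)
    return total if reduction == "sum" else total / len(diffs)


def total_loss(mel_pred, mel_target, unit_logits, unit_targets):
    """Weighted sum of the two losses, both with 'sum' reduction."""
    mel = l1_loss(mel_pred, mel_target, reduction="sum")
    unit = cross_entropy(unit_logits, unit_targets, reduction="sum")
    return MEL_WEIGHT * mel + UNIT_WEIGHT * unit
```

With 'sum' reduction each loss grows with the number of elements it covers (frames times mel bins for the mel term, frames for the unit term), which is one reason the relative scale of the two terms may need balancing with a weight.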

blackbird-fish commented 2 months ago

Thanks for your answer. I am also unclear about how the unit prediction is trained: with an autoregressive (AR) or a non-autoregressive (NAR) method? I found that many language models use the AR method, but I guess your paper's method is NAR.

choijeongsoo commented 2 months ago

That's right, we used the NAR method. However, the generation code is a little confusing.
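To make the AR/NAR distinction in this exchange concrete, here is a toy sketch of the two decoding styles. It is only an illustration and does not reproduce the repository's generation code: `decode_nar`, `decode_ar`, and `step_fn` are hypothetical names, and greedy argmax decoding is an assumption.

```python
def argmax(scores):
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def decode_nar(frame_logits):
    """Non-autoregressive: every frame's unit is picked independently,
    so all frames could be decoded in parallel."""
    return [argmax(scores) for scores in frame_logits]


def decode_ar(step_fn, num_frames, bos=0):
    """Autoregressive: the scores at step t depend on the unit emitted
    at step t - 1, so decoding has to run sequentially."""
    units = []
    prev = bos
    for t in range(num_frames):
        prev = argmax(step_fn(t, prev))
        units.append(prev)
    return units


# NAR: the logits for all frames are available at once.
nar_units = decode_nar([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])

# AR: a toy step function that always prefers "previous unit + 1" (mod 3).
toy_step = lambda t, prev: [1.0 if i == (prev + 1) % 3 else 0.0 for i in range(3)]
ar_units = decode_ar(toy_step, 4)
```

In the NAR case the per-frame cross-entropy losses can be computed for all frames in one pass, which matches the frame-wise 'sum' reduction discussed earlier in the thread.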