open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License

[BUG]: TTA ldm training loss #162

Status: Closed (Sreyan88 closed this issue 5 months ago)

Sreyan88 commented 5 months ago

Describe the bug

The training loss for the latent diffusion model (LDM) used in text-to-audio (TTA) is calculated between the model output and pure noise. Is this right? I think the loss should be calculated against the ground-truth latent.
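For context, here is a minimal sketch of the two targets in question, assuming a standard DDPM-style setup. Nothing below is taken from Amphion's code: the function names, the fixed `alpha_bar_t`, and the fake "model prediction" are all hypothetical. If the network is parametrized to predict the noise that was added to the latent, then an MSE against the sampled noise is the expected objective. Under that assumption it equals an MSE against the ground-truth latent up to a timestep-dependent weight.

```python
import math
import random


def add_noise(x0, noise, alpha_bar_t):
    """Forward diffusion step: x_t = sqrt(a) * x0 + sqrt(1 - a) * noise."""
    a = math.sqrt(alpha_bar_t)
    b = math.sqrt(1.0 - alpha_bar_t)
    return [a * x + b * n for x, n in zip(x0, noise)]


def x0_from_eps(x_t, eps, alpha_bar_t):
    """Invert the forward step to get the latent implied by a noise estimate."""
    a = math.sqrt(alpha_bar_t)
    b = math.sqrt(1.0 - alpha_bar_t)
    return [(xt - b * e) / a for xt, e in zip(x_t, eps)]


def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(8)]      # stand-in for a ground-truth latent
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]   # the "pure noise" sampled for training
alpha_bar_t = 0.8                                 # hypothetical cumulative schedule value

x_t = add_noise(x0, noise, alpha_bar_t)

# Hypothetical model output: the true noise plus a constant error of 0.1.
eps_pred = [n + 0.1 for n in noise]

# Noise-prediction loss: model output compared with the sampled noise.
loss_eps = mse(eps_pred, noise)

# Latent-prediction loss: the implied latent compared with the ground truth.
x0_pred = x0_from_eps(x_t, eps_pred, alpha_bar_t)
loss_x0 = mse(x0_pred, x0)

# The two losses differ only by the factor (1 - alpha_bar_t) / alpha_bar_t.
weight = (1.0 - alpha_bar_t) / alpha_bar_t
print(loss_eps, loss_x0, weight * loss_eps)
```

Whether this applies to the TTA recipe depends on how the model output is parametrized in the repository, which this sketch does not verify.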

How To Reproduce

The same steps as mentioned in the repo for TTA.

Expected behavior

Mentioned above.

Screenshots

N/A

Environment Information

The same steps as mentioned in the repo for TTA.

Additional context

N/A