open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License
4.28k stars, 364 forks

[Help]: Natural Speech 2 training issue (Data loader) #232

Open CreepJoye opened 1 week ago

CreepJoye commented 1 week ago

During training, I printed the output of the dataset, the NS2Collator, and the DataLoader separately. Only the DataLoader output looked wrong: some tensors were all zeros. The duration data is loaded by the dataset, then processed and padded by the NS2Collator for the specified batch size. In ns2_trainer.py, the NS2Collator is passed to the DataLoader as its collate_fn argument. I don't understand how the DataLoader processes the NS2Collator output, or why the DataLoader output is inconsistent with the NS2Collator output.
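For reference, here is a minimal sketch of the comparison described above. With automatic batching, a PyTorch DataLoader gathers a list of samples from the dataset, passes that list to collate_fn, and returns the result, so with shuffle=False and num_workers=0 its batches should match a manual call to the collator on the same indices. ToyDurationDataset and ToyCollator are hypothetical stand-ins, not the actual Amphion NS2 dataset or NS2Collator; the zero-padding behaviour is an assumption for illustration.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class ToyDurationDataset(Dataset):
    """Hypothetical stand-in for the NS2 dataset: variable-length duration tensors."""

    def __init__(self):
        self.items = [torch.arange(1, n + 1) for n in (3, 5, 2, 4)]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return {"duration": self.items[idx]}


class ToyCollator:
    """Hypothetical stand-in for NS2Collator: zero-pads durations to the batch maximum."""

    def __call__(self, batch):
        durations = [sample["duration"] for sample in batch]
        return {"duration": pad_sequence(durations, batch_first=True, padding_value=0)}


BATCH_SIZE = 2
dataset = ToyDurationDataset()
collate = ToyCollator()

# shuffle=False and num_workers=0 keep batches deterministic and in-process,
# so each batch can be reproduced by hand from known indices.
loader = DataLoader(
    dataset, batch_size=BATCH_SIZE, shuffle=False, num_workers=0, collate_fn=collate
)

mismatches = []
for batch_idx, loaded in enumerate(loader):
    start = batch_idx * BATCH_SIZE
    indices = range(start, min(start + BATCH_SIZE, len(dataset)))
    # Call the collator directly on the same samples the DataLoader should have used.
    expected = collate([dataset[i] for i in indices])
    if not torch.equal(loaded["duration"], expected["duration"]):
        mismatches.append(batch_idx)

print("mismatching batches:", mismatches)
```

If the same check passes on the real dataset and collator, the discrepancy may come from comparing different samples (for example when shuffling or a custom batch sampler is enabled) rather than from the DataLoader altering the collated tensors. That is only a guess without seeing the actual configuration.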
