open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License

[BUG]: libritts_dataset.py of valle_v2 #260

Open CriDora opened 3 months ago

CriDora commented 3 months ago

When I set "dataset_list": ["train-clean-360", "train-clean-100"] in valle_v2's exp_ar_libritts.json and then run "python -m models.tts.valle_v2.libritts_dataset", self.trans_cache["Duration"] is computed correctly for train-clean-360, but the program gets stuck while computing self.trans_cache["Duration"] for the second dataset, train-clean-100. If I move the two lines in libritts_dataset.py that set the index, "self.metadata_cache.set_index("ID", inplace=True)" and "self.trans_cache.set_index("ID", inplace=True)", so that they run after the duration calculation and outside the for loop, the problem no longer occurs. Is this problem caused by something I am doing wrong?
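For readers following along, the workaround described above can be sketched roughly as follows. This is only an illustration, not the actual code in libritts_dataset.py: the SUBSETS tables, the num_frames column, SAMPLE_RATE, and the build_cache function are all hypothetical stand-ins. It computes Duration for every subset inside the loop and calls set_index("ID", inplace=True) once, after the loop.

```python
import pandas as pd

# Hypothetical stand-ins for the per-subset transcript tables (not real LibriTTS data).
SUBSETS = {
    "train-clean-360": pd.DataFrame({"ID": ["a", "b"], "num_frames": [32000, 48000]}),
    "train-clean-100": pd.DataFrame({"ID": ["c"], "num_frames": [16000]}),
}
SAMPLE_RATE = 16000  # assumed sample rate, for illustration only


def build_cache(dataset_list):
    """Compute Duration per subset in the loop, then set the index once at the end."""
    parts = []
    for name in dataset_list:
        part = SUBSETS[name].copy()
        # "ID" is still an ordinary column here, for every subset in the list.
        part["Duration"] = part["num_frames"] / SAMPLE_RATE
        parts.append(part)
    trans_cache = pd.concat(parts, ignore_index=True)
    # Index by ID only after all durations are computed, outside the for loop.
    trans_cache.set_index("ID", inplace=True)
    return trans_cache


cache = build_cache(["train-clean-360", "train-clean-100"])
print(cache)
```

One pandas behavior that may be relevant, although the thread does not confirm it as the cause: by default set_index removes "ID" from the columns, so any code that runs later in the same loop and still expects an "ID" column will no longer find one.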

jiaqili3 commented 3 months ago

Thanks @CriDora for your feedback. I think your solution is correct. We'll try to debug this issue. Thanks!