MasayaKawamura / MB-iSTFT-VITS

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Apache License 2.0

16k Sample Rate Error #13

Closed by maytusp 1 year ago

maytusp commented 1 year ago

Hi @wizardk, I have the same problem as you. Do you know which parameters need to be adjusted for 16 kHz?

Originally posted by @maytusp in https://github.com/MasayaKawamura/MB-iSTFT-VITS/issues/7#issuecomment-1357225003

JohnHerry commented 1 year ago

Hi maytusp, did you find a working 16 kHz config? I tried changing only the sample-rate setting and keeping everything else unchanged, but the audio generated at the evaluation step during training always has bad duration prediction: the speech is very slow, and the generated audio is far longer than the corresponding ground truth (GT).
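
For reference, here is a minimal sketch of the frame-timing arithmetic behind changing only the sample rate. It assumes the config keeps a hop length of 256 samples and that the original rate was 22050 Hz; both values are assumptions for illustration, not taken from this thread. It only shows how the duration of one spectrogram frame changes when the hop length stays fixed, and it does not by itself diagnose the slow speech described above.

```python
# Sketch: how the time covered by one spectrogram frame depends on the
# sample rate when the hop length is left unchanged.
# The values 22050 Hz and hop_length=256 are assumed for illustration.


def frame_shift_ms(sampling_rate: int, hop_length: int) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 * hop_length / sampling_rate


def frames_per_second(sampling_rate: int, hop_length: int) -> float:
    """Number of spectrogram frames produced per second of audio."""
    return sampling_rate / hop_length


if __name__ == "__main__":
    for sr, hop in [(22050, 256), (16000, 256)]:
        print(
            f"{sr} Hz, hop {hop}: "
            f"{frame_shift_ms(sr, hop):.2f} ms per frame, "
            f"{frames_per_second(sr, hop):.2f} frames per second"
        )
```

With the hop length fixed, one predicted frame covers more time at 16 kHz than at 22050 Hz. So the audio files, any cached spectrograms, and the STFT settings all need to agree on the same sample rate before durations can be compared with the ground truth.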

hildazzz commented 10 months ago

Same problem here!