Suggestion and question.

PlayVoice / whisper-vits-svc

Core Engine of Singing Voice Conversion & Singing Voice Clone

https://huggingface.co/spaces/maxmax20160403/sovits5.0

MIT License

2.6k stars 919 forks

Suggestion and question. #89

Closed Turokirill closed 1 year ago

Turokirill commented 1 year ago

Hello, could you add a function for running one file at several different "pitch" values? For example, I process one file and get four output audio files, each with a different "pitch". It would also help to be able to choose the "pitch" range, for example from -5 to -2 (which gives the same four files), to quickly find out which "pitch" is most suitable. I also have a question: what is the best learning rate to set? P.S. Do I need to choose a language for the whisper model somewhere, or does it detect the language itself?

MaxMax2016 commented 1 year ago

Whisper large is a multilingual model, and the Whisper audio encoder used in this project can be treated as language-independent.

Regarding "what is the best learning rate to set?": there is no best setting for the learning rate.

Regarding "to quickly understand which "pitch" is more suitable": I can try it.
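
Until such a feature exists, the requested pitch sweep could be approximated with a small wrapper script. The following is only a sketch under assumptions: the script name `svc_inference.py` and the flags `--wave` and `--shift` are assumed here and should be checked against the project's README, and the helper `build_pitch_sweep` is a hypothetical name, not part of the project. It only builds one command and one suggested output name per pitch value; it does not run inference.

```python
from pathlib import Path


def build_pitch_sweep(wave, shifts, script="svc_inference.py",
                      wave_flag="--wave", shift_flag="--shift", extra_args=()):
    """Return one (command, suggested_output_name) pair per pitch shift.

    The script name and flag names are assumptions; pass different
    values if the project's inference command differs.
    """
    wave = Path(wave)
    jobs = []
    for shift in shifts:
        # One inference command per pitch value, e.g. -5, -4, -3, -2.
        cmd = ["python", script, *extra_args,
               wave_flag, str(wave), shift_flag, str(shift)]
        # Suggested file name so the outputs can be told apart later.
        out_name = f"{wave.stem}_pitch{shift:+d}.wav"
        jobs.append((cmd, out_name))
    return jobs


if __name__ == "__main__":
    # range(-5, -1) covers -5 to -2 inclusive, i.e. four runs.
    for cmd, out_name in build_pitch_sweep("test.wav", range(-5, -1)):
        print(" ".join(cmd), "->", out_name)
```

Each command could then be passed to `subprocess.run(cmd, check=True)`, with the output file renamed to the suggested name after each run. Model, config, and speaker arguments would go in `extra_args`.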