Executedone / Chinese-FastSpeech2

Continued training on the Databaker (标贝) dataset, with improvements to the original FastSpeech2 model: a prosody representation and a prosody prediction module are introduced to make Chinese pronunciation more vivid and rhythmic.

The BERT details #24

Closed mondorysix closed 4 months ago

mondorysix commented 4 months ago

Thank you for sharing your work. I am truly impressed by your project and have developed a keen interest in understanding it more deeply. If it's convenient for you, I have a few questions I'd like to ask. I noticed that you used BERT to extract prosodic features in your project. I have run some experiments of my own, but the BERT models I found on HuggingFace didn't yield results as good or as natural as yours. I've tried the WWM version and the large models, but neither seemed to work very well. This has been a point of confusion for me, and I was hoping you could help clarify. Did you train the BERT model yourself, or did you take it from Google? Is it the WWM version, and did you modify it? Also, have you fine-tuned it on datasets other than Chinese Wikipedia? I would greatly appreciate your insights on these matters.

Executedone commented 4 months ago

Try this model: https://github.com/ymcui/Chinese-BERT-wwm, model name: RoBERTa-wwm-ext, Chinese.

mondorysix commented 4 months ago

thanks