keonlee9420 / PortaSpeech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
MIT License
331 stars, 36 forks

Who can share the pre-trained model which is the AISHELL3 #18

Open Dyongh613 opened 2 years ago

Dyongh613 commented 2 years ago

Could anyone share a pre-trained model trained on AISHELL3?

keonlee9420 commented 2 years ago

Great suggestion @qw1260497397 ! I'm not familiar with Chinese, so I hope someone can apply the AISHELL3 dataset and share the results for the community.

Dyongh613 commented 2 years ago

After 5000 training iterations on AISHELL3, training fails with the following error:

File "D:\项目\PortaSpeech-main\model\linguistic_encoder.py", line 222, in forward
    duration_w_rounded, src_w_len, mel_mask))
File "D:\项目\PortaSpeech-main\model\linguistic_encoder.py", line 140, in add_position_enc
    pos_enc = coef.unsqueeze(-1) * pos_enc
RuntimeError: The size of tensor a (1298) must match the size of tensor b (1001) at non-singleton dimension 1

keonlee9420 commented 2 years ago

I see. I think you have to update max_seq_len in model.yaml so that it is greater than the corresponding value in preprocessed_data/AISHELL3/stats.json. For example, that value was 870 for LJSpeech, so I set max_seq_len to 1000.
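For anyone hitting the same error, here is a minimal sketch of how the new value could be picked. It assumes that stats.json is a JSON object and that the longest sequence length is stored under a key named max_seq_len. That key name, both function names, and the 10% headroom are illustrative assumptions, not something confirmed in this thread, so check your own stats.json first. The script only prints a suggestion; model.yaml still has to be edited by hand.

```python
import json
from pathlib import Path


def read_longest_sequence(stats_path, key="max_seq_len"):
    """Read the longest sequence length from a preprocessing stats file.

    The key name is an assumption; inspect your own stats.json to confirm it.
    """
    stats = json.loads(Path(stats_path).read_text(encoding="utf-8"))
    return int(stats[key])


def padded_max_seq_len(longest, margin_percent=10):
    """Return a limit strictly greater than `longest`, with some headroom."""
    # Integer arithmetic avoids floating-point rounding surprises.
    headroom = max(1, longest * margin_percent // 100)
    return longest + headroom


if __name__ == "__main__":
    # Path taken from the comment above; adjust it to your dataset.
    stats_file = Path("preprocessed_data/AISHELL3/stats.json")
    if stats_file.exists():
        longest = read_longest_sequence(stats_file)
        print("longest sequence:", longest)
        print("suggested max_seq_len:", padded_max_seq_len(longest))
    else:
        print("stats file not found:", stats_file)
```

With the LJSpeech numbers mentioned above, `padded_max_seq_len(870, 15)` gives 1000. Any value above the dataset maximum should work; the headroom is only a convenience.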

Dyongh613 commented 2 years ago

Thank you for your reply! I have just changed max_seq_len to 1428. May I ask you some more questions later? I'm still in my first year of graduate school.
