keonlee9420 / PortaSpeech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
MIT License
331 stars, 36 forks

The meaning of inputs[11:] in model.loss.py #21

Open Dyongh613 opened 2 years ago

Dyongh613 commented 2 years ago
Hi @keonlee9420, I cannot understand the meaning of inputs[11:] in model.loss.py:

    def forward(self, inputs, predictions, step):
        (
            mel_targets,
            *_,
        ) = inputs[11:]

Thank you very much!

Dyongh613 commented 2 years ago

For inputs[11:], how is the value 11 decided? Thank you!

manhph2211 commented 2 years ago

I think you should look at how a batch is defined in dataset.py.
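One way to act on this suggestion is to write down the order of the fields in the batch tuple and look up the position of the one you need. The field names and their order below are hypothetical, chosen only for illustration and not copied from dataset.py; substitute the real order from the repository.

```python
# Hypothetical batch layout for illustration only; read the real order
# from the tuple that dataset.py actually returns.
BATCH_FIELDS = (
    "ids", "raw_texts", "speakers",
    "texts", "text_lens", "max_text_len",
    "mels", "mel_lens", "max_mel_len",
)


def slice_start(field, fields=BATCH_FIELDS):
    """Return k such that inputs[k:] starts at `field`."""
    return fields.index(field)


k = slice_start("mels")
print(k)  # 6 for this made-up layout

# Build a fake batch in the same order and unpack it the same way the loss does.
batch = tuple(f"<{name}>" for name in BATCH_FIELDS)
mel_targets, *_ = batch[k:]
print(mel_targets)  # <mels>
```

If a field is added to or removed from the batch, every hard-coded slice index after it has to be updated to match.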

Dyongh613 commented 2 years ago

Thank you! I still cannot understand the batch in dataset.py that you mentioned. duration_targets in inputs[3:] is a tensor of shape (8, 38), while log_duration_predictions in predictions, which comes from the linguistic_encoder module, is a tensor of shape (8, 19). So there will be an error in get_duration_loss(self, dur_pred, dur_gt). Sorry, my English is not good!

    def forward(self, model, inputs, predictions, step, coarse_mels=None, Ds=None):
        (
            texts,
            _, _, _, _, _, _,
            mel_targets,
            _, _, _,
            duration_targets,
            _,
        ) = inputs[3:]
        (
            mel_predictions,
            _, _, _,
            log_duration_predictions,
            duration_roundeds,
            src_masks,
            mel_masks,
            src_lens,
            mel_lens,
            alignments,
            dist_info,
            src_w_masks,
            alignment_logprobs,
            postnet_output,
        ) = predictions


Dyongh613 commented 2 years ago

log_duration_predictions and duration_roundeds in predictions are both tensors of shape (8, 19), and both are computed by the linguistic encoder. The duration_targets in inputs[3:] is a tensor of shape (8, 38). So when I compute the duration loss the same way PortaSpeech does, I expect an error, because the target and prediction shapes do not match.
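The thread does not establish why the two shapes differ. One possible cause, stated here purely as an assumption, is that the predictions hold one value per word while the targets hold one value per phoneme. If that is the case, per-phoneme durations could be summed into per-word durations before the loss is computed. The sketch below uses plain Python lists for a single utterance; pool_durations and phones_per_word are hypothetical names that do not come from the repository.

```python
def pool_durations(phoneme_durations, phones_per_word):
    """Sum per-phoneme durations into per-word durations.

    phones_per_word[i] is the number of phonemes in word i
    (a hypothetical input, assumed to be available from preprocessing).
    """
    if sum(phones_per_word) != len(phoneme_durations):
        raise ValueError("phoneme counts do not cover the duration sequence")
    word_durations = []
    start = 0
    for n in phones_per_word:
        # Each word's duration is the total of its phonemes' durations.
        word_durations.append(sum(phoneme_durations[start:start + n]))
        start += n
    return word_durations


phoneme_durations = [3, 2, 4, 1, 5, 2]  # 6 phonemes
phones_per_word = [2, 1, 3]             # 3 words
print(pool_durations(phoneme_durations, phones_per_word))  # [5, 4, 8]
```

After pooling, the target has one entry per word, so its length would match a per-word prediction. Whether this is the right fix depends on what the two tensors actually represent in your code.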
