liusongxiang / efficient_tts

PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture"
MIT License

Questions about code #3

attitudechunfeng closed this issue 3 years ago

attitudechunfeng commented 3 years ago

Is the code at https://github.com/liusongxiang/efficient_tts/blob/d186a56bf87e2c688158179f0f41b981718aebdb/nntts/models/efficient_tts.py#L338 correct? It looks like it subtracts two tensors of different sizes, [B, T2, 1] and [B, T1, 1].

liusongxiang commented 3 years ago

imv.unsqueeze(1) has shape [B, 1, T2] and p.unsqueeze(-1) has shape [B, T1, 1]. The subtraction broadcasts both operands first: each size-1 dimension is expanded to match the other tensor, so the result has shape [B, T1, T2].
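
The broadcasting described above can be checked with a small standalone sketch. This is not the repository's code: the sizes B, T1, T2 and the tensor contents are made up for illustration, and the operand order of the subtraction is an assumption (the resulting shape is the same either way).

```python
import torch

# Hypothetical sizes, chosen so that T1 != T2.
B, T1, T2 = 2, 3, 5

# Stand-ins for the tensors named in the thread (contents are illustrative).
imv = torch.arange(T2, dtype=torch.float32).repeat(B, 1)  # shape [B, T2]
p = torch.arange(T1, dtype=torch.float32).repeat(B, 1)    # shape [B, T1]

a = imv.unsqueeze(1)   # shape [B, 1, T2]
b = p.unsqueeze(-1)    # shape [B, T1, 1]

# Size-1 dimensions are expanded to match the other operand,
# so diff[n, i, j] == imv[n, j] - p[n, i].
diff = a - b           # shape [B, T1, T2]
print(tuple(a.shape), tuple(b.shape), tuple(diff.shape))

# For contrast: the shapes assumed in the question, [B, T2, 1] and [B, T1, 1],
# cannot be broadcast when T1 != T2 and neither is 1.
mismatch_raised = False
try:
    imv.unsqueeze(-1) - p.unsqueeze(-1)
except RuntimeError:
    mismatch_raised = True
```

Placing the two unsqueeze calls on different axes is what turns the subtraction into a pairwise difference between every element of p and every element of imv, without an explicit loop.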

attitudechunfeng commented 3 years ago

Got it, thank you!