NVIDIA / flowtron

Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
https://nv-adlr.github.io/Flowtron
Apache License 2.0
887 stars, 177 forks

Please check the range in flip mel #67

Closed: jhjungCode closed this issue 4 years ago

jhjungCode commented 4 years ago

I think the correct range is from 0 to mel.size(1), but the code starts from 1.

    # backwards flow, send padded zeros back to end
    for k in range(1, mel.size(1)):
        mel[:, k] = mel[:, k].roll(out_lens[k].item(), dims=0)

rafaelvalle commented 4 years ago

The mel at index 0 has length max_len, hence no padded values.

jhjungCode commented 4 years ago

But the input data is sorted by text length, not mel length.

Mel length mostly increases in proportion to text length, but in some cases the mel for the longest text is not the longest mel in the batch.

So the mel at index 0 can have padded values.

rafaelvalle commented 4 years ago

Yes, that certainly can happen. We've updated the repo.
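
For readers following the thread, here is a minimal pure-Python sketch of the case discussed above. It is not the repository's code: `roll` is a hypothetical list-based stand-in for `torch.Tensor.roll` along one dimension, `flip_and_realign` and its `start` parameter are illustrative names, and the toy batch is made up. It assumes each item is flipped in time before the roll, so the padding zeros end up at the front and must be rolled back to the end.

```python
def roll(seq, shift):
    """Circularly shift a list to the right by `shift` positions.

    Hypothetical stand-in for torch.Tensor.roll along a single dimension.
    """
    shift %= len(seq)
    return seq[-shift:] + seq[:-shift] if shift else list(seq)


def flip_and_realign(batch, lengths, start):
    """Flip each padded sequence in time, then roll the padding back to the end.

    `start` is the first batch index that gets rolled: 1 mimics the loop
    quoted in the issue, 0 covers every item in the batch.
    """
    flipped = [list(reversed(seq)) for seq in batch]
    for k in range(start, len(flipped)):
        flipped[k] = roll(flipped[k], lengths[k])
    return flipped


# Toy batch sorted by text length: item 0 has the longest text,
# but its mel (3 frames) is shorter than item 1's mel (4 frames).
batch = [[1, 2, 3, 0], [4, 5, 6, 7]]
lengths = [3, 4]

from_one = flip_and_realign(batch, lengths, start=1)
from_zero = flip_and_realign(batch, lengths, start=0)

print(from_one[0])   # [0, 3, 2, 1]  padding is left at the front
print(from_zero[0])  # [3, 2, 1, 0]  padding is back at the end
print(from_zero[1])  # [7, 6, 5, 4]  full-length item is unchanged
```

Rolling a full-length item by its own length is a no-op, so starting the loop at 0 costs nothing when index 0 really is the longest mel and fixes the case where it is not.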