Closed r9y9 closed 6 years ago
From what I tried with Tacotron (https://github.com/r9y9/tacotron_pytorch), I don't think there's anything we need to add to support it. The Dataset utility was enough for me: https://github.com/r9y9/tacotron_pytorch/blob/44922c082e10f0789b81adc33f0fed67bce3f590/train.py#L69-L108
A utility to create padded batches might be useful to have, though? https://github.com/r9y9/tacotron_pytorch/blob/44922c082e10f0789b81adc33f0fed67bce3f590/train.py#L124-L147
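For reference, here is a minimal sketch of what such a padding utility could look like. It is not the code from the linked train.py. The name `pad_batch`, its signature, and the plain-list representation are assumptions made for illustration, and a real version would more likely work on NumPy arrays or tensors.

```python
def pad_batch(sequences, pad_value=0):
    """Pad variable-length sequences to the longest length in the batch.

    Hypothetical helper (not part of any existing library API).
    Returns (padded, lengths): `padded` is a list of equal-length lists,
    `lengths` holds the original length of each sequence.
    """
    if not sequences:
        return [], []
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    # Append pad_value to each sequence until it reaches max_len.
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


# Made-up character-ID sequences of different lengths.
batch = [[5, 2, 9], [7], [1, 4]]
padded, lengths = pad_batch(batch)
```

Returning the original lengths along with the padded batch makes it possible to mask out the padded positions later, e.g. when computing a loss.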
I think I can close this
So far I have only tested frame-wise training and sequence-wise training with exactly time-aligned linguistic/acoustic frame features, but we can use seq2seq models for roughly aligned ones. Sequence-wise training with a non-aligned dataset, which I haven't tested yet, might reveal weaknesses in the library design, so I should try it as soon as possible.