m-toman / tacorn

2018/2019 TTS framework integrating state of the art open source methods
MIT License

Original WaveRNN implementation #8

Open m-toman opened 6 years ago

m-toman commented 6 years ago

It would be great to try out the original WaveRNN implementation with 16-bit quantization, either from https://github.com/fatchord/WaveRNN/blob/master/models/wavernn.py or from some other (or our own) implementation.
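For context on the 16-bit quantization: the WaveRNN paper predicts each 16-bit sample as two 8-bit parts, a coarse (high) byte and a fine (low) byte. The snippet below is only a minimal sketch of that split, not code from the linked repository; the function names `split_16bit` and `combine_16bit` are made up for illustration.

```python
def split_16bit(sample):
    """Split a signed 16-bit PCM sample into (coarse, fine) 8-bit classes.

    Hypothetical helper for illustration; not taken from any repository.
    """
    if not -32768 <= sample <= 32767:
        raise ValueError("sample out of signed 16-bit range")
    unsigned = sample + 32768   # shift to the range 0..65535
    coarse = unsigned // 256    # high 8 bits, 0..255
    fine = unsigned % 256       # low 8 bits, 0..255
    return coarse, fine


def combine_16bit(coarse, fine):
    """Inverse of split_16bit: rebuild the signed 16-bit sample."""
    return coarse * 256 + fine - 32768
```

The model then needs two 256-way softmax outputs per sample instead of a single 65536-way one.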

Yeongtae commented 6 years ago

Good. It looks like it uses an operation optimization technique to reduce the matrix multiplications.

bjtommychen commented 6 years ago

Good. Waiting for your news.

m-toman commented 6 years ago

Branched out the alternative model by fatchord into https://github.com/m-toman/tacorn/tree/fatchord_model

I also started a new branch, https://github.com/m-toman/tacorn/tree/wavernn, where I added the original WaveRNN implementation (also by fatchord). It mostly seems to be missing the conditioning on mel spectrograms and needs a bit of reworking of the training procedure.
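To illustrate what conditioning on mel spectrograms involves: the vocoder produces one audio sample per step, while a mel spectrogram has one frame per hop, so the frames have to be stretched to the sample rate first. The sketch below uses plain frame repetition, which is the simplest possible scheme and an assumption made here for illustration; actual implementations often use interpolation or a learned upsampling network instead. `upsample_mel` and `hop_length` are hypothetical names.

```python
def upsample_mel(mel_frames, hop_length):
    """Repeat each mel frame hop_length times.

    mel_frames: list of frames, each a list of mel-bin values.
    Returns one conditioning vector per audio sample.
    Hypothetical helper; nearest-neighbour repetition is an assumption.
    """
    if hop_length <= 0:
        raise ValueError("hop_length must be positive")
    upsampled = []
    for frame in mel_frames:
        # Every sample inside this hop sees the same conditioning frame.
        upsampled.extend([frame] * hop_length)
    return upsampled


# Two frames with two mel bins each, hop of 3 samples -> 6 vectors.
conditioning = upsample_mel([[0.1, 0.2], [0.3, 0.4]], hop_length=3)
```

At each time step the corresponding conditioning vector would then be fed into the network together with the previous sample.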

hdmjdp commented 5 years ago

@m-toman have you tried implementing the mel conditioning part in https://github.com/m-toman/tacorn/tree/wavernn?

m-toman commented 5 years ago

@hdmjdp not yet. I'm currently reworking the framework itself (#13) to allow faster experimentation, while watching the progress in https://github.com/erogol/WaveRNN to see whether I can merge it with the status here.

m-toman commented 5 years ago

Implemented the model from https://arxiv.org/abs/1811.06292. I'm currently seeing the same issues as with the alternative model when using the "bits" input type with 10 bits: training from GTA mel specs produces noisy spikes. Training longer...
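As a rough picture of what a 10-bit "bits" representation means: each float sample in [-1, 1] is mapped linearly to one of 2^10 = 1024 class labels, and the network output is mapped back the same way. This is a minimal sketch under that assumption, not the code used in the branch; `float_to_label` and `label_to_float` are illustrative names.

```python
def float_to_label(x, bits=10):
    """Map a float sample in [-1, 1] to an integer class in 0..2**bits - 1.

    Hypothetical helper; linear (non mu-law) quantization is assumed.
    """
    levels = 2 ** bits
    x = max(-1.0, min(1.0, x))  # clip out-of-range samples
    return int(round((x + 1.0) * (levels - 1) / 2.0))


def label_to_float(label, bits=10):
    """Map an integer class back to a float sample in [-1, 1]."""
    levels = 2 ** bits
    return 2.0 * label / (levels - 1) - 1.0
```

With 10 bits the quantization step is 2/1023 of full scale, so some background noise is expected from the representation alone, independent of how well the model trains.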