pannous / tensorflow-speech-recognition

🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

Getting some ideas from Wavenet? #8

Open grapemix opened 7 years ago

grapemix commented 7 years ago

Since this project is still in the planning stage, I guess we are more open to new ideas. The README mentions LSTM, but according to DeepMind's paper, Wavenet yields better results than LSTM. Wavenet is explained in the following white paper. Do you think it would be too difficult for us to use the Wavenet approach?

https://drive.google.com/file/d/0B3cxcnOkPx9AeWpLVXhkTDJINDQ/view

Thanks.

andrenatal commented 7 years ago

Somebody did it already: https://github.com/ibab/tensorflow-wavenet

thomasmurphycodes commented 7 years ago

@andrenatal That implementation doesn't do STT though, right? I believe it implements the generative part described in the whitepaper.

grapemix commented 7 years ago

@andrenatal, ty so much. That repo sounds really interesting. I am not sure whether anyone wants to discuss this direction further. Since this is a suggestion ticket, I will leave it open, but if anyone thinks the discussion is sufficient, feel free to close it. And thanks, all, for your time.

pannous commented 7 years ago

1-d dilated/atrous convolution is the way to go ...
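
For readers unfamiliar with the term, here is a minimal sketch of what a 1-D dilated (atrous) convolution computes. It is plain Python and is not taken from this repo or from ibab/tensorflow-wavenet. The function names `dilated_causal_conv1d` and `receptive_field` are hypothetical, and the causal, left-zero-padded variant is an assumption chosen because Wavenet's generative model uses causal convolutions; the comment above does not specify one.

```python
def dilated_causal_conv1d(signal, kernel, dilation=1):
    """Causal 1-D dilated convolution, zero-padded on the left.

    output[t] = sum_k kernel[k] * signal[t - (K - 1 - k) * dilation]
    so the last kernel tap sees the current sample and earlier taps
    look back in steps of `dilation`. (Hypothetical helper.)
    """
    k_size = len(kernel)
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - (k_size - 1 - k) * dilation
            if idx >= 0:  # samples before the start count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Input samples that can influence one output of the stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    y = x
    # Doubling the dilation per layer grows the receptive field
    # exponentially while each layer keeps only two weights.
    for d in (1, 2, 4):
        y = dilated_causal_conv1d(y, [1.0, 1.0], dilation=d)
    print(y[-1])                          # sum of all eight inputs
    print(receptive_field(2, [1, 2, 4]))  # 8
```

With a kernel of size 2 and dilations 1, 2, 4, three layers already cover 8 input samples, which is why stacking dilated layers is attractive for long audio contexts.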