Closed bernardohenz closed 4 years ago
We have made a conscious choice to use a single LSTM layer; we want the model to be as lightweight as possible so we can target as many devices as possible, including devices that are resource-constrained.
That said, if you want to discuss your multi-layer LSTM experiments, we encourage and invite your input on our Discourse forum, as we'd be very interested in your results.
I've read some recent papers on speech recognition [1,2], and I noticed that they all tend to use more than a single LSTM layer. [1] Park et al., Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices. NeurIPS 2018. [2] He et al., Streaming End-to-end Speech Recognition for Mobile Devices. arXiv:1811.06621, 2018.
I tried to implement it myself, and it seems to be working for training/evaluation/exporting. Unfortunately, I need some help with what changes I should make to the binaries.
The following patch holds an implementation supporting an arbitrary number of LSTM layers (I haven't created a PR because the binaries were not touched): patch_more_LSTMs.patch.txt
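The patch itself isn't reproduced here, but the core idea of stacking LSTM layers is simple: each layer consumes the hidden-state sequence produced by the layer below it. Here is a minimal NumPy sketch of that idea (purely illustrative; all function and variable names are mine, not from the patch, and the gate layout is the standard i/f/o/g formulation):

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    # One LSTM time step. W projects the input, U the previous hidden
    # state; the four gates (input, forget, output, candidate) are
    # stacked along the first axis of the (4n,) pre-activation vector.
    n = h.shape[0]
    z = W @ x + U @ h + b
    i = 1.0 / (1.0 + np.exp(-z[:n]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))     # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*n:3*n]))   # output gate
    g = np.tanh(z[3*n:])                    # candidate cell update
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def stacked_lstm(xs, layer_params):
    # Run a stack of LSTM layers: layer k's input sequence is
    # layer k-1's hidden-state sequence.
    seq = xs
    for (W, U, b) in layer_params:
        n = U.shape[1]
        h, c = np.zeros(n), np.zeros(n)
        outputs = []
        for x in seq:
            h, c = lstm_step(x, h, c, W, U, b)
            outputs.append(h)
        seq = outputs
    return seq

# Tiny demo: 3 stacked layers, input dim 8, hidden dim 4, 5 time steps.
rng = np.random.default_rng(0)
d, n, num_layers = 8, 4, 3
layer_params, in_dim = [], d
for _ in range(num_layers):
    layer_params.append((0.1 * rng.standard_normal((4 * n, in_dim)),
                         0.1 * rng.standard_normal((4 * n, n)),
                         np.zeros(4 * n)))
    in_dim = n  # deeper layers consume the hidden states below them
outs = stacked_lstm([rng.standard_normal(d) for _ in range(5)], layer_params)
print(len(outs), outs[0].shape)  # one hidden vector per time step
```

The only structural change relative to a single-layer model is the outer loop over `layer_params`; in TensorFlow the same effect is achieved by wiring each layer's output sequence into the next layer's input.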
I am trying to apply this patch. Can you please tell me which tag this patch is for?
@cahuja1992 I strongly recommend taking a look at the cudnnrnn branch. It uses cudnnLSTM/GRU, which, besides being faster, lets you easily set the number of RNN layers (see the docs).
My patch is quite old, and I couldn't get any improvement with more layers (maybe I was implementing it incorrectly).
Now that we've merged TensorFlow 1.14 support I plan to merge the CuDNN RNN support into master.
CuDNN RNN support is in master.