Yep. Every other time step is ignored when doing the convolutions. You can read more about convolutional layers here: http://cs231n.github.io/convolutional-networks/#conv
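To make the stride concept concrete, here's a toy example (a hand-rolled NumPy sketch, not code from this repo): with stride 1 the filter is applied at every position, while with stride 2 it jumps two positions between applications, so the output is roughly half as long.

```python
import numpy as np

def conv1d(signal, kernel, stride=1):
    # "Valid" 1-D cross-correlation with a configurable stride: the
    # kernel is shifted `stride` positions between applications.
    k = len(kernel)
    return np.array([np.dot(signal[i:i + k], kernel)
                     for i in range(0, len(signal) - k + 1, stride)])

x = np.arange(8, dtype=float)   # [0., 1., 2., ..., 7.]
w = np.array([1.0, 1.0])        # trivial summing filter
print(conv1d(x, w, stride=1))   # [ 1.  3.  5.  7.  9. 11. 13.]  (7 outputs)
print(conv1d(x, w, stride=2))   # [ 1.  5.  9. 13.]              (4 outputs)
```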
Ok, that part of the code is implemented in audio.py. What is tough to digest is the analogy to convolution here, since in a convolution the stride is how far we shift the filter across space at each step.
Sorry, I'm currently working on the Deep Speech 2 model and just assumed that's what you were talking about. On current master we implement something like a convolution stride by simply dropping every other time step from the input, which is the code you saw in audio.py. In DS2 we have actual convolutional layers.
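For illustration, the frame-dropping trick amounts to something like the following (a minimal sketch with a made-up function name, not the actual audio.py code): every other MFCC frame is discarded before the features reach the network, which halves the sequence length just like a stride-2 convolution would.

```python
import numpy as np

def drop_every_other_frame(features):
    # features: (time_steps, n_features) matrix of MFCC frames.
    # Keeping every second row mimics a stride of 2 along the time
    # axis without performing any actual convolution.
    return features[::2]

mfcc = np.random.randn(100, 26)          # 100 frames, 26 coefficients each
strided = drop_every_other_frame(mfcc)
print(mfcc.shape, "->", strided.shape)   # (100, 26) -> (50, 26)
```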
Is there an active GitHub link for DS2, or is the DS1 code being slowly transformed into DS2?
The WIP branch is here, but I make no stability or functionality guarantees: https://github.com/mozilla/DeepSpeech/tree/ds2-v2
Great. Would love to see fast progress there. Meanwhile, I will implement the convolution part and report the stats back to you :)
I'm gonna close this issue as the question seems to have been answered.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Can someone give me an intuitive explanation of the stride concept? If stride = 2, will the RNN skip one step at a time over the features generated from the MFCCs while unrolling?
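For what it's worth, one way to picture it, based on the explanation above (a toy sketch with made-up shapes, not DeepSpeech code): the RNN itself never skips anything while unrolling; the feature sequence is shortened before it reaches the network, so the RNN is simply unrolled over half as many time steps.

```python
import numpy as np

# The stride shortens the feature sequence *before* the RNN sees it,
# so the RNN just unrolls over fewer steps -- it never skips on its own.
mfcc = np.random.randn(100, 26)    # 100 frames of 26 MFCC coefficients
stride = 2
rnn_input = mfcc[::stride]         # only 50 frames reach the RNN

state = np.zeros(32)               # toy hidden state, 32 units
W_h = np.random.randn(32, 32) * 0.01
W_x = np.random.randn(26, 32) * 0.01
for frame in rnn_input:            # unrolled over 50 steps, not 100
    state = np.tanh(state @ W_h + frame @ W_x)
print(rnn_input.shape)             # (50, 26)
```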