Closed un-lock-me closed 5 years ago
Hi, it depends on what you mean by "feature extraction". The last state of an LSTM encoder (its final hidden state, which summarizes the entire input sequence), or even the attention matrix, can be used as features for auxiliary tasks.
If you are talking about low-level audio features, the speech-to-text models implemented here take pre-computed features (e.g., MFCCs) as input, so they would not be very useful for this purpose.
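To make the first option concrete, here is a rough sketch of what "using the final LSTM state as features" means. It is not this repository's API: the names `lstm_step`, `encode`, and `random_params` are hypothetical, the weights are random placeholders, and in practice you would read the state from the trained encoder instead. It only shows that a variable-length sequence ends up as one fixed-size vector.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, W, b):
    """Run one LSTM step (hypothetical helper, not the repository's code).

    W has 4 * hidden_size rows, ordered as input, forget, output and
    candidate gates; each row multiplies the concatenation [x; h].
    """
    n = len(h)
    z = x + h  # list concatenation: current input followed by previous hidden state
    pre = [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]
    i = [sigmoid(p) for p in pre[0:n]]              # input gate
    f = [sigmoid(p) for p in pre[n:2 * n]]          # forget gate
    o = [sigmoid(p) for p in pre[2 * n:3 * n]]      # output gate
    g = [math.tanh(p) for p in pre[3 * n:4 * n]]    # candidate cell values
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new


def encode(sequence, W, b, hidden_size):
    """Return the final hidden state, used here as a fixed-size feature vector."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(x, h, c, W, b)
    return h


def random_params(input_size, hidden_size, seed=0):
    """Placeholder weights (assumption); a real setup would load trained ones."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
         for _ in range(4 * hidden_size)]
    b = [0.0] * (4 * hidden_size)
    return W, b


W, b = random_params(input_size=3, hidden_size=4)
short_seq = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5]]
long_seq = short_seq + [[0.4, 0.4, 0.4], [0.9, -0.2, 0.0]]
short_features = encode(short_seq, W, b, hidden_size=4)
long_features = encode(long_seq, W, b, hidden_size=4)
```

Both `short_features` and `long_features` have `hidden_size` entries regardless of sequence length, which is what makes the final state convenient as input to an auxiliary classifier or regressor.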
Hi,
Thanks for sharing your code with us. Is this model suitable for feature extraction, or has it been designed only for translation?
Thanks :)