The DeepSpeech 1 and 2 papers from Baidu state that they use spectrograms of power-normalized audio clips as the input features to the system.
The Mozilla DeepSpeech implementation, based on TensorFlow, instead uses MFCC features.
I am curious which features your implementation uses, since my research compares various feature extraction algorithms. I would appreciate it if you could clarify this. Thank you.
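For context on the distinction I am asking about, here is a minimal NumPy sketch contrasting the two feature types: a power spectrogram (the DeepSpeech-paper-style input) versus MFCCs (log mel energies followed by a DCT, the kind of features Mozilla DeepSpeech uses). All parameter choices here (512-point FFT, 160-sample hop, 26 mel filters, 13 cepstra) are illustrative assumptions, not values taken from any of the implementations discussed.

```python
import numpy as np
from scipy.fft import dct

def power_spectrogram(signal, n_fft=512, hop=160):
    # Frame the signal, apply a Hann window, and take the power spectrum
    # of each frame (spectrogram-style features, as in the Baidu papers).
    window = np.hanning(n_fft)
    frames = [signal[s:s + n_fft] * window
              for s in range(0, len(signal) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), n=n_fft)) ** 2

def mel_filterbank(n_filters=26, n_fft=512, sr=16000):
    # Triangular filters spaced evenly on the mel scale.
    hz_to_mel = lambda f: 2595 * np.log10(1 + f / 700)
    mel_to_hz = lambda m: 700 * (10 ** (m / 2595) - 1)
    mel_points = np.linspace(0, hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fbank

def mfcc(signal, n_ceps=13, sr=16000):
    # MFCC = log mel-filterbank energies decorrelated with a DCT.
    spec = power_spectrogram(signal)
    log_mel = np.log(spec @ mel_filterbank(sr=sr).T + 1e-10)
    return dct(log_mel, type=2, axis=1, norm='ortho')[:, :n_ceps]

# Example: one second of a 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000
audio = np.sin(2 * np.pi * 440 * t)
spec = power_spectrogram(audio)   # shape (97, 257): frames x FFT bins
feats = mfcc(audio)               # shape (97, 13): frames x cepstra
print(spec.shape, feats.shape)
```

The practical difference is dimensionality and decorrelation: the spectrogram keeps all 257 frequency bins per frame, while the MFCC pipeline compresses each frame to a handful of decorrelated coefficients.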
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.