Extracting audio features with DeepSpeech2

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

https://paddlespeech.readthedocs.io

Apache License 2.0

10.66k stars 1.81k forks

Extracting audio features with DeepSpeech2 #1939

Closed qiuyuzhao closed 2 years ago

qiuyuzhao commented 2 years ago

In the DeepSpeech2 code, is the encoder output what counts as the audio features? And does the decoder take the extracted audio features and decode them into text?

zh794390558 commented 2 years ago

The encoder outputs the hidden state.

qiuyuzhao commented 2 years ago

Among the model's conv layers, recurrent layers, and FC layer, which one outputs the audio features?

zh794390558 commented 2 years ago

The audio features are the input to the conv layers.
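
To make the distinction concrete: the audio features are computed from the waveform before the model runs, and the conv, recurrent, and FC layers then turn them into hidden states. The sketch below is not PaddleSpeech's feature pipeline, which this thread does not show. It is a toy log-magnitude spectrogram in plain Python, and the function name `log_spectrogram`, the 16 kHz sample rate, the 25 ms window, and the 10 ms hop are all assumptions chosen for illustration.

```python
import cmath
import math


def log_spectrogram(samples, sample_rate=16000, win_ms=25.0, hop_ms=10.0):
    """Toy frame-level features: one log-magnitude spectrum per frame.

    Hypothetical helper for illustration only, not a PaddleSpeech API.
    Returns a list of frames, each a list of (win // 2 + 1) values.
    """
    win = int(sample_rate * win_ms / 1000)  # samples per frame
    hop = int(sample_rate * hop_ms / 1000)  # samples between frame starts
    bins = win // 2 + 1                     # non-negative frequency bins
    # Hann window to reduce spectral leakage at the frame edges.
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / (win - 1)) for n in range(win)]
    # Precomputed DFT twiddle factors exp(-2j*pi*m/win).
    twiddle = [cmath.exp(-2j * math.pi * m / win) for m in range(win)]

    features = []
    for start in range(0, len(samples) - win + 1, hop):
        frame = [samples[start + n] * hann[n] for n in range(win)]
        spectrum = []
        for k in range(bins):
            # Naive DFT for bin k (slow, but dependency-free).
            acc = sum(frame[n] * twiddle[(k * n) % win] for n in range(win))
            spectrum.append(math.log(abs(acc) + 1e-10))
        features.append(spectrum)
    return features


# 0.1 s of a 1 kHz sine sampled at 16 kHz stands in for real speech.
wave = [math.sin(2 * math.pi * 1000 * t / 16000) for t in range(1600)]
feats = log_spectrogram(wave)
# (time frames, frequency bins): the kind of 2-D array a conv front end consumes.
print(len(feats), len(feats[0]))
```

A matrix like `feats` (time by frequency) is what would be fed to the conv layers. Everything after that point (conv output, recurrent output, FC output) is a hidden representation rather than the input audio feature.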