kaituoxu / Speech-Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

question about "build_LFR_features" #15

Closed luweishuang closed 5 years ago

luweishuang commented 5 years ago

I found that every feature extracted by Kaldi is passed through the function "build_LFR_features". Does it serve a special purpose? What is meant by "stacking frames and skipping frames"?

kaituoxu commented 5 years ago

This function generates Low Frame Rate (LFR) features. LFR features are produced by stacking and skipping frames of FBank features or other features (such as MFCC).

The paper "THE SPEECHTRANSFORMER FOR LARGE-SCALE MANDARIN CHINESE SPEECH RECOGNITION" (https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8682586) provides an example, and I followed it to implement the LFR features.
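
To make "stacking and skipping" concrete, here is a minimal NumPy sketch. It is not the repository's build_LFR_features. The function name `stack_and_skip`, the parameters `m` (frames stacked per output) and `n` (frames skipped between outputs), and the choice to pad the tail by repeating the last frame are all assumptions made for illustration. Stacking concatenates `m` consecutive frames into one longer vector, and skipping advances the window by `n` frames, so the output has roughly 1/n as many frames as the input.

```python
import numpy as np


def stack_and_skip(feats, m, n):
    """Illustrative LFR sketch (hypothetical helper, not the repo's code).

    feats: array of shape (T, D), one D-dimensional frame per row, T >= 1.
    m:     number of consecutive frames concatenated into each output frame.
    n:     number of input frames to advance between output frames.
    Returns an array of shape (ceil(T / n), m * D).
    """
    T = feats.shape[0]
    num_out = -(-T // n)  # integer ceil(T / n)

    # Assumption: when the last windows run past the end of the utterance,
    # pad by repeating the final frame so every window holds m frames.
    needed = (num_out - 1) * n + m
    pad = max(0, needed - T)
    if pad > 0:
        tail = np.repeat(feats[-1:], pad, axis=0)
        feats = np.concatenate([feats, tail], axis=0)

    # Window i starts at frame i * n; flatten its m frames into one vector.
    return np.stack([feats[i * n:i * n + m].reshape(-1) for i in range(num_out)])


# Toy input: 5 frames of 2-dimensional features.
feats = np.arange(10).reshape(5, 2)
lfr = stack_and_skip(feats, m=4, n=3)
print(lfr.shape)  # (2, 8): fewer frames, each four times wider
```

With `m=4` and `n=3` (values chosen only for this example), the first output row holds frames 0 to 3 and the second holds frames 3 and 4 followed by two repeated copies of frame 4.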