Closed WeicongChen closed 4 years ago
In the `fully_pythonic` branch, we don't use MFCC features to represent the audio. Instead, we use Mel-spectrograms, which have 80 features at each timestep. Please use the `master` branch if you want the exact model used in the paper.
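The relationship between the two feature types can be sketched with a toy example (this is not the repo's code; `num_mels = 80` mirrors `audio_hparams.py`, and the DCT step shown is the standard way MFCCs are derived from log-Mel energies, so 13 MFCCs are a truncated transform of an 80-bin Mel-spectrogram):

```python
import numpy as np

# Dimensions: 80 Mel bins (as in audio_hparams.py) vs. 13 MFCCs (as in the paper).
num_mels = 80
num_mfcc = 13
timesteps = 100  # arbitrary number of frames for illustration

# Stand-in log-Mel spectrogram: 80 features per frame.
log_mel = np.random.rand(num_mels, timesteps)

# MFCCs are a DCT-II of the log-Mel energies, truncated to the
# first num_mfcc coefficients.
n = np.arange(num_mels)
k = np.arange(num_mfcc)[:, None]
dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * num_mels))
mfcc = dct_basis @ log_mel

print(log_mel.shape)  # (80, 100) -- Mel-spectrogram features
print(mfcc.shape)     # (13, 100) -- MFCC features
```

So the two branches feed the model different representations of the same audio, with 80 versus 13 features per timestep.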
Hi, great work! I am a little confused about the MFCC feature size. In the paper, you said:

However, in
`audio_hparams.py`
, I found that `num_mels` equals 80, not 13, which differs from the paper's claim. Could you kindly explain the difference?