xiph / LPCNet

Efficient neural speech synthesis
BSD 3-Clause "New" or "Revised" License

It surprisingly takes around 20 minutes to generate features with dump_data. #180

Open andylida opened 2 years ago

andylida commented 2 years ago

Hi there. I followed the README and exported CFLAGS to speed up the dump_data process. I used librosa to downsample LJSpeech and sox to create the PCM files. It takes around 20 minutes to process a 7s wav file. Also, the wav is only 302 KB, but the generated features add up to a 4 GB file. Is it working properly? Can anyone give me some suggestions? Thanks!

jmvalin commented 2 years ago

If your training file is too small, dump_data will iterate over it multiple times to generate enough augmented data, which is why it takes a long time. In this case, though, 7s is just too short.
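
Since a single 7s clip is too short, one way to get a longer input is to join many clips into one headerless PCM file before running dump_data. The following is only a minimal sketch, not part of LPCNet: the helper name `concat_wavs_to_s16` is hypothetical, and it assumes every WAV has already been converted to mono, 16-bit, 16 kHz (for example with the librosa and sox steps described above).

```python
import wave


def concat_wavs_to_s16(wav_paths, out_path, expected_rate=16000):
    """Append the samples of each WAV to one headerless 16-bit PCM file.

    Hypothetical helper for illustration. Assumes each input is already
    mono, 16-bit, and sampled at expected_rate; raises ValueError otherwise.
    Returns the total duration in seconds.
    """
    total_frames = 0
    with open(out_path, "wb") as out:
        for path in wav_paths:
            with wave.open(str(path), "rb") as w:
                fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
                if fmt != (1, 2, expected_rate):
                    raise ValueError(f"{path}: unexpected format {fmt}")
                n = w.getnframes()
                # WAV stores 16-bit samples as little-endian signed integers,
                # so the frame bytes can be copied without the header.
                out.write(w.readframes(n))
                total_frames += n
    return total_frames / expected_rate
```

The returned duration gives a quick check of how much audio the combined file actually contains before starting a long feature-extraction run.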