MlWoo / LPCNet

Efficient neural speech synthesis
BSD 3-Clause "New" or "Revised" License

./dump_data: two different input.s16 files give features.f32 and data.u8 of the same size #13

Closed (xm9512 closed this issue 4 years ago)

xm9512 commented 4 years ago

I want to train LPCNet on two different datasets, so I followed the steps to concatenate the .wav files into input1.s16 and input2.s16. When I dumped the two .s16 PCM files separately, they gave me features1.f32 (about 1.1G), feature2.f32 (about 1.1G), data1.u8 (3G), and data2.u8 (3G). It is strange that my two datasets have very different sizes but produce features of the same size. I am sorry that I have not read dump_data.c carefully. Is this normal?

xm9512 commented 4 years ago

Also, my input1.s16 and input2.s16 have very different sizes.

JohnHerry commented 4 years ago

You should read the dump_data.c code carefully. It dumps 5000000 frames from the source raw PCM file when that file is not large enough: if your file holds fewer than 5000000 frames (5000000 * 10 ms = 50000 s, about 13.9 hours), it is read more than once until 5000000 frames are reached. If your raw PCM file holds more than 5000000 frames [that is, the number of frames in your PCM file minus the number of silence frames in it], then the resulting features.f32 and data.u8 will be larger than 1.1G and 3G.
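
As a rough check of the numbers above, here is a minimal sketch that reproduces the reported file sizes from the 5000000-frame count. Only the frame count and the 10 ms frame length come from this thread. The 16 kHz sample rate, the 55 float32 features per frame, and the 4 bytes written to data.u8 per input sample are assumptions chosen because they match the sizes reported here; check them against your copy of dump_data.c.

```python
# Values stated in this thread.
MAX_FRAMES = 5_000_000   # number of frames dump_data writes for a short input
FRAME_MS = 10            # frame length in milliseconds

# Assumptions (not confirmed in this thread; verify against dump_data.c).
SAMPLE_RATE = 16_000         # assumed sample rate of the .s16 input, in Hz
FEATURES_PER_FRAME = 55      # assumed number of float32 features per frame
DATA_BYTES_PER_SAMPLE = 4    # assumed uint8 values written per input sample
FLOAT32_BYTES = 4


def expected_output_sizes(max_frames=MAX_FRAMES):
    """Return (features_bytes, data_bytes, duration_seconds) for max_frames frames."""
    samples_per_frame = SAMPLE_RATE * FRAME_MS // 1000
    features_bytes = max_frames * FEATURES_PER_FRAME * FLOAT32_BYTES
    data_bytes = max_frames * samples_per_frame * DATA_BYTES_PER_SAMPLE
    duration_seconds = max_frames * FRAME_MS / 1000
    return features_bytes, data_bytes, duration_seconds


if __name__ == "__main__":
    features_bytes, data_bytes, duration_seconds = expected_output_sizes()
    print(f"features.f32: {features_bytes / 1e9:.1f} GB")
    print(f"data.u8:      {data_bytes / 1e9:.1f} GB")
    print(f"audio dumped: {duration_seconds / 3600:.1f} hours")
```

Under these assumptions, any input shorter than about 13.9 hours of non-silent audio yields output files of the same size, which is what the original question observed.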

xm9512 commented 4 years ago

Thank you very much. I get it. My PCM files are only about 4/7 hours long.
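
To make that concrete, the sketch below estimates how many times a short input would be read before 5000000 frames are reached. It assumes "4/7 hours" means one file of about 4 hours and one of about 7 hours, and it ignores the silence frames that are skipped, so the real pass count may be higher. The function name `passes_needed` is made up for this illustration and is not part of dump_data.c.

```python
import math

MAX_FRAMES = 5_000_000   # frame count mentioned in this thread
FRAME_MS = 10            # frame length mentioned in this thread


def passes_needed(hours, max_frames=MAX_FRAMES, frame_ms=FRAME_MS):
    """Return (frames_in_file, passes) for an input of the given duration.

    Hypothetical helper: silence frames are not subtracted, so this is a
    lower bound on the number of passes over the file.
    """
    frames_in_file = int(hours * 3600 * 1000 / frame_ms)
    passes = math.ceil(max_frames / frames_in_file)
    return frames_in_file, passes


if __name__ == "__main__":
    for hours in (4, 7):  # assumed reading of "4/7 hours"
        frames, passes = passes_needed(hours)
        print(f"{hours} h input: {frames} frames, read about {passes} times")
```

Both files fall short of 5000000 frames in a single pass, which is consistent with the two dumps ending up the same size.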