nii-yamagishilab / project-CURRENNT-public

CURRENNT codes and scripts
GNU General Public License v3.0

How to convert Audio files. #4

Open taktak1 opened 4 years ago

taktak1 commented 4 years ago

I want to convert my own data into a data set, but I don't know how. How do I convert audio files (e.g., wav files)? How do I create each of the files under scripts/EXAMPLE/RAWDATA?

TonyWangX commented 4 years ago

Hi, these files contain features extracted by different feature extractors. Depending on what you want to achieve with the model, you may need to prepare only some of them.

.mgc: Mel-generalized cepstral coefficients, extracted using SPTK (see Merlin https://github.com/CSTR-Edinburgh/merlin)

.lab: binary linguistic features extracted using Festival_lite and an in-house converter. This can be done using Merlin ...

For example, if you want to train an F0 model for TTS, you need a text analyzer and aligner to prepare the input binary linguistic features (.lab), and an F0 extractor to prepare the output F0 data (.F0). To extract .lab, you may need a language-specific text analyzer such as Festival for English or OpenJtalk for Japanese. For .F0, you could use SWIPE, SPTK, or STRAIGHT.
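Once an external tool has produced per-frame values, they still have to be written to disk in a form the training scripts can read. The sketch below is only an illustration and rests on an assumption not stated in this thread: that each feature file is a headerless sequence of little-endian float32 values stored frame by frame. Please check the files under scripts/EXAMPLE/RAWDATA before relying on it. The function names, the file name demo.f0, and the F0 values (with 0.0 standing in for unvoiced frames) are hypothetical.

```python
import os
import struct
import tempfile


def write_float32(path, frames):
    """Write equal-length frames as headerless little-endian float32.

    ASSUMPTION: this layout is a guess at what the toolkit expects;
    verify it against the example data before use.
    """
    dim = len(frames[0])
    with open(path, "wb") as f:
        for frame in frames:
            if len(frame) != dim:
                raise ValueError("all frames must have the same dimension")
            f.write(struct.pack("<%df" % dim, *frame))


def read_float32(path, dim):
    """Read the file back as a list of frames with `dim` values each."""
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) % (4 * dim) != 0:
        raise ValueError("file size is not a multiple of the frame size")
    values = struct.unpack("<%df" % (len(raw) // 4), raw)
    return [list(values[i:i + dim]) for i in range(0, len(values), dim)]


# Hypothetical F0 track: one value per frame, 0.0 marking unvoiced frames.
f0_track = [[0.0], [120.0], [125.5], [0.0]]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.f0")
write_float32(demo_path, f0_track)

# Four frames of one float32 each should occupy 16 bytes.
print(os.path.getsize(demo_path), read_float32(demo_path, 1))
```

Reading the file back with the expected dimension is a cheap sanity check: if the file size is not a multiple of 4 × dim bytes, the feature dimension or the data type is probably not what you assumed.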

This CURRENNT toolkit is only for training and using neural networks; it does not and cannot include scripts or tools to extract all types of data. Thus, you have to know what you need and find the right tool yourself. This is like the feature engineering that deep learning practitioners must do themselves (sorry for that).

If it helps, please tell me what you want to do with the tool and I can answer accordingly. Otherwise, it is quite inefficient to list all the external tools for feature extraction.