huawei-noah / Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

mels_mode generation #36

Open Biyani404198 opened 4 months ago

Biyani404198 commented 4 months ago

Hi, I have created TextGrid files in the subfolder textgrids using MFA. I'm having trouble generating the average voice mel-spectrograms for the subfolder mels_mode. I'm using the get_avg_mels.ipynb Jupyter notebook to compute them. It generates a mels_mode dictionary with phonemes as keys, but there are no further instructions on how to map these entries to speakers and create the mels_mode subfolder from this dictionary. @ivanvovk @ytyeung @wenyong-h @huawei-noah-admin @zhangjiajin2 Please help.

    for p in phoneme_list:
        mels_mode[p] = mode(np.asarray(mels_mode_dict[p]), 0).mode[0]
        lens[p] = np.mean(np.asarray(lens_dict[p]))
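One possible way to go from the per-phoneme mels_mode dictionary to per-utterance files is sketched below. It is only an illustration under stated assumptions, not the repository's confirmed procedure: it assumes each utterance's alignment is already available as a list of (start_sec, end_sec, phoneme) tuples read from its TextGrid, that each mels_mode[p] is a 1-D vector of length n_mels, and that the sampling rate and hop size match those used for the real mel-spectrograms. The function names build_avg_mel and save_avg_mel, the default sr and hop values, and the output path layout are all hypothetical.

```python
import os
import numpy as np


def build_avg_mel(intervals, mels_mode, n_frames, sr=22050, hop=256):
    """Tile per-phoneme mode vectors over each phoneme's aligned frames.

    intervals: list of (start_sec, end_sec, phoneme) tuples (assumed format).
    mels_mode: dict mapping phoneme -> 1-D array of length n_mels.
    n_frames:  number of frames in the utterance's real mel-spectrogram.
    sr, hop:   assumed sampling rate and hop size; use your own values.
    """
    n_mels = len(next(iter(mels_mode.values())))
    avg_mel = np.zeros((n_mels, n_frames), dtype=np.float32)
    for start, end, phoneme in intervals:
        first = int(round(start * sr / hop))
        last = min(int(round(end * sr / hop)), n_frames)
        # Skip labels with no entry (e.g. silence) and empty intervals.
        if phoneme in mels_mode and last > first:
            vec = np.asarray(mels_mode[phoneme], dtype=np.float32)
            avg_mel[:, first:last] = vec[:, None]  # broadcast over frames
    return avg_mel


def save_avg_mel(out_root, speaker, utt_id, avg_mel):
    """Save one utterance under out_root/speaker/ (hypothetical layout)."""
    out_dir = os.path.join(out_root, speaker)
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, utt_id + ".npy")
    np.save(path, avg_mel)
    return path
```

The speaker mapping in this sketch comes only from where the file is written: the dictionary itself is speaker-independent, and each utterance's array would be saved under the folder of the speaker who produced it. The file naming should be checked against what the data loader actually expects before use.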