Creating norm_info_mgc_lf0_vuv_bap_63_MVN.dat for the Full VCTK dataset

facebookarchive / loop

A method to generate speech across multiple speakers

Creating norm_info_mgc_lf0_vuv_bap_63_MVN.dat for the Full VCTK dataset #49

Open PetrochukM opened 6 years ago

PetrochukM commented 6 years ago

Hi There!

For large datasets such as the full VCTK dataset, where extract_feats.py uses its multi-folder feature, it's unclear what the norm_info/norm.dat file should be. The norm_info_mgc_lf0_vuv_bap_63_MVN.dat file is regenerated for each tmp split of the dataset. How do you create norm_info/norm.dat for datasets with more than 5000 files?

I believe you had to deal with the same problem with the 22-speaker dataset, because it contains around 8000 files.

Thanks for your time, Michael. Happy to contribute back the findings.

P.S. I've been commenting in https://gist.github.com/kastnerkyle/cc0ac48d34860c5bb3f9112f4d9a0300 about changes needed to make the extract_feats.py script work. I can't submit a pull request. I know many people are struggling to get it running.

niravpatel2008 commented 6 years ago

@PetrochukM I also have a large dataset for a single speaker, around 13k files. Can you please share your changes so I can apply them on my side?

macarbonneau commented 6 years ago

This post will answer your question: #11

PetrochukM commented 6 years ago

@macarbonneau Have you run this? Is it OKAY to just recompute the mean and the STD?

It looks like the data is recreated in the generation code using the mean and the STD; therefore, every numpy feature (.npz) file needs to be updated to fit the new mean and the new STD. I want to make sure I'm understanding this correctly.
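To make the concern concrete, here is a minimal sketch of what updating a stored feature to new statistics would involve. It assumes the features were saved after per-dimension mean-variance normalization, `(x - mean) / std`, which is what the MVN suffix suggests but which is not confirmed in this thread. `renormalize` and its arguments are hypothetical names, not part of extract_feats.py, and plain lists stand in for the arrays stored in the .npz files.

```python
def renormalize(frame, old_mean, old_std, new_mean, new_std):
    """Re-express one normalized frame under new per-dimension statistics.

    Assumes the frame was produced as (raw - old_mean) / old_std.
    """
    out = []
    for z, m0, s0, m1, s1 in zip(frame, old_mean, old_std, new_mean, new_std):
        raw = z * s0 + m0             # undo the old normalization
        out.append((raw - m1) / s1)   # apply the new normalization
    return out


# Toy 2-dimensional frame that was normalized with the old statistics.
frame = [0.5, -1.0]
print(renormalize(frame, [10.0, 4.0], [2.0, 1.0], [7.0, 2.0], [4.0, 2.0]))
```

If this assumption holds, swapping in a new norm file without touching the saved features would change what the generation code reconstructs, which is the worry raised above.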

macarbonneau commented 6 years ago

This sounds very complicated.

Why not just put all your .wav files in the same folder and extract the npz features from that folder? In any case, you will have to combine them into a numpy_features/ and a numpy_features_valid/ folder later. VoiceLoop uses the prefix in your file name to determine the speaker ID.

This is what I did and it works fine.
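As an illustration of the file-name prefix convention mentioned above, the sketch below groups feature files by speaker. It assumes VCTK-style names such as `p225_001.npz`, where the text before the first underscore identifies the speaker; the exact rule VoiceLoop applies is not shown in this thread, and `group_by_speaker` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import Path


def group_by_speaker(filenames):
    """Map each speaker prefix to the list of files that carry it.

    Assumption: the speaker ID is the part of the file stem before
    the first underscore (e.g. "p225" in "p225_001.npz").
    """
    groups = defaultdict(list)
    for name in filenames:
        speaker = Path(name).stem.split("_", 1)[0]
        groups[speaker].append(name)
    return dict(groups)


files = ["p225_001.npz", "p225_002.npz", "p226_001.npz"]
for speaker, names in group_by_speaker(files).items():
    print(speaker, len(names))
```

A check like this can help confirm that files from every speaker keep their prefix after being merged into a single folder.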

PetrochukM commented 6 years ago

They are in the same folder.

When I run extract_feats with 40,000 files, it splits them up because it's over the 5,000-file limit here: https://gist.github.com/kastnerkyle/cc0ac48d34860c5bb3f9112f4d9a0300#file-extract_feats-py-L1339

Did you disable this feature?
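If the split cannot be disabled, one option is to combine the per-split statistics into a single global mean and STD. The sketch below uses the standard pooled-moments identity and is not code from extract_feats.py. It assumes you can recover, for each split, the frame count plus the per-dimension mean and population standard deviation; `pool_stats` is a hypothetical name, and writing the result in the format the script expects is not covered here.

```python
import math


def pool_stats(splits):
    """Combine per-split (n_frames, mean, std) into a global mean and std.

    Each mean and std is a per-dimension list. The std is assumed to be
    the population standard deviation (divide by n_frames).
    """
    total = sum(n for n, _, _ in splits)
    dims = len(splits[0][1])
    # Global mean: frame-count-weighted average of the split means.
    mean = [sum(n * m[d] for n, m, _ in splits) / total for d in range(dims)]
    # E[x^2] within a split is std^2 + mean^2; average it the same way.
    second = [
        sum(n * (s[d] ** 2 + m[d] ** 2) for n, m, s in splits) / total
        for d in range(dims)
    ]
    # Clamp at zero to guard against tiny negative values from rounding.
    std = [math.sqrt(max(sq - mu ** 2, 0.0)) for sq, mu in zip(second, mean)]
    return mean, std


# Two toy splits of one-dimensional data: [1, 3] and [5, 7].
mean, std = pool_stats([(2, [2.0], [1.0]), (2, [6.0], [1.0])])
print(mean, std)
```

The weights are frame counts, not file counts, because the statistics are computed over frames and utterances differ in length.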

macarbonneau commented 6 years ago

Oh yeah... I played a bit with the script. I don't think this part is active anymore in my code.

PetrochukM commented 6 years ago

@macarbonneau Oh nice. Can you post a gist?