dribnet opened this issue 7 years ago
Hey! Thanks for your interest.
Unfortunately, some of the datasets are not publicly available (like Blizzard). For the others, we plan to release a preprocessed version so people can use them. We have a rough series of instructions for preprocessing, but it requires installing quite a few libraries, so I'm not sure you'd want to go that way. Let me know what you would prefer.
Right now, we are working on finishing our ICML submission. After that (probably this weekend), we will be freer to shape up the code and data. We should also release some pretrained models for everyone to explore.
I want to know how to train with UTF-8 processing and how to train on the Arctic data.
@sotelo, is this the way you preprocess the data for this project? https://github.com/sotelo/world.py
Thank you for your work!!
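For reference, here is a minimal WORLD analysis sketch using the `pyworld` bindings. The package choice, the file name, and the 5 ms frame period are assumptions on my part, not necessarily the pipeline this repo uses:

```python
# Minimal WORLD vocoder analysis sketch. Assumes the `pyworld` and
# `soundfile` packages; NOT necessarily the preprocessing parrot uses.
import numpy as np
import pyworld as pw
import soundfile as sf

x, fs = sf.read('arctic_a0001.wav')   # hypothetical input file
x = x.astype(np.float64)              # pyworld expects float64 samples

# f0: fundamental frequency, sp: spectral envelope, ap: aperiodicity
f0, sp, ap = pw.wav2world(x, fs, frame_period=5.0)   # one frame per 5 ms
print(f0.shape, sp.shape, ap.shape)
```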
@Zeta36 Hello! No, it's not like that. We will describe how we do it soon. With the ICML deadline coming up, we are finishing the paper, but we should be ready to help others with replication afterwards.
Hi @sotelo, this is great. Where can I find your email? I would definitely like to keep up with your progress.
Hello, @sotelo. Any news about your project? (I haven't seen any update on your website in a while now.)
By the way, you said to @dribnet: "Unfortunately, some of the datasets are not publicly available (like Blizzard). For the others, we plan to release a preprocessed version so people can use them."
Are you finally going to release this, or at least explain how to do this preprocessing step?
Thanks a lot!!
Hello, @sotelo.
"So, we're currently in the process of doing this. It's a bit messy because the data processing requires installing a few C libraries. Now, we're deliberating whether we should proceed with wrappers (basically updating my old world.py repo) or we just should point people to the instructions on how to do the processing themselves."
It would be wonderful to have either of the two possibilities. No hurry anyway; we will be waiting :).
Regards!!
Hello, @sotelo. My data shapes printed by datasets.py:
features shape: (1001, 500, 67), dtype: float64
features_mask shape: (1001, 500), dtype: float64
labels shape: (500, 1914), dtype: int32
labels_mask shape: (500, 1914), dtype: float64
Is this correct? (seq_size is 1000, batch_size is 500, feature_dim is 67)
Why are labels and labels_mask not passed through _transpose?
Regards!!
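My reading of those shapes (an assumption, not confirmed by the authors): the recurrent inputs are time-major (time, batch, dim) because scan-style RNNs step over the leading axis, while the label sequences stay batch-major (batch, length) because the attention mechanism indexes them per example rather than per timestep. A numpy sketch of that convention, using the dimensions from the printout above:

```python
# Shape-convention sketch; dimensions taken from the printout above.
import numpy as np

B, T, D, L = 500, 1001, 67, 1914
features = np.zeros((B, T, D)).swapaxes(0, 1)   # batch-major -> (T, B, D)
features_mask = np.ones((T, B))                 # one mask value per frame
labels = np.zeros((B, L), dtype=np.int32)       # left batch-major
labels_mask = np.ones((B, L))
print(features.shape, features_mask.shape, labels.shape, labels_mask.shape)
```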
@sotelo The features comprise 60 MGC, 5 BAP, 1 lf0, and 1 V/UV, at 5 ms per frame. The labels are plain phone index sequences (label_type is unaligned_phoneme in the code); I'm not using the raw_audio processing for now. The audio files are a bit long (each contains 1 to 5 sentences); I will split them into single sentences later. Is that necessary?
Your seq_size=50 corresponds to 250 ms? That seems very short. Also, I can't find a SegmentSequence process for unaligned_phonemes in your code; why is that? I just want to train a mapping from unaligned_phonemes to vocoder features. What should I do?
Thanks!
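To make the 67-dimension layout described above concrete, here is a small sketch of stacking the streams, plus the seq_size arithmetic; the stream order is an assumption:

```python
# Assembling one 67-dim vocoder feature frame per 5 ms (order assumed).
import numpy as np

T = 1000                                  # frames in one utterance
mgc = np.zeros((T, 60))                   # mel-generalized cepstrum
bap = np.zeros((T, 5))                    # band aperiodicity
lf0 = np.zeros((T, 1))                    # log F0
vuv = np.zeros((T, 1))                    # voiced/unvoiced flag

frames = np.concatenate([mgc, bap, lf0, vuv], axis=1)
assert frames.shape == (T, 67)

seq_size, frame_period_ms = 50, 5
print(seq_size * frame_period_ms, 'ms per training segment')  # 250 ms
```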
@sotelo
How should I understand these label types: 'full_labels', 'phonemes', 'unaligned_phonemes', 'text'? Thanks!
Yes, I'm using unaligned phonemes and have trained one model; the MSE went from 150 to 6.1, but the sound out of the vocoder is not right. I'm checking...
Any updates on this? I wanted to do some experiments with VCTK, but couldn't figure out how to preprocess the data.
Hey, @sotelo, is there a way to use parrot for finding phoneme boundaries? I am working on concatenative synthesis and it would be a very nice feature to have.
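One rough approach, since the model is attention-based: segment the per-frame argmax of the attention alignment. This is only a sketch, assuming you can pull the alignment matrix out of the model yourself; as far as I know, parrot does not expose it as a public API.

```python
# Derive rough phoneme boundaries from an attention alignment matrix
# `align` of shape (n_frames, n_phonemes). Obtaining `align` from the
# model is assumed here; it is not an existing parrot API.
import numpy as np

def boundaries_from_attention(align, frame_period_ms=5.0):
    """Return (phoneme_index, start_ms, end_ms) for each attended run."""
    idx = align.argmax(axis=1)      # most-attended phoneme per frame
    bounds, start = [], 0
    for t in range(1, len(idx) + 1):
        # close a segment when the attended phoneme changes (or at the end)
        if t == len(idx) or idx[t] != idx[start]:
            bounds.append((int(idx[start]),
                           start * frame_period_ms,
                           t * frame_period_ms))
            start = t
    return bounds
```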
"For the others, we plan to release a preprocessed version so people can use them. We have a rough series of instructions for preprocessing, but it requires installing quite a few libraries, so I'm not sure you'd want to go that way. Let me know what you would prefer."
Hey @sotelo, is there any new information regarding the preprocessed version, or could you upload/point to the mentioned rough series of instructions? That would be very helpful.
Thanks in advance!
The referenced fuel datasets
['arctic', 'blizzard', 'dimex', 'librispeech', 'pavoque', 'vctk']
are not in the fuel distribution. Are there standard converters for any of these already in other projects?
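I haven't found official converters either. For experimenting, a fuel-compatible HDF5 file can be hand-rolled with h5py; the file name, source name, and split layout below are assumptions, not the repo's schema:

```python
# Minimal fuel-compatible HDF5 file (toy data; names/splits are assumed).
import h5py
import numpy as np
from fuel.datasets.hdf5 import H5PYDataset

features = np.random.randn(10, 67).astype('float32')
with h5py.File('arctic.hdf5', 'w') as f:
    ds = f.create_dataset('features', data=features)
    ds.dims[0].label = 'batch'      # axis labels fuel expects
    ds.dims[1].label = 'feature'
    split_dict = {'train': {'features': (0, 8)},
                  'test': {'features': (8, 10)}}
    f.attrs['split'] = H5PYDataset.create_split_array(split_dict)

train_set = H5PYDataset('arctic.hdf5', which_sets=('train',))
print(train_set.num_examples)   # 8
```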