Zain-Jiang / Speech-Editing-Toolkit

It's a repository for implementations of neural speech editing algorithms.

How to preprocess and run inference? #6

Closed: Linghuxc closed this issue 9 months ago

Linghuxc commented 10 months ago

Hi, I used the pre-trained model for inference and found that mfa_model.zip and mfa_dict.txt were missing. I downloaded the relevant models from the official MFA release and put them in folders I created myself.

However, the output shows an error (screenshot attached).

Do I need to run the data preprocessing step first? After entering the preprocessing command, the output shows another error (screenshots attached).

How should I solve this problem? I need help with it!
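For context, roughly how I fetched and placed the files (a sketch only: the MFA 2.x model/dictionary names and the destination folder below are my own guesses, not paths documented by this repo):

```bash
# Download MFA's pretrained English acoustic model and dictionary (MFA 2.x CLI).
mfa model download acoustic english_us_arpa
mfa model download dictionary english_us_arpa

# Copy them into the repo under the names the code looks for (mfa_model.zip and
# mfa_dict.txt). The destination folder here is an assumption; the source paths
# depend on where your MFA install stores pretrained models.
mkdir -p data/processed/libritts
cp /path/to/english_us_arpa.zip data/processed/libritts/mfa_model.zip
cp /path/to/english_us_arpa.dict data/processed/libritts/mfa_dict.txt
```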

Zain-Jiang commented 10 months ago

I'm sorry, I made some mistakes in recent updates. Thank you for pointing out this problem. I have pushed a new readme to guide the preprocessing process and updated the corresponding code.

Linghuxc commented 10 months ago

Hi, I am trying to train FluentSpeech with LibriTTS. Could you tell me which subset of LibriTTS you use?

Zain-Jiang commented 10 months ago

I used train-clean-100 + train-clean-300.

Linghuxc commented 10 months ago

Maybe you mean train-clean-100 and train-clean-360? (screenshot of the LibriTTS subsets attached) Do you train one subset first and then the other, or do you mix the two subsets together for training?

Zain-Jiang commented 10 months ago

Yes, train-clean-100 and train-clean-360. I just mixed the two subsets together for training in the experiments of our paper. But in my later experiments, mixing three subsets together showed better zero-shot capability.

Linghuxc commented 10 months ago

I have now preprocessed train_clean_100. Should I mix the two subsets and preprocess them again, or can I just preprocess train_clean_360 on top of the current result?

Zain-Jiang commented 10 months ago

The code does not support dataset concatenation. Sorry, you will need to mix the two subsets and preprocess them again.
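(For illustration only: one way the "mix the two subsets" step could look on disk before rerunning preprocessing. The target directory below is an assumption, not the toolkit's documented layout; use whatever raw-data path your preprocessing config points at.)

```bash
# Merge train-clean-100 and train-clean-360 into a single raw-data folder.
# LibriTTS speaker folders are disjoint across subsets, so a plain copy is enough.
mkdir -p data/raw/LibriTTS
cp -r /path/to/LibriTTS/train-clean-100/* data/raw/LibriTTS/
cp -r /path/to/LibriTTS/train-clean-360/* data/raw/LibriTTS/
```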

Linghuxc commented 10 months ago

Ok, thanks for your reply!

Zain-Jiang commented 10 months ago

You're welcome! If you have any more questions or need further assistance, feel free to ask.

Linghuxc commented 10 months ago

Hi, I'm training the model on LibriTTS and VCTK using this command:

CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config egs/spec_denoiser.yaml --exp_name spec_denoiser --reset

I have a question.

I trained VCTK using egs/spec_denoiser.yaml, but should I train LibriTTS using egs/spec_denoiser.yaml or egs/spec_denoiser_libritts.yaml?

In addition, a friendly reminder: you may want to update the readme. In python data_gen/tts/run_mfa_train_aligh.sh, the 'h' in 'aligh' may need to be changed to 'n', and since it is a shell script it should be executed with bash rather than python.
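Concretely, the corrected readme line would look something like this (assuming the script is renamed to run_mfa_train_align.sh):

```bash
# "aligh" -> "align", and run the shell script with bash instead of python.
bash data_gen/tts/run_mfa_train_align.sh
```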

Zain-Jiang commented 10 months ago

I'm sorry, egs/spec_denoiser_libritts.yaml is outdated. You can directly use egs/spec_denoiser.yaml to train the model on LibriTTS.
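For example, reusing the command above (the exp_name here is just a placeholder, choose any name you like):

```bash
# Same config as for VCTK; only the experiment name differs.
CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config egs/spec_denoiser.yaml --exp_name spec_denoiser_libritts --reset
```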

Thanks for your advice! I will update the readme.