Zain-Jiang / Speech-Editing-Toolkit

It's a repository for implementations of neural speech editing algorithms.

Broken link in README #19

Open alexdemartos opened 8 months ago

alexdemartos commented 8 months ago

Hi, the following link provided in the README file seems broken:

https://drive.google.com/drive/folders/1H-dk7cNYVn1DSzYq_q66rS5b5xpbdBi4?usp=sharing

Where could we find the data/binary/libritts/phone_set.json file for the pretrained models if not there?

Thanks in advance.

2811668688 commented 8 months ago

Can you solve this problem now?

Zain-Jiang commented 8 months ago

@alexdemartos @2811668688 We are sorry that the previous link was deleted by mistake. We have updated the README file with a new link: https://drive.google.com/drive/folders/1BOFQ0j2j6nsPqfUlG8ot9I-xvNGmwgPK?usp=sharing. Thanks for your comments.

alexdemartos commented 8 months ago

Thank you so much for the quick response! Actually, I managed to create these files by running the base_preprocessing on the LibriTTS dataset.

If I may, I'm trying to figure out what region indexes to use for the following edits:

4,1,"this is a libri vox recording","this is some longer sentence than the original.",inference/audio_backup/1.wav,"[3,6]","[3,8]"

Is that correct? The generated audio only contains up to the word "than".

Thanks in advance.

Zain-Jiang commented 8 months ago

I'm sorry that the definition of region and edited_region is not very clear. Perhaps changing "[3,6]","[3,8]" to "[3,8]","[3,6]" would solve this issue.
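
For anyone checking their own region values, here is a minimal sketch that parses the row quoted above and prints which words each span would cover. It assumes the spans are 1-based, inclusive word indexes over whitespace-separated text. The thread does not confirm that convention, and the helper name `words_in_region` is hypothetical, not part of the toolkit. Under this assumption, [3,6] applied to the edited sentence ends at "than", which matches the truncated audio reported above, while [3,8] covers the rest of the sentence.

```python
import csv
import io
import json

# The row exactly as quoted in this thread.
ROW = ('4,1,"this is a libri vox recording",'
       '"this is some longer sentence than the original.",'
       'inference/audio_backup/1.wav,"[3,6]","[3,8]"')


def words_in_region(text, region):
    """Return the words covered by [start, end].

    ASSUMPTION: indexes are 1-based and inclusive, and words are split on
    whitespace. This is a guess for illustration, not the toolkit's code.
    """
    start, end = region
    return " ".join(text.split()[start - 1:end])


# csv handles the quoted fields; the two spans are JSON-style lists.
fields = next(csv.reader(io.StringIO(ROW)))
original_text, edited_text = fields[2], fields[3]
first_span, second_span = json.loads(fields[5]), json.loads(fields[6])

print(words_in_region(original_text, first_span))  # span [3,6] on the original text
print(words_in_region(edited_text, first_span))    # span [3,6] on the edited text
print(words_in_region(edited_text, second_span))   # span [3,8] on the edited text
```

If the printed words for the edited text stop short of the intended edit, the two spans are probably in the wrong columns, which is what the suggested swap addresses.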