Plachtaa / VITS-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
Apache License 2.0
4.74k stars, 712 forks

Previous Colab Notebook #29

Open lopezjuanma96 opened 1 year ago

lopezjuanma96 commented 1 year ago

Hello!

Yesterday I tested the Colab notebook that included an interface to record your own voice clips and fine-tune with them, and that did not have the video part. I was starting to build a fork of the repo to repeat the process in Spanish, but it seems the notebook was updated along with these new video features. Is there a chance I can still access that old notebook, to adapt it for my fork?

Of course, as soon as I manage to adapt it, I can provide you with the changes I made and try to adapt them so that you can have an extra language option.

Thanks in advance, and congratulations on such amazing work. I had already contacted you from the Hugging Face repo, but the more I look at your work, the more amazed I am.

Greetings from Argentina, Juanma

PS: btw, my fork of the repo is here. I just started it today, so almost nothing has changed, but you can take a look at the ToDo list and give me your feedback. Otherwise, I will still contact you as soon as I have something solid working in Spanish.

Plachtaa commented 1 year ago

Hi, thanks for your effort on building Spanish support! The new notebook requires less effort in data preprocessing. To record your own voice, simply read anything aloud and Whisper will do the segmentation and annotation automatically. It works for all languages, including Spanish. However, if you prefer to work on the old repository, you can find the old repo here; it should be compatible with the old notebook. Good luck with your research!

lopezjuanma96 commented 1 year ago

Awesome! Yeah, I don't know why it did not occur to me that I could go back some commits to find it. But then again, as you say, I will test the new version out first and go back to the old one if I can't get it to work there.

I will let you know if I find out anything.

lopezjuanma96 commented 1 year ago

Hi again, I've been taking a look at both the old and new repos. I believe I will work with the old one, since it has the voice_collect script that makes it much simpler to visualize and follow. Then, as soon as I make some progress, I'll try to update it to the newest version.

ov1n commented 1 year ago

@Plachtaa thanks for this. After some tireless effort with Tacotron 2 for speaker adaptation, I will also fork this to test the possibility of adding Sinhala language support (Sri Lanka). @lopezjuanma96, can I fork your base repo to write the Sinhala text cleaner functions, because of the language barrier with Japanese?

lopezjuanma96 commented 1 year ago

Hi @ov1n, of course! I've managed to make a bunch of modifications but had to put the project on pause for the week, so I could not finish implementing the final step, which is uploading a proper pretrained Spanish model, but everything else seems to be working. Check out the README and the comments in the code for more info, or contact me directly!

Also, tell me as soon as you get results so that we can share some dos and don'ts.

lopezjuanma96 commented 1 year ago

Hi @Plachtaa! I've got pretty far on training the Spanish model, but I have a few questions that I wanted to ask you. If there's some way we can get in touch on a voice call, that would be awesome. Since we might have some issues with timezone differences, if we can't, I can use some time on the weekend to type a proper comment with all the questions I have and post it here!

Hope everything's great over there. I'll await your answer on the voice call, Juanma.

PS: btw, I'm in timezone GMT-03 here.

Plachtaa commented 1 year ago

Hi @lopezjuanma96, it's good to hear that you have made some progress on training the Spanish model. If you have any questions for me, I'm glad to arrange a voice call with you. I'll be free from 2pm to 11pm this Sunday, which will be 3am to 12pm on Saturday in your timezone. If this is OK with you, we can proceed to prepare for the voice call.

lopezjuanma96 commented 1 year ago

That's awesome. If I understand correctly, we could connect on my Sunday morning, which would be your Sunday evening, right? What about Sunday between 9 and 10 am my time (I believe that's about 8-9 pm for you)? If that's okay with you, I'll create the Calendar event right away.

Plachtaa commented 1 year ago

Oh sorry, it should be Sunday morning for you. Sure, let's settle it this way for now.

lopezjuanma96 commented 1 year ago

Hi! Just read this! I'm up if you still have time. This is my Discord tag:

zagador123#2472

If not, we can schedule it for later, no problem.

Plachtaa commented 1 year ago

Sent you a friend request. Mine is Welkin#6255