152334H / tortoise-tts-fast

Fast TorToiSe inference (5x or your money back!)
GNU Affero General Public License v3.0

FileNotFoundError: [WinError 3] The system cannot find the path specified: 'tortoise/voices' #81

Open FurkanGozukara opened 1 year ago

FurkanGozukara commented 1 year ago

Hello. I am doing training with DL-Art-School

I have got these checkpoints

.pth

I want to use them with the Streamlit web UI.

How can I use them?

When I start the web UI, I get the error shown in the issue title, even though the folder is there.

I am using Windows 10.

Ryu1845 commented 1 year ago

Looks like you're in the tortoise-tts-fast\scripts folder, you should be in the tortoise-tts-fast folder
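
The error comes from a relative path: 'tortoise/voices' is resolved against the current working directory, so it only exists when the app is started from the repository root. As a minimal sketch (the helper name `find_repo_root` is hypothetical and not part of tortoise-tts-fast), the following walks upward from the current directory to find the folder that actually contains `tortoise/voices`:

```python
from pathlib import Path


def find_repo_root(start, marker="tortoise/voices"):
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper for illustration; returns None if nothing is found.
    """
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None


if __name__ == "__main__":
    root = find_repo_root(Path.cwd())
    if root is None:
        print("tortoise/voices not found at or above", Path.cwd())
    else:
        print("Start the web UI from:", root)
```

If this prints a directory other than the one you are in, change to that directory before launching the web UI.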

FurkanGozukara commented 1 year ago

Thanks, it started.

Now, how do I use my custom-trained voice?

The file name is 1480_gpt.pth.

Also, how can I split the text on a delimiter, such as ";", instead of by character length?

Ryu1845 commented 1 year ago

Have you tried selecting it in the "Select GPT Checkpoint" drop-down (the one in your screenshot)?

FurkanGozukara commented 1 year ago

Thank you so much, it worked, but I have a few more questions.

Here are 2 examples it generated. It took a very long time on an RTX 3090, longer than the CLI command, I think.

I am asking these questions because I hope to prepare a video tutorial for my channel: https://www.youtube.com/SECourses

1 : To get consistent results across multiple generations, such as when reading an entire document, do we only need a fixed seed, or other settings as well?

2 : Can we save the advanced settings?

3 : What is "Select Diffusion Checkpoint" used for?

4 : How can we provide delimited text, for example with ";", so that it generates an audio file for each part?

5 : For the voice, do we choose "custom", create a custom folder, and put our training voice there?

6 : What does "conditioning free" do? Does it improve synthesis quality, speed, or something else?

7 : What is "Latent averaging mode"? Which one should we use?

8 : When we change the preset, the steps and other options in the advanced tab stay the same. Is that expected?

https://user-images.githubusercontent.com/19240467/235678360-565d0e48-4b64-40ca-9bb3-937e9f5d3ae5.mp4

https://user-images.githubusercontent.com/19240467/235678366-e7e19563-5dbc-42ad-aa3c-0ed095e24ac9.mp4

Ryu1845 commented 1 year ago

TL;DR: This repository is pretty much dead, so don't expect improvements, but:

  1. I don't know
  2. It's not implemented
  3. Just as you fine-tuned the GPT checkpoint, you can fine-tune the diffusion part
  4. I don't think it's supported in the webui, at least I didn't implement it.
  5. I think you can, yes.
  6. It improves speed IIRC (this should be in the README)
  7. Probably the default one
  8. Not really, I think it should work though.
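
Regarding point 4: since delimiter-based splitting is reportedly not available in the web UI, one workaround is to split the text yourself and feed each piece to the generator separately. The following is only a sketch of the splitting step; the function name `split_on_delimiter` is hypothetical, and the call that turns each chunk into audio is deliberately left out because it depends on how you invoke the model.

```python
def split_on_delimiter(text, delimiter=";"):
    """Split `text` on `delimiter`, trimming whitespace and dropping empty parts."""
    return [part.strip() for part in text.split(delimiter) if part.strip()]


if __name__ == "__main__":
    sample = "Hello there; how are you? ;; Fine."
    for index, chunk in enumerate(split_on_delimiter(sample)):
        # Each chunk would be passed to the TTS call of your choice here.
        print(index, chunk)
```

Empty pieces (for example from a doubled ";;") are skipped so that no empty audio clip is requested.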

FurkanGozukara commented 1 year ago

Thanks for the replies.

Do you know of any other project that can train a voice or produce a very high-quality, consistent pre-trained voice?

Also, is diffusion training the same as GPT training? I mean, do we train it for likeness to our voice, or is it used for something else?

Ryu1845 commented 1 year ago

https://git.ecker.tech/mrq/ai-voice-cloning/wiki

FurkanGozukara commented 1 year ago

Amazingly detailed explanations, thanks. I will check it out.