152334H / tortoise-tts-fast

Fast TorToiSe inference (5x or your money back!)
GNU Affero General Public License v3.0

read.py in Streamlit #16

Status: Closed (rbychn closed 1 year ago)

rbychn commented 1 year ago

I know this is very new, but it would be great to have support for long-form text input from a file in the UI.

152334H commented 1 year ago

Tentatively done, although I'm not sure if it's working perfectly

rbychn commented 1 year ago

Tried it but something broke

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\tts-fast\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "app.py", line 13, in <module>
    from scripts.inference import run_and_save_tts, infer_on_texts, split_and_recombine_text
  File "C:\Users\Administrator\Desktop\tortoise-tts-fast\scripts\inference.py", line 13, in <module>
    def parse_voice_str(voice_str: str, all_voices: list[str]):
TypeError: 'type' object is not subscriptable
2023-02-15 15:23:33.478 Uncaught app exception
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\tts-fast\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "C:\Users\Administrator\Desktop\tortoise-tts-fast\app.py", line 13, in <module>
    from scripts.inference import run_and_save_tts, infer_on_texts, split_and_recombine_text
  File "C:\Users\Administrator\Desktop\tortoise-tts-fast\scripts\inference.py", line 13, in <module>
    def parse_voice_str(voice_str: str, all_voices: list[str]):
TypeError: 'type' object is not subscriptable

Commit 4b03ddafbf5976e5bfe82e0c4fe11e50ea1bb95d works fine.

152334H commented 1 year ago

@rbychn looks like a Python 3.8 incompatibility bug. I'll edit it to be backwards compatible.
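For context: subscripting the built-in `list` in an annotation (`list[str]`) only works at runtime on Python 3.9 and later (PEP 585). On 3.8 the `def` line itself raises the `TypeError` shown in the traceback. Below is a minimal sketch of a 3.8-compatible signature, assuming `typing.List` is used. The function body is a hypothetical placeholder for illustration, not the repository's actual logic, and the linked commit may have fixed this differently.

```python
from typing import List


def parse_voice_str(voice_str: str, all_voices: List[str]) -> List[str]:
    """Signature taken from the traceback, with list[str] swapped for typing.List.

    The body is an assumption made for illustration only: it splits a
    comma-separated voice string and keeps the names found in all_voices.
    """
    requested = [v.strip() for v in voice_str.split(",") if v.strip()]
    return [v for v in requested if v in all_voices]
```

Another option would be adding `from __future__ import annotations` at the top of the module, since annotations are then not evaluated when the function is defined.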

152334H commented 1 year ago

should be good https://github.com/152334H/tortoise-tts-fast/commit/42fe751a67367d6c2e3aff22b453eab251f15ef8

rbychn commented 1 year ago

Thank you, pulled and tried it. It seems to work, but this time only 1 output was shown on the webpage, while the CLI shows it generated 3 (as selected in advanced settings). There's also a warning on the page that it'll only choose 1 candidate, so it's a bit confusing. Not sure what's up with that; it did take a lot of time, though.

152334H commented 1 year ago

Todo:

rbychn commented 1 year ago

⚠️candidates != 1 while splitting text; only choosing the first candidate for each text fragment!

The audio generated after this repeats a certain piece of text twice.

Example input: tortoise tts fast. is cool
Output: tortoise tts fast fast is cool
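To make the warning easier to reason about, here is a rough sketch of the behaviour it describes: long text is split into fragments, several candidates are generated per fragment, and only the first candidate of each fragment is kept. All names here (`split_text`, `synthesize_long_text`, the `generate` callable, `max_len`) are hypothetical; the repository's own `split_and_recombine_text` and inference code may work differently.

```python
import re
from typing import Callable, List


def split_text(text: str, max_len: int = 200) -> List[str]:
    """Split text at sentence boundaries, packing sentences into chunks of
    at most max_len characters (a single longer sentence stays whole)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    for sentence in sentences:
        if chunks and len(chunks[-1]) + 1 + len(sentence) <= max_len:
            chunks[-1] = chunks[-1] + " " + sentence
        else:
            chunks.append(sentence)
    return chunks


def synthesize_long_text(
    text: str,
    generate: Callable[[str, int], List[str]],  # hypothetical TTS call: (chunk, k) -> k clips
    candidates: int = 3,
    max_len: int = 200,
) -> List[str]:
    """Generate `candidates` clips per fragment but keep only the first one,
    mirroring what the warning message says happens when text is split."""
    outputs = []
    for chunk in split_text(text, max_len):
        clips = generate(chunk, candidates)
        outputs.append(clips[0])  # the remaining candidates are discarded
    return outputs
```

Under these assumptions, asking for 3 candidates triples the generation work for each fragment, while the combined result still contains a single clip per fragment.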

~fat-fingered and closed, sorry~

152334H commented 1 year ago

could you give an example string and seed for me to debug with

rbychn commented 1 year ago

> could you give an example string and seed for me to debug with

Example String: "The expressiveness of autoregressive transformers is literally nuts! I absolutely adore them. Master the foundations of data science and spin up awesome data science projects anytime, anywhere! Expert-led tutorials. Cloud-hosted notebooks. Take your work to the cloud today."

Seed probably doesn't matter because I've checked multiple voices and it does this for all similarly sized inputs.

Voice: Freeman
Preset: High quality
Advanced: Low VRAM (unchecked), Sampler: P

152334H commented 1 year ago

> only 1 output was shown on the webpage, while the CLI shows it generated 3 (as selected in advanced settings). There's also a warning on the page that it'll only choose 1 candidate, so it's a bit confusing. Not sure what's up with that; it did take a lot of time, though.

I have changed the functionality such that it displays all 3 samples generated.

> Seed probably doesn't matter because I've checked multiple voices and it does this for all similarly sized inputs.

It matters a lot on my end. Some seeds cause the problem, others don't. Do you know if this happens with the mrq fork? It might have to do with the duration of each chunk.

rbychn commented 1 year ago

> I have changed the functionality such that it displays all 3 samples generated.

Thank you!

> It matters a lot on my end. Some seeds cause the problem, others don't. Do you know if this happens with the mrq fork? It might have to do with the duration of each chunk.

Will keep that in mind and try to add the seed value, but hopefully it's sorted now, so there's no need. Will test it!

And no, I've not come across this issue with that fork so far.

152334H commented 1 year ago

regarding everything that was still unaddressed here: