-
Hey there! I am new to TTS models, so sorry if my question is naive...
I created a simple HTTP server that receives text as input and returns the generated voice.
My HTTP server calls the `Tortoi…
-
I've been using faster-whisper-server via Docker for weeks with no issues with my transcription script on Ubuntu, but suddenly the server is just broken.
I get this error whenever I try to transcr…
-
We are trying to use whisperX to align text. We already know the script and just want the start/end times.
It should be easy, but `whisperx.align` expects the start/end as well as the text as input.…
-
I was processing a 700-page textbook when I hit an error at 96% completion stating the following:
`RuntimeError: Possible latent mismatch: try recomputing voice latents. Error: Too much text provid…
-
Even when I provide a 54-second audio file, I only get a one-line translation, and the data loss is huge. What is the workaround? I even tried changing the MAX_INPUT_AUDIO_LENGT…
-
Can you please fix it?
https://huggingface.co/spaces/haoheliu/audioldm-text-to-audio-generation
-
I got this one from the tutorials. The text there is:
> You can access certain features in Soundscape with the help of the media control buttons on your headphones. This functionality works with an…
-
The media flow definition seems to differ between the text and the JSON. Is it just one essence or several? The text says "_A sequence of Media Elements belonging to the same media essence flow …
-
Hi! I am from HK and just started learning about Cantonese TTS. My first goal is to train it on 林尚義's voice. I am starting with this [repo](https://github.com/hon9kon9ize/Bert-VITS2-Cantonese) as sugg…
-
While using LabelStudio, I found that there is no way to create an audio dataset from voice recordings.
I've got a number of utterances (texts) and want to create the dataset with different voices. But…