-
### Description
As far as I understand, this should take an Audio or a ScriptLine as input and output a Language object.
### Tasks
- [ ] Select default model for either audio or text
- [ ] Get working wi…
-
### Describe the bug
I am doing emotion classification on waveforms using the SpeechBrain IEMOCAP Hugging Face interface. The code runs without errors, but it also creates a lot of soft links …
-
I'm trying to install the Core ML support on a MacBook Pro M3.
Running:
```
./models/generate-coreml-model.sh base.en
Torch version 2.3.1 has not been tested with coremltools. You may run into unexpe…
-
I'm trying to use STT, but it only works some of the time.
Please tell me what I need to do to fix that.
```
Traceback (most recent call last):
File "", line 1, in
File "/usr/lib/pytho…
-
Build a transcription integration flow that takes an .mp3, sends it to Whisper (which I need to deploy), and returns the text from the audio (perhaps using a Spring Integration gateway?).
-
*Describe the bug*
Certain TTS voices are not providing speechmarks with viseme timings. For example, all the Urdu Azure TTS voices provide word timings but do not provide viseme timings which is w…
-
Find a library that supports text to audio, or audio to text.
-
### Describe the bug
I was testing the Bengali voice model, and it missed the pronunciation of Bengali numbers. The Bengali digits and their Western equivalents:
০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯
0 1 2 3 4 5 6 7 8 9.
১৯৫৪ সাল। কালো রাত। Here is su…
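
For anyone reproducing this, the digit mapping listed above can be sketched in Python as a preprocessing step. This is only an illustration: `normalize_bengali_digits` is a hypothetical helper, not part of the voice model or any library, and whether the model pronounces Western digits any better is an untested assumption.

```python
# Translation table built from the digit pairs listed in the report.
BENGALI_TO_WESTERN = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")


def normalize_bengali_digits(text: str) -> str:
    """Hypothetical helper: replace Bengali digits with Western digits.

    All other characters are left untouched.
    """
    return text.translate(BENGALI_TO_WESTERN)


if __name__ == "__main__":
    # Sample sentence taken from the report.
    sample = "১৯৫৪ সাল। কালো রাত।"
    print(normalize_bengali_digits(sample))
```

Comparing the synthesized audio for the original and the normalized text might help narrow down whether the problem is in digit handling or elsewhere in the text front end.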
-
### Details
_No response_
### Branch
_No response_
-
The voice cloning and prosody cloning are amazing, but I want to clone the prosody while synthesizing speech in another language. I'm not having any luck so far. Any help?
I noticed the models only accept…