-
Thank you very much for this wonderful program. It is highly accurate and is helping me in many ways :)
But unfortunately Speech Note keeps inserting words that I didn't speak ra…
-
Currently, there is no functional way to change the Text-to-Speech (TTS) language in our application. While the system is intended to support French ("fr") as a language option, this setting is not be…
-
Hello, I'm new to speech recognition, vosx and Python, but I want to translate speech from a simple video I downloaded from the internet (and later even TTS it into my language, or even speech to speech…
-
# RFW0097: *Improve the scaling of AI models API*
## Named Concepts
API (Application Programming Interface): a set of rules and protocols that defines how two software systems can communicate w…
-
I am using WhisperX v2.0.1 with the option "detect language" (by omitting the "--language" option from the command). I use the "detect language" option for a video in which two languages are spoken, E…
-
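1. Is it only possible to fine-tune the LLM? When I use the model I can only chat in text and cannot send speech data. Does that mean only the LLM is available, with no speech encoder or speech adaptor? Is the stage 1 fine-tuning described in the paper available at this point? Please let me know, thanks.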
![image](https://github.com/user-attachments/assets/80829c0d-cf0a-4cf1-b2a5-a087f1037f6f)
-
I tested some videos.
If the silence duration is long, enabling vad_filter is effective,
but for a normal video, enabling vad_filter may cause more timestamp mismatches.
Is there …
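
For anyone trying to reason about the mismatch: here is a minimal sketch, assuming the VAD filter drops silent spans and the transcriber then sees only the concatenated speech. Under that assumption, every timestamp has to be mapped back onto the original timeline. The function names and the segment values are hypothetical, for illustration only. This is not the library's actual code.

```python
from bisect import bisect_right


def build_offsets(speech_segments):
    """For each kept (start, end) segment, record where it begins
    on the filtered timeline and on the original timeline."""
    filtered_starts = []
    original_starts = []
    elapsed = 0.0  # total speech kept so far, in seconds
    for start, end in speech_segments:
        filtered_starts.append(elapsed)
        original_starts.append(start)
        elapsed += end - start
    return filtered_starts, original_starts


def to_original_time(t, speech_segments):
    """Map a timestamp on the silence-free timeline back to the original audio."""
    filtered_starts, original_starts = build_offsets(speech_segments)
    # Find the last kept segment that begins at or before t.
    i = max(bisect_right(filtered_starts, t) - 1, 0)
    return original_starts[i] + (t - filtered_starts[i])


# Hypothetical speech spans detected in the original audio (seconds).
segments = [(0.0, 2.0), (10.0, 13.0), (20.0, 21.0)]

print(to_original_time(0.5, segments))  # inside the first span
print(to_original_time(2.5, segments))  # inside the second span
```

Note that a timestamp sitting exactly on a cut (2.0 in this example) could belong to the end of one span or the start of the next. This sketch assigns it to the next span, so a small error in the detected boundaries can shift a timestamp by the whole length of the removed silence.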
-
### Specs
- Leon version: 1.0.0-beta.9+dev
- OS (or browser) version: Ubuntu 22.04
- Node.js version: v18.16.0
- Complete "leon check" (or "npm run check") output:
xu@xu-ThinkPad-Edge-E431:/…
-
**Datamodels needed:**
OpenAI
ElevenLabs Text to Speech
VLM - visual language model (OpenAI GPT-4V)
Whisper Speech to Text
Basis for bot behavior
OpenAI GPT-4 phenomenological problem interviewer prom…
-
## Text To Speech Preprocessing
- [ParsiNorm](https://github.com/haraai/ParsiNorm) - Persian Text Pre-Processing Tool
- [Persian Tools](https://github.com/persian-tools/py-persian-tools) - An anthol…