audio-text Search Results

gpt-omni/mini-omni #72

What are the advantages of audio-to-audio compared to text-t…

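Mini-Omni offers a great idea: combining the LLM with TTS. Compared with waiting for the LLM's streamed output and then passing it to TTS for synthesis, this should in theory reduce latency significantly. On the input side, though, compared with calling ASR to get text and then feeding that text to the model, does feeding the encoded speech directly into the model offer any advantage in quality or latency? I ask mainly because, in human-machine dialogue, if we want to reduce response latency, how to optimize VAD is a major difficulty, such as…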

beetlebum233 updated 4 days ago

Mintplex-Labs/anything-llm #2347

[FEAT]: Audio to Text - Whisper / LocalAi

### What would you like to see? Hello everyone, First of all, thank you for this superb project. Would it be possible to use LocalAI for Whisper? Currently the model is Xenova Whisper which uses th…

czerr updated 6 days ago

thewh1teagle/vibe #288

[Bug]: Unable to play audio to text

### What happened? When trying to convert an audio recording into text, the process closes and stops completely; it is not in the task manager. All the details are in the video. ### Steps to reprod…

Watereks updated 6 days ago

open-mmlab/FoleyCrafter #11

Text to audio generator

I wonder what the pretrained text-to-audio generator used in FoleyCrafter is? Thanks for answering!

Ceaglex updated 1 month ago

AceCentre/SAPI-POC #2

Not working..

What does work: the code in root. What doesn't work: the code in VoiceServer. See engine.cpp, which is doing some neat little things to call our Python file directly in voices/. We don't need to …

willwade updated 2 weeks ago

haoheliu/AudioLDM #101

Text-guided Audio-to-Audio Style Transfer

If I just want to apply Text-guided Audio-to-Audio Style Transfer for long text, would it be feasible to transition seamlessly from one audio to another as the prompt changes?

PHOENIXFURY007 updated 1 month ago

ChetanXpro/nodejs-whisper #114

Too much is being logged

Verbose flag is set to `false`, yet too much is logged, like that: ``` [dev:server] [dev:server] stderr--- whisper_init_from_file_with_params_no_state: loading model from './models/ggml-base.bin…

binarykitchen updated 7 hours ago

vkohaupt/vokoscreenNG #322

Pixelation on the text for audio devices

**Describe the bug** On version 4.2.0, the text for audio devices is pixelated. In 4.2.0 beta, the text was fine. **Screenshots** ![pixelation](https://github.com/vkohaupt/vokoscreenNG/assets/9…

vivadavid updated 11 hours ago

Azure-Samples/cognitive-services-speech-sdk #2542

Text cached without audio chunks return when doing text stre…

**IN ORDER TO ASSIST YOU, PLEASE PROVIDE THE FOLLOWING:** - Speech SDK log taken from a run that exhibits the reported issue. [azure_speeck_sdk.zip](https://github.com/user-attachments/files/1662…

steven8274 updated 3 weeks ago

ASUCICREPO/waterbot #7

Audio to text

Provide the user the ability to click an icon, talk, and have the user's voice interpreted as text - [x] Create small use case example - [ ] Update IAM permissions to allow Transcribe access - [ ] In…

dhe-cr updated 3 months ago

1000+ results for audio-text
