-
# Instruments
We have compared 3 easy-to-use **off-the-shelf instruments for voice activity / audio activity detection**:
- Silero VAD - https://github.com/snakers4/silero-vad;
- A po…
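For a sense of what such detectors produce, here is a toy energy-threshold VAD baseline, a minimal sketch and deliberately much cruder than the off-the-shelf tools listed above (the function name and threshold are illustrative, not from any of those libraries):

```python
def energy_vad(samples, frame_len=160, threshold=0.01):
    """Toy VAD baseline: flag each non-overlapping frame as speech (True)
    when its mean energy exceeds a fixed threshold.

    samples: mono audio as a list of floats in [-1.0, 1.0]
    frame_len: samples per frame (160 = 10 ms at 16 kHz)
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        flags.append(energy > threshold)
    return flags
```

Real tools like Silero VAD replace the fixed energy threshold with a learned model, which is why they hold up under noise where a sketch like this fails.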
-
Hi, I noticed there are two modes: Audio for 48kHz and Speech for 16kHz. Would the score be accurate if both the reference and degraded samples were at an 8kHz sample rate in speech mode?
I receive…
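A common workaround when samples are below a mode's native rate is to upsample both reference and degraded signals before scoring. A minimal linear-interpolation sketch (a hypothetical helper, not part of any scoring tool; production code would use a proper polyphase resampler):

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono signal (list of floats) by linear interpolation.
    Toy sketch: no anti-aliasing filter, fine only for illustration."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        # Map output index i to a fractional position in the input.
        pos = i * (len(samples) - 1) / (n_out - 1) if n_out > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

For example, `resample_linear(signal, 8000, 16000)` doubles the sample count, so an 8 kHz pair can at least be fed to a 16 kHz speech mode; whether the resulting score is still meaningful for 8 kHz content is a separate question.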
-
### 🚀 The feature
I'm wondering if there are any researchers out there that can search an audio stream like an mp3 and determine whether or not the track is purely spoken word versus a song or musi…
-
Hello,
Thanks for your interesting work. I wanted to check whether the pre-trained checkpoints are available.
-
Hi,
first, thanks for this implementation of WaveNet!
I'm interested in performing feature extraction from raw audio files. These features will be used for different tasks such as voice activity de…
-
> From 2023/10/11 meeting https://g0v.hackmd.io/t9ypB87SQBuMjjW_PheZVg#Comm-AI-transcript
The current implementation for speech-to-text (based on Whisper API) suffers from hallucination problems. S…
-
**Name of the feature**
*In general, the feature you want added should be supported by HuggingFace's [transformers](https://github.com/huggingface/transformers) library:*
- *If requesting a **model…
-
Depending on how hackable the ncurses interface is, would it be possible to have actual voice chat support?
-
Hi,
I am currently trying to implement the speech-recorder Voice Activity Detection in my Electron app on my M1 Mac, and I am facing the following issue:
`Error: dlopen(/myElectronPath/node_modules…
-
Mini-Omni offers a great idea: coupling an LLM with TTS. Compared with waiting for the LLM's streamed output and then passing the text to TTS for synthesis, this should in theory significantly reduce latency.
But on the input side, what advantage, in quality or latency, does encoding the speech and feeding it directly into the model have over running ASR first and feeding the resulting text to the model?
I raise this mainly because, in human-machine dialogue, if we want to reduce response latency, how to optimize the VAD side is a major difficulty, such as…
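One concrete VAD-side latency knob is the endpointing hangover: how many consecutive silent frames must pass before the system decides the user has finished speaking. Shortening it cuts response delay but risks cutting the speaker off mid-sentence. A toy sketch of hangover-based endpoint detection (a hypothetical helper, not from Mini-Omni):

```python
def detect_endpoint(speech_flags, hangover=3):
    """Given per-frame speech flags (True = speech), return the index of
    the first silent frame that starts a run of `hangover` consecutive
    silent frames after speech has begun. Returns None if no endpoint."""
    started = False
    silence_run = 0
    for i, is_speech in enumerate(speech_flags):
        if is_speech:
            started = True
            silence_run = 0
        elif started:
            silence_run += 1
            if silence_run == hangover:
                # Endpoint is where the silent run began.
                return i - hangover + 1
    return None
```

With `hangover=3` and 10 ms frames, the system waits only 30 ms of silence before responding; tuning this value against false endpointing is exactly the difficulty raised above.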