-
Over the past 1-2 weeks, many of my users have reported a decrease in speech detection quality, and I am struggling to understand what could be causing it. I noticed an increase in "SpeechNotRecognized" even…
-
ESH has a basic tagging implementation, which allows adding tags (as simple strings) to items.
The idea behind this is to assign semantics to items. So while the "category" refers to a taxonomy (e.g. th…
-
-
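When I try to run emotion recognition with prompt='{audio_url}', the result is:
assets/audio/1.wav Mandarin, female voice, 31 years old The weather is really nice today
As you can see, the result above contains the language (Mandarin), gender, age, and transcript, but no emotion. How should I write the prompt to get the output of a single task, or of only the tasks I want?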
-
ICU's BreakIterator has clear limitations in its approach for character-based languages without textual word boundaries. When used directly, it allows you to specify a dictionary to work around limita…
-
### Tested versions
| Library | Version |
|:----------------|:-------:|
| Python | 3.12.2 |
| Pyannote.audio | 3.1.1 |
| Pyannote.core | 5.0.0 |
### Sys…
-
I've isolated a bottleneck from our production environment and here's a nifty self-contained benchmark for it: https://gist.github.com/tmcw/1a4e8ee47941454337dc5952dbf90180 (swap require('./') for req…
-
### What happened?
Hi Team,
I'm using the JS SDK to capture speech with SpeechSDK.AudioConfig.fromDefaultMicrophoneInput. If a Teams/Zoom call is in progress in the desktop app, Teams/Zoom ca…
-
This enhancement will be particularly beneficial for transcribing meetings, interviews, gaming sessions, and podcasts involving multiple speakers, enabling users to distinguish who is speaking at an…
-
Dataloader name: `kheng_info_speech/kheng_info_speech.py`
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?kheng_info_speech
| Dataset| kheng_info_speech |
|-------------|---…