-
Currently in my application we use two recorders: the `speech-recorder` and the browser's recorder. The browser recorder works fine, but lacks the excellent VAD available in the `speech-recorder`. By s…
-
Hi,
I’m currently using RealtimeSTT with the following configuration:
```
recorder_config = {
'spinner': False,
'model': 'large-v2',
'language': 'en',
'silero_sensitivity': …
-
1:
Some audio files appear to have been cut off. For example: `accept/64926.ogg` in the `tetiana` dataset (original text is "Уве́чері при ля́мпі ми сиді́ли в кімна́ті вчи́теля і розмовля́ли.").
…
-
## ❓ Questions and Help
I found that when using silero-vad for voice activity detection in vocal songs, it misses most of the high-pitched parts. I'm wondering if this is related to the project's t…
-
Hi All,
How are you?
We would like to adapt the VAD model to a new domain / case which is not handled in the current version.
Is it possible to fine-tune the current VAD? If not, can you add a tunea…
-
Find out what the expected error rate is, find out what's possible.
https://github.com/kadirnar/whisper-plus
Papers
https://arxiv.org/pdf/2212.04356
WER:
https://pubs.aip.org/asa/jel/articl…
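To make "expected error rate" concrete, here is a minimal sketch of how word error rate (WER) is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the plain whitespace tokenization are assumptions for illustration; published WER numbers usually apply extra text normalization first, which this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, so it is not a percentage bounded by 100.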
-
When I run
`model.diarize_list("wav.scp")`
I get this error after 2-3 files have been processed:
`ValueError: need at least one array to stack`
However, I am able to diarize files individually. I can't dia…
-
Hi!
Many thanks for the brilliant work!
When executing demo_part1.ipynb step 4:
```
reference_speaker = 'resources/example_reference.mp3'
target_se, audio_name = se_extractor.get_se(reference_sp…
-
Instead of pressing a key, continuously listen till the wake word is announced (i.e., "Hey Ross")
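
A rough sketch of the listening loop, assuming the speech-to-text side already yields one transcript string per audio chunk. The `transcripts` iterable, the helper names, and the default phrase handling are all hypothetical; no particular STT or wake-word library API is implied here.

```python
import re
from typing import Iterable, Optional


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", text.lower()).split())


def wait_for_wake_word(
    transcripts: Iterable[str], wake_phrase: str = "hey ross"
) -> Optional[int]:
    """Consume transcripts until one contains the wake phrase.

    Returns the index of the first matching chunk, or None if the
    stream ends without the phrase being heard.
    """
    target = f" {normalize(wake_phrase)} "
    for index, text in enumerate(transcripts):
        # Pad with spaces so the phrase only matches on whole-word boundaries.
        if target in f" {normalize(text)} ":
            return index
    return None


if __name__ == "__main__":
    chunks = ["what time is it", "Hey, Ross!", "play some music"]
    print(wait_for_wake_word(chunks))
```

In a real application `transcripts` would be a generator fed by the microphone, so the loop blocks until the phrase is heard instead of waiting for a key press.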