-
I am running WhisperX with the large-v3 model.
**When an audio file is given, the transcription output ignores the last 7-8 seconds and produces a shorter transcript than the original answer.**
- I evaluated the …
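
One way to quantify a missing tail like this is to compare the end time of the last transcribed segment with the total audio duration. The sketch below assumes the segments are dictionaries with `start` and `end` keys in seconds, which is the shape WhisperX-style results usually take; the function name `untranscribed_tail` and the sample values are hypothetical and are not taken from the issue.

```python
from typing import Dict, List


def untranscribed_tail(segments: List[Dict[str, float]], audio_duration_s: float) -> float:
    """Return how many seconds of audio lie after the last transcribed segment."""
    if not segments:
        # Nothing was transcribed, so the whole file counts as missing.
        return audio_duration_s
    last_end = max(seg["end"] for seg in segments)
    # Clamp at zero in case a segment slightly overshoots the file length.
    return max(0.0, audio_duration_s - last_end)


# Hypothetical values for illustration only (not from the issue).
segments = [{"start": 0.0, "end": 12.5}, {"start": 12.5, "end": 22.0}]
tail = untranscribed_tail(segments, audio_duration_s=30.0)
print(f"untranscribed tail: {tail:.1f} s")
```

A tail of several seconds on files that contain speech up to the end would support the report; a tail that matches trailing silence would not.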
-
Currently, a single threshold on the variance of the frequency spectrum decides whether a sample is noise or speech. Options to look into:
- Noise gate: http://en.wikipedia.org/wiki/Noise_gate
- Slid…
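
For reference, the single-threshold approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the frame length, the Hann window, the threshold value, and the function names `spectral_variance` and `classify_frames` are all assumptions, and a loud sine tone stands in for speech.

```python
import numpy as np


def spectral_variance(frame: np.ndarray) -> float:
    """Variance of the magnitude spectrum of one Hann-windowed frame."""
    windowed = frame * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    return float(np.var(magnitude))


def classify_frames(signal: np.ndarray, frame_len: int, threshold: float) -> list:
    """Flag each non-overlapping frame as speech (True) or noise (False)."""
    n_frames = len(signal) // frame_len
    flags = []
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        flags.append(spectral_variance(frame) > threshold)
    return flags


# Synthetic test signal: quiet white noise followed by a loud 440 Hz tone.
sample_rate = 16000
rng = np.random.default_rng(0)
noise = 0.01 * rng.standard_normal(1600)
t = np.arange(1600) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
signal = np.concatenate([noise, tone])

# The threshold of 1.0 is a hypothetical value chosen for this synthetic signal.
flags = classify_frames(signal, frame_len=400, threshold=1.0)
print(flags)
```

A fixed threshold like this reacts frame by frame with no memory, which is the limitation that options such as a noise gate are meant to address.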
-
As a Riff Developer, I am not confident that our speech detection is working correctly, based on the code that I've seen. Specifically, I'm concerned that we are not properly detecting actual speech v…
-
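Hi, when I run Whisper inference on my business data, the output often contains very long consecutive runs of a character or word. Is there a good way to solve this?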
-
I've been thinking a lot about this code fragment in
https://github.com/kyungyunlee/ismir2018-revisiting-svd/blob/master/leglaive_lstm/audio_processor.py
in function process_single_audio (Compute d…
-
First of all, thank you, Georgi Gerganov, and everyone who contributed to this project.
I have a progressive neuromuscular disease and can barely use my hands. I bought a new Android mobile to ease…
-
Runtime environment:
OS: Linux
python: 3.8.16
modelscope: 1.9.4
funasr: 0.8.4
GPU: T4, CUDA: 11.6
Code:
```
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
from modelscope.…
```
-
Is there any way to get the timestamp of each word using the wav2letter architecture? If so, how should we do it?
-
When asking a group of people to produce clear speech, it's typical to observe wide variations in how slow and 'clear' it is. Consultants often drift towards conversational speech rates to such a de…
-
Here I will post our benchmarks comparing these three instruments