-
Thank you for sharing your repo.
My question is the same as the one above:
does this work with unseen speech?
-
Thanks for sharing this dataset!
I plan to train and evaluate [pyannote](https://github.com/pyannote/pyannote-audio) speaker diarization pipelines on AISHELL-4.
1. I'd like to understand the sp…
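Since the question is about evaluating diarization pipelines: the standard metric is the diarization error rate (DER). As a rough, frame-level sketch only (real tooling such as pyannote.metrics additionally applies a forgiveness collar and an optimal one-to-one mapping between reference and hypothesis speaker labels, which this toy version omits):

```python
def frame_der(reference, hypothesis):
    """Toy frame-level diarization error rate.

    reference / hypothesis: equal-length lists of speaker labels
    per frame, with None meaning non-speech. Returns
    (missed + false alarm + confusion) / total reference speech.
    NOTE: assumes hypothesis labels are already mapped to
    reference labels; production metrics find the optimal mapping.
    """
    assert len(reference) == len(hypothesis)
    speech = sum(1 for r in reference if r is not None)
    missed = sum(1 for r, h in zip(reference, hypothesis)
                 if r is not None and h is None)
    false_alarm = sum(1 for r, h in zip(reference, hypothesis)
                      if r is None and h is not None)
    confusion = sum(1 for r, h in zip(reference, hypothesis)
                    if r is not None and h is not None and r != h)
    return (missed + false_alarm + confusion) / speech
```

For AISHELL-4-style overlapping meetings, the per-file DERs would then be aggregated over the whole test set.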
-
I want to develop an i-vector based speaker verification system, but I don't know the exact meaning of the parameter `backgroundNdxFilename` in IvTest. Would you please explain the meaning of this …
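I can't speak for this particular toolkit, but in NIST SRE-style i-vector systems an "ndx" file is usually a plain-text trial index: each line pairs a model (enrollment) id with a test-segment id, and a *background* ndx typically lists the segments or trials used for score normalization (e.g. t-norm/z-norm) or UBM-related statistics. A minimal parser for such a whitespace-separated trial list (the exact format is an assumption and varies between toolkits) might look like:

```python
def read_ndx(path):
    """Read a whitespace-separated trial index file.

    Assumes each non-empty line holds at least a model id and a
    test-segment id; extra columns are ignored. This is a sketch
    of the common NIST SRE-style layout, not this toolkit's spec.
    """
    trials = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                trials.append((parts[0], parts[1]))
    return trials
```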
-
I cannot find an explanation of the TDNN in that paper, and I do not understand what the cmvn function does or why it is needed. Could you explain this for me, please?
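For the second part of the question: "cmvn" almost certainly stands for cepstral mean and variance normalization, a standard preprocessing step that removes per-utterance channel and convolutional effects from features such as MFCCs. A minimal sketch (assuming per-utterance normalization over a frames-by-coefficients array; some recipes normalize the mean only, or use a sliding window):

```python
import numpy as np

def cmvn(features, eps=1e-10):
    """Cepstral mean and variance normalization.

    features: (num_frames, num_coeffs) array, e.g. MFCCs.
    Subtracts the per-coefficient mean and divides by the
    per-coefficient standard deviation over the utterance, so
    each coefficient ends up roughly zero-mean, unit-variance.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)
```

Constant channel distortions add a fixed offset to cepstral features, so subtracting the utterance mean cancels them; variance normalization further stabilizes the input scale for the downstream model.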
-
Code:
```
import whisperx
import gc

device = "cuda"
audio_file = r"out.wav"
batch_size = 16  # reduce if low on GPU mem
compute_type = "float16"  # change to "int8" if low on GPU mem (may r…
```
-
What do the VoxCeleb2 header fields mean?
```
Offset : -2
FV Conf : 16.303 (1)
ASD Conf : 6.201
```
-
# 16 kHz, 2 kbps
## parameter size:
encoder (including quantizer): 29 MB
decoder: 40 MB
### exps/results.txt
Codec SUPERB application evaluation
Stage 1: Run speech emotion recognition.
Acc: 74.…
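For context on the "16 kHz, 2 kbps" heading: assuming the raw input is standard 16-bit mono PCM, 16 kHz audio is 256 kbps uncompressed, so a 2 kbps codec corresponds to a 128x compression ratio. A quick check:

```python
sample_rate = 16_000        # Hz
bits_per_sample = 16        # assumed standard PCM bit depth
raw_kbps = sample_rate * bits_per_sample / 1000  # uncompressed bitrate
codec_kbps = 2
compression_ratio = raw_kbps / codec_kbps
print(raw_kbps, compression_ratio)  # 256.0 128.0
```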
-
I don't have the resources to train the model. Could you please share a trained model along with inference code?
Thank you.
-
Hi,
I tried running your project and was able to execute all the Python files without any errors, but how can I create a model from them? No Model folder is generated inside the current working direc…
-
When I try to access the Swahili stopwords using the feature below, I get a traceback saying that the Swahili stopwords are missing from the documentation. I'm currently working on building a Swahili lan…