-
For anyone in the future who's gonna try this repo, **let me give you an easy way out**. I spent a lot of time on Speaker Recognition and the official docs say something and the samples given do somet…
-
Hi team!
Your project is great because it's fast (real-time!) and the GMMs seem quite flexible. For example, from my reading of the source code, it seems possible to run enrolment and prediction on a…
-
An 8-core machine could plow through diarization much faster if the work were parallelized. What's the biggest complexity stopping us from having it?
-
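Hello, I want to train on four languages, so I used the embedding model from CAM++语种识别-中英粤日韩识别-16k (CAM++ language identification for Mandarin, English, Cantonese, Japanese, and Korean, 16k) to initialize my model. The four languages have 250 h, 250 h, 500 h, and 200 h of training data respectively. After every epoch I run both the eval and test sets. Test accuracy is around 80% for the first one or two epochs, but then it gradually drops to 60-70%, while eval accuracy keeps climbing to over 90%, so I believe this is overfitting.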
…
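The pattern described above (eval accuracy rising while test accuracy peaks early and then falls) is the situation early stopping is meant for. Here is a minimal sketch, assuming per-epoch accuracies are already collected in a list. The function name `best_epoch`, the `patience` parameter, and all the numbers are illustrative assumptions, not taken from the CAM++ training recipe. Ideally the accuracies used for selection come from a separate held-out set that resembles the test conditions, rather than from the test set itself.

```python
from typing import Optional, Sequence


def best_epoch(heldout_acc: Sequence[float], patience: int = 2) -> Optional[int]:
    """Return the 0-based epoch with the best held-out accuracy.

    Scanning stops once `patience` consecutive epochs fail to beat the best
    value seen so far (a simple early-stopping rule).
    """
    best_idx = None
    best_val = float("-inf")
    stale = 0  # epochs since the last improvement
    for epoch, acc in enumerate(heldout_acc):
        if acc > best_val:
            best_idx, best_val, stale = epoch, acc, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_idx


# Illustrative numbers only (not measured): eval keeps rising, test peaks early.
eval_acc = [0.78, 0.85, 0.90, 0.92, 0.93]
test_acc = [0.80, 0.81, 0.74, 0.68, 0.65]

stop_at = best_epoch(test_acc, patience=2)
# A widening gap between the two curves is the symptom described in the post.
gap = [round(e - t, 2) for e, t in zip(eval_acc, test_acc)]
print("checkpoint to keep: epoch", stop_at)
print("eval-test gap per epoch:", gap)
```

A gap that widens this quickly can also mean the eval set is closer to the training data than the test set is, so it may be worth checking how the two sets were drawn before treating it purely as overfitting.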
-
Hi,
I am trying to compute the d_vector for a speaker diarization or speaker verification task using the AM-MobileNet1D model.
I have modified my previous inference script to compute the d_vector of …
-
```
Four WAV files with the following specifications:
Codec: PCM S16 LE (araw)
Channels: Mono
Sample rate: 8000 Hz
Bits per sample: 16
with the help of sox removing silenc…
```
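The file specification listed in this post (mono, 8000 Hz, 16-bit PCM) can be verified before any further processing. A minimal sketch using Python's standard `wave` module follows. The helper name `check_wav` and the demo file are hypothetical, and `wave` is only assumed to handle uncompressed PCM WAV files here.

```python
import os
import tempfile
import wave


def check_wav(path, rate=8000, channels=1, sample_width=2):
    """Return a list of mismatches between the WAV header and the expected format.

    sample_width is in bytes, so 2 corresponds to 16 bits per sample.
    """
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != channels:
            problems.append(f"channels: {wf.getnchannels()} (expected {channels})")
        if wf.getframerate() != rate:
            problems.append(f"sample rate: {wf.getframerate()} Hz (expected {rate})")
        if wf.getsampwidth() != sample_width:
            problems.append(f"sample width: {wf.getsampwidth()} bytes (expected {sample_width})")
    return problems


# Demo on a synthetic file: one second of silence, mono, 8000 Hz, 16-bit.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(8000)
    wf.writeframes(b"\x00\x00" * 8000)

print(check_wav(demo_path))              # matches the spec, so no problems
print(check_wav(demo_path, rate=16000))  # reports a sample-rate mismatch
```

Running the same check again after the silence-removal step is a cheap way to confirm that the processing did not change the sample rate or bit depth.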
-
Have you prepared a write-up for this?
-
Broader impacts
Since VALL-E could synthesize speech that maintains speaker identity, it may carry potential risks in misuse of the model, such as spoofing voice identification or impersonating a s…
-
# Title
Image analysis validation: How can we guarantee that our algorithms perform as intended?
# Description
The importance of automatic image analysis based on artificial intelligence (AI) is …