-
Hi! Why is the default number of frames so low (32, as far as I can see)? The VoxCeleb dataset wasn't preprocessed to drop silent segments, so many parts of the training data are only silence. Acc is g…
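To make the concern about short crops concrete, here is a minimal sketch that estimates how many fixed-length crops of an utterance contain no speech at all. It is not taken from the repository in question: the function names, the 400-sample frame and 160-sample hop (25 ms and 10 ms at an assumed 16 kHz), and the 30 dB energy margin are all assumptions chosen for illustration, and the energy threshold is a crude stand-in for a real voice activity detector.

```python
import math


def frame_energies_db(samples, frame_len=400, hop=160):
    """Return the log energy (dB) of each frame of a mono float signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on digitally silent frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def speech_frame_mask(energies_db, margin_db=30.0):
    """Mark frames within margin_db of the loudest frame as speech (assumed rule)."""
    threshold = max(energies_db) - margin_db
    return [e >= threshold for e in energies_db]


def silent_crop_fraction(mask, crop_frames=32):
    """Fraction of contiguous crop_frames-long windows with no speech frame."""
    n_windows = len(mask) - crop_frames + 1
    if n_windows <= 0:
        return 0.0
    silent = sum(
        1 for i in range(n_windows) if not any(mask[i:i + crop_frames])
    )
    return silent / n_windows
```

Calling `silent_crop_fraction(speech_frame_mask(frame_energies_db(samples)), crop_frames=32)` on an utterance gives a rough idea of how often a random 32-frame crop would be pure silence. Dropping the frames where the mask is false before cropping is one simple way to avoid such crops.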
-
Can I use this code to train on face video data? Basically, I want to build a real-time application that generates smiling and blinking videos from driving videos. @Dov-Gertz @or-…
-
Hello! I would like to use WhisperX and Pyannote to combine automatic transcription and diarization. I can do it on Colab using the Hugging Face (HF) token, but I would like to avoid entering the HF to…
-
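Hello,
Thank you for your contributions to the speaker recognition field. I have been studying this code of yours for almost a week and have learned a great deal. The code and the way you maintain the issues are very friendly to a beginner like me; learning from them is far less of a struggle than grinding through a textbook.
Thank you also for fostering a harmonious, positive atmosphere for discussion in this issues section.
I am currently using your deep speaker code to get VoxCeleb1 scores on a single 1080 GPU. Training has been running for 12 hours, and it is now…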
-
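Hello, this is TA 정주성.
We received an inquiry that the spec document for Homework 3 states an incorrect figure, so I am posting this notice.
The total number of data instances in the given training dataset (`voxceleb-abridged-N.bin`, N=1, 2, …, 6) is not 17,460 as stated in the spec document, but **approximately** 1…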
-
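The amount of data to download from the web is too large, and reproducing the pipeline takes too long. I work in NLP and have recently been getting into speaker (voiceprint) recognition. For a newcomer, the reproduction process is not very friendly and takes some effort. If a small subset of the VoxCeleb data could be provided, the whole pipeline could be reproduced quickly, without having to wait for the full download to finish first.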
-
Hello!
ASVTorch generates 24 MFCCs, so the MFCCs have shape (n, 24). Your input is (200, 30). Where does the 30 come from? Could you please provide some test samples?
-
Hello Daniel,
I tried to run the [SITW/v2](https://github.com/kaldi-asr/kaldi/tree/master/egs/sitw/v2) recipe with the latest VoxCeleb dataset. However, an error is reported when I run run.s…
-
Hi,
I am trying to use the VoxCeleb recipe. However, I am not able to download the VoxCeleb dataset using the provided wget command, as it seems to request a password or authentication. Is the link…
-