Closed ooobsidian closed 1 year ago
Hello,
SEG_FILE=<VAD directory for set>/segments
should point to the segments file produced by a voice activity detection (VAD) system, which you have to run on the data yourself. For example, you can use this one: http://kaldi-asr.org/models/m4
RIRS_SCP=<simulated rirs directory>/simulated_rirs_16k/data/wav.scp
should point to the room impulse responses. You can download them from http://www.openslr.org/resources/26/sim_rir_16k.zip
NOISES_SCP=<MUSAN directory>/data/musan_noise_bg/wav.scp
should point to the background noises from MUSAN which you can get from https://github.com/hitachi-speech/EEND/blob/master/egs/mini_librispeech/v1/musan_bgnoise.tar.gz
RTTMS_FILE=<A single rttm file with all segments of DIHARD 3 dev full cts>
should point to a single file with the ground-truth RTTM segments from some dataset. We used DIHARD 3 dev full cts, but you can use another set if you do not have that data.
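As a rough illustration of the four variables described above, here is a minimal sketch of how they might be filled in. The base directories (VAD_DIR, RIRS_DIR, MUSAN_DIR) and every /path/to/... location are hypothetical placeholders, not paths from the repository; only the variable names and the path suffixes come from the description above. The loop at the end simply reports which files are still missing.

```shell
# Hypothetical base directories: replace these with your own locations.
VAD_DIR=/path/to/vad/voxceleb2      # assumed: where your VAD run wrote its output
RIRS_DIR=/path/to/simulated_rirs    # assumed: where the simulated RIRs were unpacked
MUSAN_DIR=/path/to/musan            # assumed: where the MUSAN background noises were unpacked

# The variables discussed above, built from the placeholders.
SEG_FILE=$VAD_DIR/segments
RIRS_SCP=$RIRS_DIR/simulated_rirs_16k/data/wav.scp
NOISES_SCP=$MUSAN_DIR/data/musan_noise_bg/wav.scp
RTTMS_FILE=/path/to/rttms/all.rttm  # hypothetical single RTTM file

# If your reference RTTMs are spread over several files, one way to build
# the single file is to concatenate them (directory name is hypothetical):
#   cat /path/to/rttm_dir/*.rttm > "$RTTMS_FILE"

# Report every configured path that does not exist yet.
for f in "$SEG_FILE" "$RIRS_SCP" "$NOISES_SCP" "$RTTMS_FILE"; do
  [ -f "$f" ] || echo "missing: $f" >&2
done
```

With the placeholder paths left unchanged, the loop prints one "missing:" line per variable, which is a quick way to see what still has to be downloaded or generated.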
I hope this helps.
Closing due to inactivity. Feel free to reopen.
Hi! Thanks for your excellent work.
I have a similar question. I ran the prepareKaldidata_VoxCeleb2.sh script to generate the Kaldi-style data. How can we get the segments file?
As your paper mentioned: VoxCeleb2 consists of more than 2400 hours of recordings from more than 6000 speakers speaking mostly English. Originally prepared as a training set for training speaker recognition systems, the recordings are partially annotated. This means that for the speakers of interest, some of their segments are identified. Thus, it is possible to derive speech segments for a given speaker without the need for any VAD system.
Does this mean we don't need the segments file? If so, how can we run generate_data.sh successfully?
Hi @Jiang-Yidi. For the experiments, we still ran a public VAD on the files to obtain the VAD segments, for the sake of being sure about them. I cannot share the segments file here because it is (only a bit) larger than 25MB, but you can write me an email and I'll send it as an attachment.
Hi @fnlandini, thanks for your contribution. I have also sent you an email about the segments file. Thanks in advance!
The segments have been uploaded here: https://github.com/BUTSpeechFIT/EEND_dataprep/tree/main/v2/VoxCeleb2/VAD
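Before pointing SEG_FILE at a downloaded or self-generated segments file, it can help to sanity-check it. The sketch below assumes the standard Kaldi segments layout, one segment per line as "<segment-id> <recording-id> <start> <end>" with times in seconds; the function name check_segments and the demo IDs are made up for illustration and are not part of the repository.

```shell
# check_segments FILE: print malformed lines and return non-zero if any line
# does not have 4 fields or has an end time that is not after its start time.
check_segments() {
  awk 'NF != 4 || $4 + 0 <= $3 + 0 { printf "bad line %d: %s\n", NR, $0; bad = 1 }
       END { exit bad }' "$1"
}

# Demo on a tiny temporary file with hypothetical segment and recording IDs.
demo=$(mktemp)
printf '%s\n' 'rec1-0000 rec1 0.00 2.50' 'rec1-0001 rec1 3.10 4.75' > "$demo"
check_segments "$demo" && echo "segments look well-formed"
rm -f "$demo"
```

On a real file the call would be check_segments "$SEG_FILE"; any reported line is worth inspecting before running generate_data.sh.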
Hello! Thank you for contributing such excellent code. I want to generate a simulated dataset with VoxCeleb2, and I have some questions I hope you can answer. First, I ran prepareKaldidata_VoxCeleb2.sh to get Kaldidatadir. Then I ran generate_data.sh to generate the simulated data, but I got the following error. I noticed that I need to configure the environment variables in config_variables.sh, but I don't understand how to set the following fields or how to obtain these files.
Could you please explain the meaning and usage of the paths in <>? I look forward to hearing from you. Thank you.