Closed udaynag closed 7 years ago
The scp file contains speech/non-speech information. We assume that voice activity detection has already been applied before diarization. You could use a simple energy-based detector (such as one available in Kaldi) or Shout.
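To illustrate what a simple energy-based detector does, here is a minimal sketch in Python with NumPy. It is not the Kaldi or Shout implementation; the function names, the 25 ms / 10 ms framing, and the -35 dB threshold are all assumptions chosen for illustration. It marks a frame as speech when its log energy is within a fixed margin of the loudest frame, then merges consecutive speech frames into segments. Writing those segments out in the scp format the toolkit expects is not shown here.

```python
import numpy as np


def energy_vad(signal, sample_rate, frame_ms=25, hop_ms=10, threshold_db=-35.0):
    """Return one boolean per frame: True where the frame looks like speech.

    Hypothetical helper: a frame counts as speech when its log energy is
    above (loudest frame + threshold_db). All defaults are assumptions.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop_len)

    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = np.asarray(signal[i * hop_len:i * hop_len + frame_len], dtype=np.float64)
        energies[i] = np.mean(frame ** 2)

    # Small constant avoids log(0) on digitally silent frames.
    log_energy = 10.0 * np.log10(energies + 1e-12)
    return log_energy > (log_energy.max() + threshold_db)


def frames_to_segments(is_speech):
    """Merge runs of speech frames into (start_frame, end_frame) pairs, inclusive."""
    segments = []
    start = None
    for i, speech in enumerate(is_speech):
        if speech and start is None:
            start = i
        elif not speech and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(is_speech) - 1))
    return segments


# Synthetic example: 1 s of silence, 1 s of a 440 Hz tone, 1 s of silence.
sr = 16000
t = np.arange(sr) / sr
demo_signal = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 440 * t), np.zeros(sr)])
demo_segments = frames_to_segments(energy_vad(demo_signal, sr))
print(demo_segments)
```

A fixed threshold relative to the loudest frame is the simplest choice; on real meeting recordings such as AMI it would likely need tuning or smoothing, which is where a dedicated detector is the better option.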
Let me know if you need more information.
Thanks, Srikanth
Thanks for the feedback and the prompt response.
Regards, Uday
Hi Srikanth,
Using the toolkit, I am not able to get past the initial file read for any of the audio files I am using. My output file shows the following warning at the end. Any idea why this would happen? I am using beamformed output from the AMI corpus. Thanks in advance.
Reading and Processing the scp file
number of segments inside = 482
Number of vectors = 118855
Feat file data/mfcc/EN2001a_30m.fea of type 0
Reading feature file data/mfcc/EN2001a_30m.fea
frame dim: 6 num_vec = 118855
Attempting memory allocation.
Memory successully allocated.
Reading file ...
Warning frame_val is nan for idx = 1775 and d_index == 4 f_index = 1775
Hello,
It means the feature file itself contains a NaN value. The indices in the warning are zero-based, so check the 5th dimension of the 1776th feature vector in EN2001a_30m.fea.
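The check described above can be sketched in Python with NumPy. This is only an illustration: the thread does not describe the binary layout of the .fea file, so loading it is left out and a synthetic array stands in for the features. The function name `find_nan_entries` is hypothetical.

```python
import numpy as np


def find_nan_entries(features):
    """Return zero-based (frame_index, dim_index) pairs where a value is NaN."""
    features = np.asarray(features, dtype=np.float64)
    rows, cols = np.nonzero(np.isnan(features))
    return list(zip(rows.tolist(), cols.tolist()))


# Synthetic stand-in for a (num_frames x frame_dim) feature matrix.
# In practice this array would come from reading the feature file.
feats = np.random.default_rng(0).normal(size=(2000, 6))
feats[1775, 4] = np.nan  # mimic the entry reported in the warning

bad_entries = find_nan_entries(feats)
for frame_idx, dim_idx in bad_entries:
    # Zero-based index 1775 corresponds to the 1776th vector, index 4 to the 5th dimension.
    print(f"NaN at frame {frame_idx} (vector #{frame_idx + 1}), dim {dim_idx} (dimension #{dim_idx + 1})")
```

If NaNs do show up, the cause is usually upstream in feature extraction (for example, taking the log of a zero-energy frame), so it is worth inspecting the audio around that frame rather than only patching the value.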
I am under the impression that IB diarization includes segmentation as one of its initial steps, but it looks like this toolkit requires the segments to be defined in the scp file. Is this a limitation of the toolkit, or am I missing something? How do we get the initial segment boundaries for recorded data? Thanks!