Closed aishwaryjoshi31 closed 7 years ago
Hello,
The scp file is not generated by the toolkit. You can generate it with a Voice Activity Detector (VAD) that you might already have, or use an existing one from Kaldi (http://kaldi-asr.org/) or Shout (http://shout-toolkit.sourceforge.net/). Be sure to convert the VAD output into the format shown in the example (data/scp/AMI_20050204-1206.scp).
Srikanth
Hello @aishwaryjoshi31,
I am currently using Bob for VAD. Here are some instructions on how to run it:
from bob.bio.spear.preprocessor import Energy_2Gauss, Mod_4Hz, Energy_Thr, External
from bob.bio.spear.database import AudioBioFile
# Choose the VAD you want from the following options: Energy_2Gauss, Mod_4Hz,
# Energy_Thr, External. As the name suggests, External is a predefined class
# that can host your own VAD algorithm if the available options do not meet your needs.
# PAY ATTENTION to the default parameter values of the VAD you are using, and
# make sure they are compatible with the parameters of the feature extractor.
vad = Energy_2Gauss()
# Specify the directory of the audio file, the extension type and the file name
directory = 'AUDIO_FILE_DIRECTORY'  # Example: '/home/resource/database/audio'
extension = '.wav'  # or whatever extension you use
file_name = 'NAME_OF_YOUR_AUDIO_FILE'  # without the extension
# Put your file in a format compatible with Bob
myfile = AudioBioFile('', file_name, file_name)
# Read the data from your audio file: fs is the sampling frequency, audio_signal holds the samples
fs, audio_signal = vad.read_original_data(myfile, directory, extension)
# Apply the VAD algorithm to the audio signal; the result is one label per frame
rate, _, labels = vad((fs, audio_signal))
The advantage of using Bob is that it has a package named bob.bio.spear, which provides many already implemented speaker recognition algorithms, including state-of-the-art ones like JFA and i-vectors.
Please let me know if you decide to use Bob and run into problems. You can post your comments, questions and any errors you encounter on the Google group discussion: https://www.idiap.ch/software/bob/discuss
Please refer to our webpage for the relevant Bob links: https://www.idiap.ch/software/bob/
Thank you everyone. @akomaty How does the script generate the scp file? Also, when I run the script on AMI data, it shows the following warning: "Chunk (non-data) not understood, skipping it. WavFileWarning"
An .scp file is a segmentation file: it lists all the speech segments with their corresponding start and end frames.
Here is an example. Say you have an audio file named my_audio.wav. Applying VAD to my_audio.wav yields a vector of ones and zeros, where one corresponds to a speech frame and zero to a silent frame.
Suppose that my_audio.wav has only 10 frames {f1, f2, ..., f10}, and that the VAD output is {1, 1, 1, 1, 0, 0, 1, 1, 0, 0}. Then your .scp file has to follow this format:
segment_name=filename[start_frame,end_frame]
And it will look like:
my_audio_1_4=my_audio.fea[1,4]
my_audio_7_8=my_audio.fea[7,8]
So, depending on the database you're using and the names of your audio files, you can write a script that takes the VAD output as input and returns a file list like the one in the example above. You can also check the file data/scp/AMI_20050204-1206.scp.
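As a rough sketch of such a script, the snippet below groups consecutive speech frames into segments and formats one line per segment. The function name and its parameters are hypothetical, not part of Bob or of the toolkit. It assumes frames are numbered from 1 with an inclusive end frame, as in the 10-frame example, and that the name after `=` is the feature file the frame indices refer to; compare the result against data/scp/AMI_20050204-1206.scp before relying on it.

```python
def labels_to_scp_lines(labels, base_name, feature_file, first_index=1):
    """Convert per-frame VAD labels (1 = speech, 0 = silence) into scp lines.

    Hypothetical helper: each run of consecutive speech frames becomes one
    line of the form segment_name=feature_file[start_frame,end_frame].
    first_index=1 matches the 1-based, inclusive numbering of the example.
    """
    lines = []
    start = None  # position where the current speech run began, if any
    for i, label in enumerate(labels):
        is_speech = int(label) == 1
        if is_speech and start is None:
            start = i  # a new speech run begins here
        elif not is_speech and start is not None:
            # the run ended on the previous frame
            first, last = start + first_index, i - 1 + first_index
            lines.append(f"{base_name}_{first}_{last}={feature_file}[{first},{last}]")
            start = None
    if start is not None:
        # the recording ends while still inside a speech run
        first, last = start + first_index, len(labels) - 1 + first_index
        lines.append(f"{base_name}_{first}_{last}={feature_file}[{first},{last}]")
    return lines


# The 10-frame example from this thread
vad_output = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0]
scp_lines = labels_to_scp_lines(vad_output, "my_audio", "my_audio.fea")
print("\n".join(scp_lines))
```

The `labels` array returned by the Bob VAD call can be passed in directly, and the returned lines written to a file ending in .scp. If the toolkit expects frames counted from 0, pass `first_index=0`.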
About the warning you're getting, please provide us with the exact command you're trying to run and the files you are using.
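For what it's worth, the quoted text matches the warning that scipy.io.wavfile emits when it skips a WAV chunk that does not contain audio samples (extra metadata, for instance). Assuming that is where it comes from in your run, the audio data is still read and the message can be filtered out. Below is a minimal sketch using only the standard `warnings` module; the helper name is made up for illustration.

```python
import warnings


def silence_wav_chunk_warning():
    """Hypothetical helper: ignore only the 'non-data chunk' WAV warning.

    The filter matches on the start of the message text, so all other
    warnings are still reported as usual.
    """
    warnings.filterwarnings(
        "ignore",
        message=r"Chunk \(non-data\) not understood",
    )


# Call this once, before reading the audio files
silence_wav_chunk_warning()
```

Filtering by message text keeps the snippet free of a scipy import. If the warning persists, it is probably raised with a different message and the pattern needs adjusting.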
@akomaty Please help me out on this. The scp file is generated by Bob and the feature file is obtained using HTK. Now that I have both the scp and fea files, I ran the diarization command (bash run.diarizeme.sh..... command), but I am getting a segmentation fault and I am unable to determine its cause. Can you or anyone else tell me what could have caused the segmentation fault?
The README says that an scp file is required to diarize a meeting, but how is it generated by the toolkit?