Hi everyone,
I am not reporting a bug here but I have a question.
I am working on an app that records the user and then displays the spectrogram with WaveSurfer, which works without any problem. What I need to add now is voice activity detection (VAD) to get the time at which the user starts speaking.
I don't think I need anything too powerful, as the environment is quiet and the speech is short and not natural. In practice, the user will have a specific word to say, and I just need to know when they say it to get their reaction time.
Why am I asking this here? I noticed that I can very easily detect the voice activation time visually with the spectrogram that WaveSurfer draws. So I was thinking that it would be possible to make a simple VAD engine that would simply detect a dramatic change in the spectrogram. The thing is that I don't know exactly what WaveSurfer shows, that is, what the height of each bar represents.
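To make the idea concrete, this is roughly what I have in mind. It is only a sketch under my own assumptions: it works on raw samples (a `Float32Array`, like the one `AudioBuffer.getChannelData()` returns) rather than on whatever WaveSurfer draws, and the function name `detectSpeechOnset`, the 10 ms frame length and the 0.05 threshold are placeholders I made up, not anything from WaveSurfer.

```typescript
// Hypothetical helper (not part of WaveSurfer): returns the time in seconds
// of the first frame whose RMS energy exceeds `threshold`, or null if no
// frame does. Frame length and threshold are guesses that would need tuning.
function detectSpeechOnset(
  samples: Float32Array,
  sampleRate: number,
  frameMs = 10,
  threshold = 0.05,
): number | null {
  // Number of samples per analysis frame (at least one).
  const frameSize = Math.max(1, Math.floor((sampleRate * frameMs) / 1000));

  for (let start = 0; start + frameSize <= samples.length; start += frameSize) {
    // Root-mean-square amplitude of this frame.
    let sumSquares = 0;
    for (let i = start; i < start + frameSize; i++) {
      sumSquares += samples[i] * samples[i];
    }
    const rms = Math.sqrt(sumSquares / frameSize);

    // First frame louder than the threshold is treated as the onset.
    if (rms > threshold) {
      return start / sampleRate;
    }
  }
  return null;
}

// Synthetic check: 0.5 s of silence followed by 0.5 s of a 200 Hz tone.
const demoRate = 8000;
const demoSamples = new Float32Array(demoRate);
for (let i = demoRate / 2; i < demoRate; i++) {
  demoSamples[i] = 0.5 * Math.sin((2 * Math.PI * 200 * i) / demoRate);
}
const demoOnset = detectSpeechOnset(demoSamples, demoRate);
console.log(demoOnset);
```

With a real recording I would pass in the decoded samples and sample rate instead of the synthetic signal. The result is only accurate to one frame length, and a single loud click would trigger it, which is why I am unsure whether this is good enough.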
So what exactly does WaveSurfer represent? And do you think this kind of VAD is doable? It would be really basic, but would it work in ideal conditions?
Thanks :)