katspaugh / wavesurfer.js

Audio waveform player
https://wavesurfer.xyz
BSD 3-Clause "New" or "Revised" License

detectRegions #307

Closed shsavage closed 9 years ago

shsavage commented 9 years ago

I'm working on implementing the detectRegions function from the regions plugin. I've pulled the loadRegions and detectRegions functions out of the regions example, dropped them into my JavaScript file, and replaced "var peaks = wavesurfer.backend.peaks;" with "var peaks = wavesurfer.backend.getPeaks();" in detectRegions. When it runs I get an "invalid arguments" error on line 325 in wavesurfer.min.js:

for (var e = this.buffer, i = e.length / t, r = ~~(i / 10) || 1, s = e.numberOfChannels, a = new Float32Array(t), n = 0; s > n; n++)

Any ideas?

katspaugh commented 9 years ago

Hi Stephen,

Thanks for getting back with this issue.

The solution is to pass a length parameter to getPeaks. Pick an arbitrary value such as 2048. The larger the length, the finer the silence detection, but also the slower it runs.

tl;dr:

var peaks = wavesurfer.backend.getPeaks(2048);

shsavage commented 9 years ago

Great! I'll experiment with that a bit tomorrow. It's going to be interesting to see how this works in practice, because different speakers have natural pauses of different lengths when they talk. So it will be a balancing act.

Thanks,

-Steve

Stephen H. Savage, PhD. Scientific Software Engineer OKED/IHR Arizona State University Box 876505 Tempe, AZ 85287-6505

Affiliated Investigator - CISA3 Archaeology, Center of Interdisciplinary Science for Art, Architecture and Archaeology, Qualcomm Institute, University of California, San Diego

Senior Fellow, George Washington University, Capitol Archaeological Institute http://research.columbian.gwu.edu/archaeology/people/150

shsavage@asu.edu http://gaialab.asu.edu/home

katspaugh commented 9 years ago

It's not the length of silence, it's the size of the peaks array. :)
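
To put numbers on that distinction: the argument to getPeaks sets how many values the waveform is summarised into, so each entry stands for roughly duration / length seconds of audio. A minimal sketch of that arithmetic, assuming one peak value per slice; the helper name secondsPerPeak and the 120-second duration are made up for illustration:

```javascript
// Hypothetical helper, not part of wavesurfer.js: how much audio one
// entry of the peaks array stands for.
function secondsPerPeak(durationSeconds, peaksLength) {
    return durationSeconds / peaksLength;
}

var duration = 120; // assumed two-minute recording
[512, 2048].forEach(function (peaksLength) {
    var step = secondsPerPeak(duration, peaksLength);
    console.log(peaksLength + ' peaks: ' + step.toFixed(3) + ' s per peak');
});
// 512 peaks: 0.234 s per peak
// 2048 peaks: 0.059 s per peak
```

A pause shorter than one step can be missed entirely, which is why a larger array gives finer detection at the cost of more work.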

shsavage commented 9 years ago

Got it. I've been playing around with this to see what I can do with it. I've cut down the peaks array to 512, because of the length of time required to do 2048. I've also increased the minValue to 0.015 and the minSeconds to 0.5, as I don't want to create a whole bunch of really short segments. I've also added in some lines in the detectRegions function to draw the region on the waveform, because the function still hangs, and I wanted to see how far it was getting before it stops. You can see the modifications I made in the function, and the regions detected in a sample tape on the attached screenshot.

As you can see, the function seems to do fine until it gets to the last region, and then it seems to hang up without detecting the last region. So something needs to be added to force the final region to be done. Once I can get the last region into the regions array I can push the array and the track id to a PHP script that will create empty transcript records based on the regions detected. So that's going to be pretty cool. For the time being the regions don't have to be clickable or draggable, as the PHP script will build links into the records with the start and stop points, just like the older one did with fixed five-second blocks.

I suspect that extend keeps getting set to true, so that the do while (extend) loop never ends. What do I need to do to pick up the last region?

Thanks for all your help!

-Steve

shsavage commented 9 years ago

I just noticed another small glitch. When hovering over the regions on the waveform I see the time from/to indicator. When the timespan for a region crosses the two minute mark it gets reported as 1 minute and something (e.g. 1:55 - 1:04). The next region reports properly with 2 minutes and something. If I had tapes longer than three minutes, I bet it would do the same thing across the three minute boundary.
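
For reference, a region running from 115 s to 124 s would be expected to read 1:55 - 2:04. The label code itself isn't shown in this thread, so the following is only a sketch of one way to format such a range so that minutes and seconds are derived from the same floored value; formatTime and formatRange are invented names, not wavesurfer.js functions:

```javascript
// Illustrative only; not the code wavesurfer.js uses for region labels.
function formatTime(totalSeconds) {
    var whole = Math.floor(totalSeconds);
    var minutes = Math.floor(whole / 60);
    var seconds = whole % 60;
    // Pad seconds to two digits so 124 s reads 2:04, not 2:4.
    return minutes + ':' + (seconds < 10 ? '0' : '') + seconds;
}

function formatRange(start, end) {
    return formatTime(start) + ' - ' + formatTime(end);
}

console.log(formatRange(115, 124)); // 1:55 - 2:04
```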

-s

katspaugh commented 9 years ago

Stephen,

I've refactored the extractRegions function (https://github.com/katspaugh/wavesurfer.js/blob/master/example/annotation/app.js#L136); it won't get stuck in an infinite loop now.

I've also made peaks and duration parameters of this function, so you should call it like this:

        loadRegions(
             extractRegions(
                 wavesurfer.backend.getPeaks(512),
                 wavesurfer.getDuration()
             )
        );

Please see if it works for your audio files.

As for the time label bug, I'll look into it tomorrow. Thanks for finding it!

shsavage commented 9 years ago

I'll give it a try tomorrow as well. In the meantime I was messing around with the detectRegions function, and found that if I changed "while (extend)" to "while (extend && i < length)" it wouldn't get stuck in the loop either. And I also hacked a solution to the missing region at the end of the audio file:

// NOTE: The "+ 4" was added to the next line to force the function to find the last region.
var duration = wavesurfer.getDuration() + 4;

I know. It's a kludge. So I'll take a look at your solution tomorrow.
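
The shape of that loop guard can be sketched as follows. This is not the example's detectRegions, whose body isn't quoted in this thread; extendRegion and its parameters are invented to show a do/while that checks the array bound explicitly, so a region touching the end of the file is still closed without padding the duration:

```javascript
// Hypothetical sketch: walks forward from `start` while peaks stay at or
// above the threshold, and stops at the end of the array.
function extendRegion(peaks, start, minValue) {
    var length = peaks.length;
    var i = start;
    var extend;
    do {
        i++;
        // Test the bound first so peaks[i] is never read past the end.
        extend = i < length && peaks[i] >= minValue;
    } while (extend);
    return i - 1; // index of the last peak still inside the region
}

console.log(extendRegion([0.5, 0.6, 0.7], 0, 0.015)); // 2 (runs to the end)
console.log(extendRegion([0.5, 0.0, 0.7], 0, 0.015)); // 0 (stops at silence)
```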

-s

shsavage commented 9 years ago

This is working quite nicely, save for one small glitch. If you take a look at the attached screenshot, you'll see that there's a small part of the waveform at the end of the file that's not included in the region (I've highlighted it with the red box).

I've implemented a couple of slider controls to allow fine tuning of the silence parameters, and I've made the regions resizable (but not draggable), so I can mess with the cutoff points before actually using the regions to create transcription blocks. So now the "Segment" button first clears any existing regions, then draws regions based on the slider values. After tweaking this as well as can be achieved by the segmenter, the user can further adjust with the regions resizing tool, and then click the "Transcribe" button, which will send the regions array on to my PHP script.

So I am assuming that when I resize a region, its start or stop values will be automatically updated in your regions array, which would mean that when I hit the "Transcribe" button I would need to do a final read of your regions array into myRegions, the one I'll be sending to the PHP script. Otherwise I'd have to capture the resize event each time it fires, figure out which region I am on, fetch the new start and end values and then update the myRegions array as the user works on it.
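
The "final read" approach could look roughly like this. It is a sketch under an assumption: that the regions plugin keeps its live regions in an object keyed by region id (wavesurfer.regions.list at the time of this thread) and that each region's start and end reflect any resizing. snapshotRegions is an invented name, and the data below is a stand-in so the snippet runs without wavesurfer loaded:

```javascript
// Assumption: `list` maps region ids to region objects exposing current
// `start` and `end` values in seconds.
function snapshotRegions(list) {
    return Object.keys(list).map(function (id) {
        var region = list[id];
        return { id: id, start: region.start, end: region.end };
    }).sort(function (a, b) {
        return a.start - b.start; // order by position in the audio
    });
}

// Stand-in for the plugin's region list.
var fakeList = {
    b: { start: 4.2, end: 7.9 },
    a: { start: 0, end: 3.8 }
};
console.log(JSON.stringify(snapshotRegions(fakeList)));
```

In the page itself the "Transcribe" handler would then do something like var myRegions = snapshotRegions(wavesurfer.regions.list); right before posting to the PHP script, if that assumption holds for the version in use.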

I'll await your fix of the time range display. Again, thanks for working with me on this!

-Steve

amundo commented 9 years ago

Hi katspaugh and shsavage,

I'm following this thread with interest. I haven't been able to get the .getPeaks(512) function to do anything yet — I call it after loading a new audio file, but it (immediately) returns an empty array.

I see that extractRegions is commented out in app.js and the wavesurfer instance seems to be getting its data on silences from rashomon.json in the example. Is there an example somewhere which demonstrates how to get the peaks client side through using extractRegions?

shsavage commented 9 years ago

Hi Patrick,

I'm doing it like this, with a lot of help from katspaugh:

//-------------------------------------------------------------------------------------------------
// findRegions() -- Controls the detection and drawing of audio regions.
// Invoked through button press on segmentation controls.
//-------------------------------------------------------------------------------------------------
function findRegions() {
    wavesurfer.clearRegions();
    var peaks = wavesurfer.backend.getPeaks(512);
    var duration = wavesurfer.getDuration();
    var wsRegions = extractRegions(peaks, duration);
    drawRegions(wsRegions);
}

//-------------------------------------------------------------------------------------------------
// Draw regions from extractRegions function.
// Note: randomColor is not defined in this snippet.
//-------------------------------------------------------------------------------------------------
function drawRegions(regions) {
    regions.forEach(function (region) {
        region.drag = false;
        region.resize = true;
        region.color = randomColor(0.2);
        wavesurfer.addRegion(region);
    });
}

//-------------------------------------------------------------------------------------------------
// Extract regions separated by silence.
// Note: minValue and minSeconds are not defined in this snippet; they must
// exist in the enclosing scope.
//-------------------------------------------------------------------------------------------------
function extractRegions(peaks, duration) {
    var length = peaks.length;
    var coef = duration / length;     // seconds per peak
    var minLen = minSeconds / coef;   // minimum length, in peaks

    // Gather silence indices
    var silences = [];
    Array.prototype.forEach.call(peaks, function (val, index) {
        if (val < minValue) {
            silences.push(index);
        }
    });

    // Cluster silence values
    var clusters = [];
    silences.forEach(function (val, index) {
        if (clusters.length && val == silences[index - 1] + 1) {
            clusters[clusters.length - 1].push(val);
        } else {
            clusters.push([ val ]);
        }
    });

    // Filter silence clusters by minimum length
    var fClusters = clusters.filter(function (cluster) {
        return cluster.length >= minLen;
    });

    // Create regions on the edges of silences
    var regions = fClusters.map(function (cluster, index) {
        var next = fClusters[index + 1];
        return {
            start: cluster[cluster.length - 1],
            end: (next ? next[0] : length - 1)
        };
    });

    // Add an initial region if the audio doesn't start with silence
    var firstCluster = fClusters[0];
    if (firstCluster && firstCluster[0] != 0) {
        regions.unshift({
            start: 0,
            end: firstCluster[firstCluster.length - 1]
        });
    }

    // Filter regions by minimum length
    var fRegions = regions.filter(function (reg) {
        return reg.end - reg.start >= minLen;
    });

    // Return time-based regions
    return fRegions.map(function (reg) {
        return {
            start: Math.round(reg.start * coef * 10) / 10,
            end: Math.round(reg.end * coef * 10) / 10
        };
    });
}

Hope this helps,

-Steve

katspaugh commented 9 years ago

Hey Patrick,

You should call getPeaks on the ready event, since the audio loads asynchronously. Are you doing that?

amundo commented 9 years ago

Thanks, both of you. I made some progress running Stephen’s example, and this seems to be working:

https://gist.github.com/amundo/99365f38fefbd6cb8b43

Here’s how I interpreted your comment about running getPeaks on the ready event:

wavesurfer.on('ready', function () {
  var peaks = wavesurfer.backend.getPeaks(512);
  var regions = extractRegions(peaks, wavesurfer.backend.getDuration());
  drawRegions(regions);
  wavesurfer.play();
});

The silence detection works great!

Now to figure out how to hook up the regions and start building out an annotation interface to support the languages I work with. Thanks so much for sharing this project!

maxehnert commented 9 years ago

@katspaugh hey I was just looking over this issue and I'm trying to get a grasp on what is going on.

Is extractRegions() supposed to automagically find and highlight all the peaks in the audio file within set params? I'm trying to call extractRegions() with some params, but all I get back is an empty array. I have uncommented this section:

    loadRegions(
        extractRegions(
            wavesurfer.backend.getPeaks(512),
            wavesurfer.getDuration()
        )
    );

from the regions example. Is there something else I am missing or am I interpreting this function incorrectly?

katspaugh commented 9 years ago

Hi Max!

The function finds regions separated by silence. Like separate utterances in speech.

I'm not sure what's missing in your code. Are you trying on the Akutagawa story audio file or your own?

maxehnert commented 9 years ago

I have my own file. I'm working with heart sounds. The peaks in my files are only about 0.1 seconds long and often shorter.

katspaugh commented 9 years ago

Then you may need to set the minSeconds variable to 0.
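
The reason is the minimum-length filter. A rough worked example, mirroring the arithmetic at the top of the extractRegions version posted earlier in this thread; the function name minLenInPeaks and the 10-second duration are assumptions chosen for illustration:

```javascript
// minSeconds is converted into a minimum run length measured in peaks.
function minLenInPeaks(minSeconds, duration, peakCount) {
    var coef = duration / peakCount; // seconds covered by one peak
    return minSeconds / coef;        // minimum run length, in peaks
}

var duration = 10;   // assumed 10-second recording
var peakCount = 512;
console.log(minLenInPeaks(0.5, duration, peakCount)); // about 25.6 peaks
console.log(minLenInPeaks(0, duration, peakCount));   // 0 peaks
// A 0.1-second burst spans about 0.1 / (10 / 512) = 5.12 peaks, so with
// minSeconds = 0.5 it is shorter than the minimum and gets filtered out.
```

With minSeconds at 0 the filter lets every cluster and region through, so very short events are no longer discarded for being too brief.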

shsavage commented 9 years ago

The only time that variable is used that I can see is in the segmentation of the audio track into regions. I'm talking about just changing the cursor when it's hovering over the annotation marker, which is a region of no length. Currently, I'm using this line to add a region as an annotation marker:

    wavesurfer.addRegion({id: '1', start: ohp.myStart, loop: false, drag: false, resize: false, color: 'rgba(0,128,0,1)'});

What would happen if I gave it an end option with the same location (end: ohp.myStart)?

-s

maxehnert commented 9 years ago

@shsavage I'm not sure if you were responding to me but I'm doing this with a button to create markers

wavesurfer.addRegion({start: wavesurfer.getCurrentTime(), end: wavesurfer.getCurrentTime() + 0.004, color: 'RGBA(91, 199, 0, 1)', height:'50px'}, saveRegions);

If I don't add the 0.004 to the end then nothing shows up since it has no width.

@katspaugh minSeconds gets built into minLen which is only used for filtering silence and region segments

Just to clarify my original question. Am I correct to assume that calling extractRegions() is supposed to locate all the peaks in the file that meet the correct params and add regions to them?

katspaugh commented 9 years ago

That's another topic, but to create a marker, just don't specify any end and it should work.

Back to extractRegions, it is supposed to locate continuous segments of audio signal of high intensity, separated by segments of continuous signal of lower intensity ("silence"). The threshold of the intensity is set in the variable minValue.
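
As a stripped-down illustration of that idea, the sketch below marks runs of peaks at or above a threshold and treats everything below it as silence. It is deliberately simpler than the example's extractRegions: no minimum-length filtering, and it returns peak indices rather than seconds. findLoudSegments is an invented name:

```javascript
// Illustrative thresholding only, not the library's implementation.
function findLoudSegments(peaks, minValue) {
    var segments = [];
    var start = -1; // -1 means "currently in silence"
    for (var i = 0; i < peaks.length; i++) {
        var loud = peaks[i] >= minValue;
        if (loud && start === -1) {
            start = i; // a loud run begins
        } else if (!loud && start !== -1) {
            segments.push({ start: start, end: i - 1 });
            start = -1; // back to silence
        }
    }
    if (start !== -1) {
        // Close a run that reaches the end of the array.
        segments.push({ start: start, end: peaks.length - 1 });
    }
    return segments;
}

var peaks = [0.0, 0.2, 0.3, 0.01, 0.0, 0.4, 0.5];
console.log(JSON.stringify(findLoudSegments(peaks, 0.015)));
// [{"start":1,"end":2},{"start":5,"end":6}]
```

Raising minValue splits the audio into more, shorter segments; lowering it merges them.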

maxehnert commented 9 years ago

Thanks @katspaugh I will try some more this week with the extractRegions and see if I can get something working.