stuffmatic / microdsp

DSP algorithms and utilities written in Rust. Performant, embedded friendly and no_std compatible.
MIT License

SpectralFluxNoveltyDetector: best way to determine onset timings for offline buffer? #1

Open httnn opened 1 year ago

httnn commented 1 year ago

first, thanks a lot for making this, i've already gotten very useful results!

however, one thing that's slightly confusing with the current API is how to determine the timing of the different novelty values when processing an offline audio buffer.

when the window_size argument (passed to SpectralFluxNoveltyDetector::new) is the same as the size of the chunks passed to SpectralFluxNoveltyDetector::process, it seems like the handler is called each time the process function is called. but it's not obvious to me how the handler is called when the window size differs from the chunk size.

this is the code that i've used now:

  pub fn get_onsets(&self) -> Vec<f32> {
    let mut detector = SpectralFluxNoveltyDetector::new(1024);
    let mut novelties = vec![];
    let audio_data: Vec<f32> = self.get_audio_data();
    for chunk in audio_data.chunks_exact(1024) {
      let mut n = 0.0;
      detector.process(chunk, |novelty| {
        n = novelty.novelty();
      });
      novelties.push(n);
    }
    novelties
  }

perhaps there's a better way to do this that i'm missing?

stuffmatic commented 1 year ago

Internally, the novelty detector uses a WindowProcessor to collect input samples into windows for processing. The distance between the starts of consecutive windows is called the hop size, which may be smaller than the window size. SpectralFluxNoveltyDetector::new sets the hop size to half the window size. You can keep track of the absolute time in samples by incrementing a counter by the hop size in each process callback. detector.process accepts chunks of any size, and the chunks don't have to contain a whole number of windows. For a given chunk, the callback may be invoked many times or not at all, depending on the size of the chunk, so you could just pass the entire audio_data to detector.process in one go.

pub fn get_onsets(&self) -> Vec<f32> {
    let window_size = 1024;

    // SpectralFluxNoveltyDetector::new sets the hop size to half the window size
    let hop_size = window_size / 2;

    let mut detector = SpectralFluxNoveltyDetector::new(window_size);
    let mut novelties = vec![];
    let audio_data: Vec<f32> = self.get_audio_data();

    // Running position in samples, advanced by hop_size in each callback.
    let mut num_processed_samples = 0;

    // Pass all samples to the detector at once. The callback
    // will be invoked every hop_size samples.
    detector.process(&audio_data, |novelty| {
        novelties.push(novelty.novelty());
        num_processed_samples += hop_size;
    });

    novelties
}
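Since the question was about timing, here is a minimal sketch of how the collected novelty values could be mapped to times in seconds. It does not use microdsp at all. The function name novelty_times and the sample_rate parameter are hypothetical, and the sketch assumes that the window behind the k-th callback starts at sample k * hop_size, which follows from the hop-size counting described above but is not something the library exposes directly.

```rust
/// Pairs each novelty value with the start time, in seconds, of the window
/// that produced it. Assumes window `k` (0-based) starts at sample
/// `k * hop_size`. `sample_rate` is in Hz and must be supplied by the caller.
fn novelty_times(novelties: &[f32], hop_size: usize, sample_rate: f32) -> Vec<(f32, f32)> {
    novelties
        .iter()
        .enumerate()
        .map(|(k, &n)| ((k * hop_size) as f32 / sample_rate, n))
        .collect()
}
```

With the values from get_onsets, a call such as novelty_times(&novelties, 512, 44100.0) would yield (time, novelty) pairs. If you would rather timestamp the end of each window, add window_size to the sample position before dividing.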

It would probably make sense to add methods for getting the window size and hop size from the detector.
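To illustrate why the chunk size does not change how often the callback fires, here is a small standalone sketch of hop-based windowing. HopWindow and count_windows are hypothetical names made up for this example. This is not microdsp's actual WindowProcessor, only a simplified stand-in that emits a window once window_size samples are buffered and then drops the oldest hop_size samples.

```rust
/// Minimal hop-based window collector (illustrative stand-in only,
/// not microdsp's WindowProcessor).
struct HopWindow {
    buffer: Vec<f32>,
    window_size: usize,
    hop_size: usize,
}

impl HopWindow {
    fn new(window_size: usize, hop_size: usize) -> Self {
        assert!(hop_size > 0 && hop_size <= window_size);
        Self {
            buffer: Vec::with_capacity(window_size),
            window_size,
            hop_size,
        }
    }

    /// Feeds a chunk of any length and calls `handler` once per full window.
    fn process<F: FnMut(&[f32])>(&mut self, chunk: &[f32], mut handler: F) {
        for &sample in chunk {
            self.buffer.push(sample);
            if self.buffer.len() == self.window_size {
                handler(self.buffer.as_slice());
                // Drop the oldest hop_size samples and keep the overlap.
                self.buffer.drain(..self.hop_size);
            }
        }
    }
}

/// Counts handler invocations when `input` is fed in chunks of `chunk_size`.
fn count_windows(input: &[f32], chunk_size: usize, window_size: usize, hop_size: usize) -> usize {
    let mut windows = HopWindow::new(window_size, hop_size);
    let mut count = 0;
    for chunk in input.chunks(chunk_size) {
        windows.process(chunk, |_window| count += 1);
    }
    count
}
```

Under this model, 4096 samples with a window of 1024 and a hop of 512 produce 7 windows whether the input is fed as one chunk, as 1024-sample chunks, or as 100-sample chunks. It also suggests why the original snippet can lose values: if the hop size is half the window size, a 1024-sample chunk triggers the handler more than once for most chunks, and a variable that is overwritten in the handler keeps only the last value per chunk.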