irllabs / ml-lib

A machine learning library for Max and Pure Data

ml.peak #41

Closed batchku closed 10 years ago

batchku commented 10 years ago

this is a useful class. however, what i really need is a peak detector that works on vectors of data as opposed to streams. here's an example: you do an fft and have the bin magnitudes for an audio signal; now you want to find the peaks in this series so that you can feed them to an SVM for training and classification. we spoke about this before; our research showed that the best algorithms were actually from chemistry (lots of noisy data there).

this was among the best we found: https://github.com/xuphys/peakdetect?source=c

what do you think?

jamiebullock commented 10 years ago

That peak detection algorithm looks good, nice and simple. However, it would certainly also be possible to use the GRT peak detector in this way: it would just mean internally iterating over each element in the input vector, rather than using a continuously updated value, which is what peakdetect does. If you wanted we could introduce a "mode" attribute, which switches between these two types of operation. Perhaps we could try using GRT first, and if you're not happy with the results, look at contributing a new algorithm to GRT based on the peak detect code?

One question is, in "vector mode", how do you want the output data to be formatted? Some options are:

(for input .1 .7 .2 .5 .1 .1 .1 .8 0)

For spectral data (as in your example), I normally use interpolated peak position. This is what is done in the LibXtract xtract_peak_spectrum() function. That is, we don't just get the peak bin magnitudes, but the interpolated frequency and magnitude pair. This is nice if you're building a visualiser. I'm not sure how useful it is for characterising the audio though. Most timbre classification tasks just take something like the magnitudes of the first 10 bins, or something like the MFCC.
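To make "interpolated frequency and magnitude pair" concrete, here is a minimal sketch using quadratic (parabolic) interpolation over the three bins around a peak. This is a common technique and only an assumption about the general idea; it is not taken from the LibXtract source, and the function name `interpolate_peak` and the parameters `sample_rate` and `fft_size` are hypothetical.

```python
def interpolate_peak(mags, k, sample_rate, fft_size):
    """Refine the peak at bin k by fitting a parabola through bins k-1, k, k+1.

    Returns (frequency_hz, magnitude). Assumes 1 <= k <= len(mags) - 2
    and that bin k is a local maximum.
    """
    a, b, c = mags[k - 1], mags[k], mags[k + 1]
    denom = a - 2.0 * b + c
    # Fractional bin offset of the parabola's vertex, in the range (-0.5, 0.5).
    # A zero denominator means the three bins are collinear, so no refinement.
    p = 0.0 if denom == 0 else 0.5 * (a - c) / denom
    freq = (k + p) * sample_rate / fft_size
    mag = b - 0.25 * (a - c) * p
    return freq, mag


# Hypothetical magnitudes with a peak at bin 2 that leans towards bin 3
mags = [0.0, 2.4375, 3.9375, 3.4375, 0.0]
freq, mag = interpolate_peak(mags, 2, sample_rate=44100, fft_size=1024)
print(freq, mag)
```

The interpolated magnitude is never lower than the raw bin magnitude, and the frequency lands between bin centres, which is why this tends to matter more for visualisation than for coarse classification features.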

batchku commented 10 years ago

two modes sounds good; it would mean less max/pd programming and more done inside the extern.

for output format: "zero-indexed location value pairs"
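For reference, a rough sketch of what "zero-indexed location value pairs" could look like in vector mode, using the example input from earlier in the thread. This is a plain local-maximum test written for illustration only (the name `find_peaks` is hypothetical); it is not the GRT detector or the peakdetect algorithm, and it ignores endpoints and flat plateaus.

```python
def find_peaks(values):
    """Return (index, value) pairs for strict local maxima in a list of floats."""
    peaks = []
    for i in range(1, len(values) - 1):
        # A sample is a peak if it is strictly greater than both neighbours
        if values[i - 1] < values[i] > values[i + 1]:
            peaks.append((i, values[i]))
    return peaks


data = [0.1, 0.7, 0.2, 0.5, 0.1, 0.1, 0.1, 0.8, 0.0]
pairs = find_peaks(data)

# Flatten to one list, as an external might emit it to Max or Pd
flat = [x for pair in pairs for x in pair]
print(flat)  # [1, 0.7, 3, 0.5, 7, 0.8]
```

A flat list like this can be unpacked in a patch two elements at a time, which keeps the index and value together without extra outlets.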

jamiebullock commented 10 years ago

For info, I am currently waiting for a response on this, see: http://www.nickgillian.com/forum/index.php?topic=69.0

batchku commented 10 years ago

i'm looking back into ml.peak; was the peak detection problem you observed resolved? do we have a good way to find peaks in a list of floats yet?

the final missing ingredient!

jamiebullock commented 10 years ago

This functionality is now provided by ml.minmax, a custom object not based on GRT

batchku commented 10 years ago

can we get rid of ml.peak?

On Tue, May 20, 2014 at 12:15 PM, Jamie Bullock notifications@github.com wrote:

Closed #41 https://github.com/cmuartfab/ml-lib/issues/41.


jamiebullock commented 10 years ago

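It's up to you, but personally I think it's worth keeping ml.peak (which is now working). Two reasons:

  • It offers an alternative peak-detection method, which it may be interesting to compare to minmax
  • It offers a different approach to peak detection by detecting peaks in a continuous input stream of floats

The latter could be very useful for dealing with time series data on the fly.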
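As an illustration of handling time series data on the fly, here is a minimal sketch of a detector that receives one float at a time and reports a peak as soon as the next sample confirms it. The class `StreamPeakDetector` is hypothetical and uses a simple three-sample comparison; it does not reproduce the GRT algorithm that ml.peak wraps.

```python
class StreamPeakDetector:
    """Detect strict local maxima in a stream, one sample at a time."""

    def __init__(self):
        self.prev2 = None  # sample before the previous one
        self.prev1 = None  # previous sample
        self.count = 0     # number of samples seen so far

    def update(self, x):
        """Feed one sample; return (index, value) of a peak confirmed by it, else None."""
        peak = None
        # The previous sample is a peak if it exceeds both of its neighbours
        if self.prev2 is not None and self.prev2 < self.prev1 > x:
            peak = (self.count - 1, self.prev1)
        self.prev2, self.prev1 = self.prev1, x
        self.count += 1
        return peak


detector = StreamPeakDetector()
for sample in [0.1, 0.7, 0.2, 0.5, 0.1, 0.1, 0.1, 0.8, 0.0]:
    result = detector.update(sample)
    if result is not None:
        print(result)
```

Note the one-sample latency: a peak can only be reported once the following value has arrived, which is the main practical difference from processing a whole list at once.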

batchku commented 10 years ago

ok, cool. let's keep both.

On Tue, May 20, 2014 at 12:32 PM, Jamie Bullock notifications@github.com wrote:

It's up to you, but personally I think it's worth keeping ml.peak (which is now working). Two reasons:

  • It offers an alternative peak-detection method, which it may be interesting to compare to minmax
  • It offers a different approach to peak detection by detecting peaks in a continuous input stream of floats

The latter could be very useful for dealing with time series data on the fly.
