at first, i found the description of the peak-prominence algorithm confusing because of the minima and maxima - but now i got it (i hope). it works as follows (in a 2D setting, i.e. a landscape above a plane):
for each peak
    for each direction (say north, north-east, east, etc.)
        walk until you find a point with higher elevation than the current peak
        note the minimum elevation along the walked line
    endfor
    take the maximum of these noted minimum elevations
    the difference of peak-height and maximum-minimum-elevation is the peak's prominence
endfor
with 2D, a practical problem may be that there are actually infinitely many possible directions to consider but in our case with 1D data, there are only two directions to consider - left and right, so the algorithm should be straightforward to implement. scipy has it also:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.peak_prominences.html
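since with 1D data there are only the two directions, the whole thing boils down to a few lines. roughly like this (just a sketch in plain C++ - not the scipy or library code, the function name is made up):

#include <vector>
#include <algorithm>

// Sketch of 1D peak prominence. p is assumed to be the index of a local maximum of x.
// Walk left and right until a sample higher than x[p] is found (or the array ends),
// note the minimum along each walk, take the larger of the two minima and subtract
// it from the peak height.
double peakProminence(const std::vector<double>& x, int p)
{
    double peak = x[p];

    double minLeft = peak;
    for(int i = p-1; i >= 0 && x[i] <= peak; i--)             // walk left
        minLeft = std::min(minLeft, x[i]);

    double minRight = peak;
    for(int i = p+1; i < (int)x.size() && x[i] <= peak; i++)  // walk right
        minRight = std::min(minRight, x[i]);

    double base = std::max(minLeft, minRight);  // the higher of the two noted minima
    return peak - base;                         // the prominence
    // the "relative prominence" mentioned below would be (peak - base) / peak
}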
i actually think that a relative prominence, i.e. peak-prominence divided by peak-height, might be an even better measure of peak relevance. i will try both
now i got it (i hope).
nope! i didn't!
with 2D, a practical problem may be that there are actually infinitely many possible directions to consider
it's actually even much worse! i'm referring to the first definition here:
https://en.wikipedia.org/wiki/Topographic_prominence#Definitions
it's not only infinitely many directions of straight lines that would have to be considered (which would still be manageable in an approximate way with a loop over a bunch of angles - like 0°, 10°,...350°) but infinitely many arbitrarily shaped paths! :-O i have no idea how one would approach that algorithmically (edit: now i have one - but don't know if that would be practical). ...fortunately, these complications don't apply to the 1D case that we are dealing with. there's no path-shape to be considered here. puuh!
as for the second idea: the book says that relevant peaks tend to be present on multiple scales. i think we don't really need to deal with the continuous wavelet transform to arrive at an algorithm based on this idea. my idea is the following: find the peaks, then apply a smoothing filter to the data and figure out which peaks are still peaks after smoothing (i think we need to tolerate the peak-index shifting one position left or right in each iteration for this check) - then iterate that. the number of smoothing passes a peak survives can be used as a measure for that peak's relevance.
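to pin the idea down, it could look roughly like this (a sketch - the names are made up and it's totally unoptimized, nothing of this is in the library yet):

#include <vector>

// true, if x[i] is higher than both of its direct neighbours
bool isLocalPeak(const std::vector<double>& x, int i)
{
    return i > 0 && i < (int)x.size()-1 && x[i] > x[i-1] && x[i] > x[i+1];
}

// one pass of a simple 3-point moving average (the end samples are just copied)
std::vector<double> movingAverage3(const std::vector<double>& x)
{
    std::vector<double> y(x);
    for(int i = 1; i < (int)x.size()-1; i++)
        y[i] = (x[i-1] + x[i] + x[i+1]) / 3.0;
    return y;
}

// Counts how many smoothing passes the peak at peakIndex survives, allowing the
// peak index to drift by one position per pass. The count can be used as a
// (crude) relevance measure for that peak.
int smoothingResilience(std::vector<double> x, int peakIndex, int maxPasses)
{
    int passes = 0, p = peakIndex;
    while(passes < maxPasses)
    {
        x = movingAverage3(x);
        int newP = -1;
        for(int k = -1; k <= +1; k++)      // tolerate a shift of one position
            if(isLocalPeak(x, p+k)) { newP = p+k; break; }
        if(newP == -1)
            break;                         // the peak was smoothed away
        p = newP;
        passes++;
    }
    return passes;
}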
here, i have plotted an array of random numbers and successively more smoothed versions of it (with a simple 3-point moving average):
it seems that the visually most significant peaks would indeed survive the smoothing best. or...well...not sure...maybe...i guess using random numbers as input is perhaps not the best test of this idea. in keeping with the analogy with landscapes, i would consider the iterative smoothing as a sort of abrasion due to wind and rain - and the measure would be a sort of peak-resilience against harsh weather conditions. ...or something
perhaps a much better test input is a random walk - i.e. the cumulative sum of an array of random numbers - here is one, together with the 8-th iteration of the smoothing filter:
those peaks that have corresponding peaks in the smoothed version would be picked, the others would be dismissed as spurious by such an algorithm. i think that looks promising. perhaps a combination of both criteria - prominence and smoothing-resilience - would be best
i'm playing around with some ideas. here is what happens when i iteratively apply smoothing and take the elementwise maximum with the original data:
i think i will call this the ropeway algorithm :-) i think it may be useful as a pre-processing step before finding peaks. it tends to de-peakify a lot of spurious peaks (i.e. turn them into non-peaks)
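the core loop is really simple - something like this (a sketch with a 3-point moving average as the smoother, not the actual implementation):

#include <vector>
#include <algorithm>

// "Ropeway" pre-processing sketch: repeatedly smooth the data and take the
// element-wise maximum with the original. The valleys between relevant peaks get
// filled in while the peaks themselves keep their original height, so many small
// spurious peaks stop being local maxima.
std::vector<double> ropeway(const std::vector<double>& x, int numPasses)
{
    std::vector<double> y(x);
    for(int pass = 0; pass < numPasses; pass++)
    {
        std::vector<double> s(y);
        for(int i = 1; i < (int)y.size()-1; i++)   // 3-point moving average
            s[i] = (y[i-1] + y[i] + y[i+1]) / 3.0;
        for(int i = 0; i < (int)y.size(); i++)     // element-wise max with the original
            y[i] = std::max(s[i], x[i]);
    }
    return y;
}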
i think i will call this the ropeway algorithm
...although i'm not sure if the valleys are catenary-shaped. they actually look more like parabolas to me
the idea of alternatingly smoothing and taking the element-wise maximum was inspired by the true-envelope algorithm, which i implemented a long time ago in matlab as part of my master's thesis:
https://hal.archives-ouvertes.fr/hal-01161334/document
it uses cepstral smoothing - but the details of how the smoothing is implemented are probably not really essential to the basic idea of the algorithm
The envelope estimator examples shown in the paper look good to me--that could be good to try, especially since you've implemented it in the past.
Your "ropeway" algorithm might be promising, too. Is the way the peaks come to points a concern, though? I wonder whether, if the peaks were treated like end points in a bezier curve, you could ensure that the envelope approaches and departs the point smoothly without affecting the point's value (since smoothing/averaging would tend to round off the peaks).
Is the way the peaks come to points a concern,
i'm not sure, if i understand that question. do you mean, if i'm concerned with the shapes of the curves that run through the peaks? if so...well - eventually, when we use the peaks for estimating an envelope - yes - but in the plot above, i'm actually only concerned about finding the (relevant) peaks. what sort of interpolant we later let pass through them is the next step. in the simplest case, one could just connect them with straight lines. or one could use a spline. ...for example, the cubic natural splines that i implemented previously could be used
you could ensure that the envelope approaches and departs the point smoothly without affecting the point's value (since smoothing/averaging would tend to round off the peaks
yes - on the other hand: using spline curves would ensure that the curve passes (smoothly) through the points, but the spline curve may overshoot a peak (in the neighborhood of the peak). here are some random points connected by lines, and hermite splines of orders 3, 5, 7 - you can clearly see the overshoot
so, which behavior is better - rounding the peak off or potentially overshooting it - may be a matter of judgement
or maybe even better is this plot - these are a couple of random points (x- and y-coordinates are random), interpolated by a natural cubic spline: connections are smooth (in fact, even 2nd order smooth - i.e. 1st and 2nd derivatives match at the junctions)...but the interpolants tend to oscillate/overshoot - which may be an undesirable feature for interpolating amplitude envelopes. ....for interpolating audio data - on the other hand - these oscillations are actually desirable - they can be seen as an approximation to sinc interpolation
i'm not sure, if i understand that question. do you mean, if i'm concerned with the shapes of the curves that run through the peaks?
Something like that, but I missed that this was in a separate thread with a separate (but related) goal of finding relevant peaks. :P
i have implemented a more efficient variation of my "ropeway" idea - the original algo was really cpu-expensive due to the iterative smoothing and maximum-taking (this required many passes through the data). the following is more efficient (it needs just 2 passes through the data): basically, run an envelope follower with zero attack and some adjustable release - once forward, once backward through the data (in (pseudo)parallel) and then take the element-wise maximum of these two passes. this can be seen here:
black: original envelope/signal/"landscape", blue: right-to-left pass, green: left-to-right pass, red: the result of combining the blue and green functions by taking their element-wise maximum and connecting the peaks of that combination:
the idea is that larger peaks cast "shadows" and the smaller peaks will be ignored if they fall under the shadow of a larger peak. i call this "peak-shadowing". the widths of the shadows can be asymmetrical, too - it may make sense to have larger rightward shadows (here in green) in the case of audio envelopes
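the core of the 2-pass idea fits into a few lines - here sketched with a simple exponential per-sample release (the actual peakPicker works with shadow widths, so the decay shape there differs - this is just to illustrate the principle, the names are made up):

#include <vector>
#include <algorithm>

// Two-pass "peak-shadowing" sketch: an envelope follower with zero attack and
// exponential release is run once left-to-right and once right-to-left, and the
// element-wise maximum of the two passes is taken. Smaller peaks that fall under
// the "shadow" of a larger peak are no longer local maxima of the result.
// releaseRight/releaseLeft are per-sample decay factors in (0,1) and can be chosen
// asymmetrically (e.g. a slower rightward decay for audio envelopes).
std::vector<double> shadowedEnvelope(const std::vector<double>& x,
                                     double releaseRight, double releaseLeft)
{
    int N = (int)x.size();
    std::vector<double> fwd(N), bwd(N), y(N);

    double env = 0.0;
    for(int n = 0; n < N; n++) {              // left-to-right pass (rightward shadows)
        env = std::max(x[n], env * releaseRight);
        fwd[n] = env;
    }

    env = 0.0;
    for(int n = N-1; n >= 0; n--) {           // right-to-left pass (leftward shadows)
        env = std::max(x[n], env * releaseLeft);
        bwd[n] = env;
    }

    for(int n = 0; n < N; n++)                // combine the two passes
        y[n] = std::max(fwd[n], bwd[n]);
    return y;
}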
here are envelopes that result with various settings (green, red, purple: half-widths of the shadows set to 10, 20, 30), blue: "fine" envelope (all peaks of the original data are taken to be relevant), cyan: "coarse" envelope (the only condition is that nothing should "stick out")
here, i have artificially created an envelope that shows some features that are typical for the rhodes samples - and i have added these little spikes in the troughs - these are the irrelevant peaks that i want to get ignored by the peak-picker (black is the envelope, blue the "shadowed" envelope). as can be seen, in the shadowed envelope, the spikes fall under the (blue) shadows and hence will be ignored (the shadows are asymmetrical here, leftward shadows are not visible): ...well - not all of them - those more to the right won't, because they actually are higher than their surroundings. and this is what the algo then would finally take as the peak-envelope
as can be seen - there's this other problem that the decay-section will just be interpolated by a straight line. to solve this, i need some sort of artificial "densification" of the envelope in areas where there aren't any peaks around. this is all well and good - but what i'm currently grappling with is how to set the parameters of the shadowing algorithm automatically. ...as well as some issues of how exactly the "artificial densification" should work
HMMM, this might not be needed but wondering if this would be fast to implement: what about getting the mid point between positive peaks and negative peaks? Essentially, do what you are doing with the blue line except have a blue line on the opposite side of the peak, then get a middle line.
wait... that just sounds like curve fitting. Why didn't curve fitting work for this?
hmmm...in the case of beating, some intermediate line between the peaks and the troughs actually makes more sense than a line through the peaks....curve fitting....hmmm - you would first have to decide what kind of curve it should be - for example, some polynomial, that fits the whole curve globally in a least squares sense? ...or maybe using segments and fit curves locally - as in splines?
this peak algorithm might benefit from smoothing with splines or something - that might be better than straight curve fitting after being smoothed.
Since I've been away from this for a while could you give me a refresher on how I test this? Here's my code:
int fs = audioFile.getSampleRate();
int bd = audioFile.getBitDepth();
auto ptr = audioFile.getWritePointer(0);
int N = audioFile.getNumFrames();
vector<double> x;   // input signal
x.reserve(N);
for (int s = 0; s < N; ++s)
    x.push_back(ptr[s]);

RAPT::rsSinusoidalModel<double> sinusoidModel = getSinusoidModel(&x[0], N, fs, f0);
std::vector<double> y = synthesizeSinusoidal(sinusoidModel, fs);

// Write the (not yet de-beated) resynthesized reference output here:
if (refPath.isNotEmpty())
{
    AudioBuffer<float> audioBuffer(1, (int)y.size());
    auto writePtr = audioBuffer.getArrayOfWritePointers();
    for (int f = 0; f < int(y.size()); ++f)
        writePtr[0][f] = float(y[f]);
    Render::file(audioBuffer, File(refPath), fs, bd);
}

// sinusoidModel now contains the sinusoidal model for the sound. Now, we set up an
// rsPartialBeatingRemover and apply the de-beating to the model data, then synthesize
// the de-beated signal and write it into a wavefile:
RAPT::rsPartialBeatingRemover<double> deBeater;
deBeater.setPhaseSmoothingParameters(5.0, 1, 4);   // cutoff = 10 leaves a vibrato
deBeater.processModel(sinusoidModel);
y = synthesizeSinusoidal(sinusoidModel, fs);

// Write the de-beated final output here:
{
    SimpleAudio audioBuffer(1, (int)y.size(), fs, bd);
    for (int f = 0; f < int(y.size()); ++f)
        audioBuffer.getWritePointer(0)[f] = float(y[f]);
    Render::file(audioBuffer, newPath);
}
return var::undefined;
i've been away from this also, but i just revisited it again and i think i'm getting better results now. there are a couple of new settings to play with:
deBeater.envExtractor.peakPicker.setShadowWidths(0.0, 0.5);
//deBeater.envExtractor.peakPicker.setWorkOnPeaksOnly(true); // default is false
deBeater.envExtractor.peakPicker.setNumNeighbors(3);
deBeater.envExtractor.setMaxSampleSpacing(0.5);
as for splines or smoothing - there is already the option to use splines, for example like so:
deBeater.envExtractor.setInterpolationMode(
    rsInterpolatingFunction<double, double>::CUBIC_HERMITE);
i have found this to produce bad overshooting artifacts, though. but maybe we could try running a smoothing filter over a linear interpolant
maybe we could try running a smoothing filter over a linear interpolant
i tried it - i get things like this:
it's smooth and doesn't suffer from overshoot - but doesn't pass through the datapoints. not sure, if that's desirable. note that the datapoints i took from the tail are purposefully coarsely sampled, so we can still see what the "densification" algo does. in practice, one would choose a higher datapoint density (ideally, something slightly above the beating period) and have a better represented tail section
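for the record, what i mean by that is really just this (a sketch - the name and parameters are made up, and samples outside the first/last datapoint are simply left at zero here):

#include <vector>

// Connect the picked datapoints (t[k], v[k]) by straight lines, sampled at the
// integer positions 0..N-1, then run a few passes of a 3-point moving average
// over the result. Smooth and free of overshoot, but the result no longer passes
// exactly through the datapoints. t is assumed to be strictly increasing.
std::vector<double> smoothedLinearEnvelope(const std::vector<int>& t,
                                           const std::vector<double>& v,
                                           int N, int smoothingPasses)
{
    std::vector<double> y(N, 0.0);

    for(size_t k = 0; k+1 < t.size(); k++)        // piecewise linear interpolation
        for(int n = t[k]; n <= t[k+1]; n++) {
            double a = double(n - t[k]) / double(t[k+1] - t[k]);
            y[n] = (1.0-a)*v[k] + a*v[k+1];
        }

    for(int pass = 0; pass < smoothingPasses; pass++) {   // smoothing filter
        std::vector<double> s(y);
        for(int n = 1; n < N-1; n++)
            s[n] = (y[n-1] + y[n] + y[n+1]) / 3.0;
        y = s;
    }
    return y;
}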
green looks desirable.
Also, when two harmonics are beating, then the peak of that beating is going to be larger in amplitude than either harmonic, so passing through the data points is not necessarily desirable. But maybe not bad either. We just need to hear the end result to decide.
How do you stop the GNU plots showing up? I can't run the function because it gets halted by seemingly infinite GNU plots
How do you stop the GNU plots showing up? I can't run the function because it gets halted by seemingly infinite GNU plots
oh - yes - this is because i have injected plotting commands into the code to see what's going on during development - because this is still under development and actually not yet ready for general use. just comment these out. to find them, you can set some debug breakpoints in the functions in rapt/Basics/Plotting.h/cpp and track the calls down via the debugger's function callstack. i guess i should provide a global, library-wide switch (with a #define or something) to turn plotting on and off
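something along these lines, maybe (purely hypothetical - none of these names exist in the library yet):

// hypothetical global switch, e.g. somewhere in rapt/Basics/Plotting.h:
#define RS_DEBUG_PLOTTING 0     // set to 1 to get the debug plots during development

#if RS_DEBUG_PLOTTING
  #define RS_PLOT(...) rsPlotVectors(__VA_ARGS__)  // forward to whatever plotting call is used
#else
  #define RS_PLOT(...) ((void)0)                   // compiles the plotting calls away
#endif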
yea, ok i commented it out. Here's before and after debeating. Here's an example where a harmonic detection algorithm doesn't take into account where the harmonic should end. Also, there's still a gap or two... strange... Still testing more samples to see if we're getting somewhere. The debeat algorithm is slow as s*** at least in debug mode.
Ok, the results are getting better. The only issue is the lack of an end point for harmonics.
...aaand the amplitude gaps. i see them in your example, too (2nd harmonic towards the end). the end-point may (hopefully) be a simple post-processing step - i imagine just running through all the partials after analysis and searching for a time-instant after which the partial remains below a certain (relative) threshold or something (see the sketch below). but these gaps are a really pesky issue. i'm still struggling with the implementation of the dolph-chebyshev window. it's a bitch. all the formulas i find online or in books just don't seem to work. fortunately, i just found working code for it in the scipy codebase - have to reverse engineer it....
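the end-point search i have in mind is really just this (a sketch - nothing like this is implemented yet, the name is made up):

#include <vector>
#include <algorithm>

// Returns the last index at which the partial's amplitude envelope still reaches
// relativeThreshold times its maximum - after that index, the partial stays below
// the threshold for the rest of its duration and could be cut off or faded out.
// Returns -1 if the envelope is empty or never reaches the threshold.
int findPartialEndIndex(const std::vector<double>& amp, double relativeThreshold)
{
    if(amp.empty())
        return -1;
    double thresh = relativeThreshold * *std::max_element(amp.begin(), amp.end());
    for(int n = (int)amp.size()-1; n >= 0; n--)
        if(amp[n] >= thresh)
            return n;
    return -1;
}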
So, there's a CRAP ton of squiggles. Could we try maybe a function of "if squiggly [if frequency is inconsistent between frames] for more than X frames, end the harmonic". Also, it would be good to smooth out the frequency so it's less squiggly.
in the debeater, i'm currently facing the problem that the amplitude envelope may contain spurious peaks which a naive peak-finding algorithm (i.e. one which just picks out all values that are higher than their left and right neighbours) will include in its found peaks. these spurious, irrelevant peaks may mess up my meta-envelopes (envelope-of-envelope). this problem of separating the relevant from the irrelevant peaks is actually a quite common problem:
https://www.google.com/search?&q=finding+relevant+peaks&oq=finding+relevant+peaks
...which is actually totally unsurprising, once one thinks about it. the tail-extender uses the criterion that a peak must be higher than two neighbours to each side (for finding spectral peaks corresponding to partials). i considered generalizing the idea to consider nL neighbours to the left and nR neighbours to the right (in some settings, it may make sense to use asymmetric neighborhoods - a sketch of that criterion is further below). that strategy would perhaps help in the debeater - but actually, it seems quite ad-hoc and crude. sifting through the google search results, i came across two ideas that seem to be a bit more principled and justifiable - this:
https://en.wikipedia.org/wiki/Topographic_prominence
and this:
https://books.google.de/books?id=8h-ZDgAAQBAJ&pg=PT788&lpg=PT788&dq=finding+relevant+peaks&source=bl&ots=ZmIkfb9Zl6&sig=ACfU3U1vLoeM7mPahOTov618ijcnCDTh1w&hl=en&sa=X&ved=2ahUKEwja5Myu2bPmAhViQUEAHZjiAqQQ6AEwB3oECAkQAQ#v=onepage&q=finding%20relevant%20peaks&f=false
a robust peak finding algorithm is actually something that may be useful in various other contexts as well - for example, in spectral envelope estimation.
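for reference, the ad-hoc nL/nR neighbour criterion mentioned above would be trivial to code - something like this (just a sketch):

#include <vector>

// Generalized naive peak criterion: x[i] counts as a peak, if it is higher than its
// nL neighbours to the left and its nR neighbours to the right (neighbours outside
// the array are ignored). nL = nR = 1 gives the usual "higher than both direct
// neighbours" condition, nL = nR = 2 is the two-neighbours-per-side condition.
bool isPeakWithNeighborhood(const std::vector<double>& x, int i, int nL, int nR)
{
    for(int k = 1; k <= nL; k++)
        if(i-k >= 0 && x[i-k] >= x[i])
            return false;
    for(int k = 1; k <= nR; k++)
        if(i+k < (int)x.size() && x[i+k] >= x[i])
            return false;
    return true;
}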
...tbc...