Closed RobinSchmidt closed 2 years ago
it looks like the period of the amplitude modulation is 10 = 1/(f2-f1) = 1/0.1 - and this is indeed the perceived frequency of the amplitude modulation - but looks and auditory perception are deceiving in this case: the actual period of repetition is 20 - because the (blue) "modulator"'s period is 20, its frequency is - as as said - (f1-f2)/2 and not just f2-f1. when looking more closely at the green result, we see that every other bump/blob/grain - however you will call it - is sign-inverted.
in my sinusoidal analysis/resynthesis framework, amplitudes are non-negative numbers - so when we see analysis data of two beating sinusoids there, it will be modeled as a combination of a rectified-sine amplitude modulation together with some periodic switching of the phase between 0 and 180 degrees. so, in order to remove beating, (in the sense of trying to "recover" the "carrier" signal - which actually, in a sense, never really existed in reality (*)), it's not enough to flatten out the rectified-sine amp-envelope - but we must also somehow account for this periodic phase-inversion ...tbc...
() or did the carrier (and modulator) actually exist? i guess that's a question best left to the philosophers. haha. but i would take the position that - when the production mechanism of the sound created two sines and added them up - then the carrier wave did not really exist. we postulate* such a model in which there was a carrier that was ring-modulated by some modulator - and we want to undo that postulated ring-modulation - which never actually took place in the real world. ...a kinda strange situation indeed :-)
here are all signals shown in a single plot: black: f1, blue: f2, red: (f1+f2)/2 - "carrier", purple: (f1-f2)/2 - "modulator", green: sum or product (black + blue = red * purple): ...a bit busy, but we can see the phase-relationships between any pair
here is a theoretically idealized version of what the analysis/resynthesis system is supposed to do, when it would try to model the signal - it would have this rectified-sine amplitude envelope and the phase-switches would be interpreted as part of the "carrier" wave:
in order to remove the beating, we would replace the rectified-sine envelope with its envelope - a sort envelope-of-envelope or "meta-envelope" - that problem, i think, i have solved. but it still tries to model these phase switches - this is the problem, i need to solve next
ok - so much for the theory. now, let's see what the code actually does. here is a pair of sine waves at 995Hz and 1005Hz at 44.1 kHz sample-rate: www.rs-met.com/sounds/resynthesis/DeBeat2SinesInput.wav
this is the sound from resynthesizing it without any modification of the model data: www.rs-met.com/sounds/resynthesis/DeBeat2SinesOutputUnmodified.wav ...pretty close, the artifacts at start and end are to be expected due to the harsh (and unnatural) onset of the original - the model always applies a sort of fade-in/out (perhaps i should make that optional?), the amp-env sounds a bit less smooth than in the original, though...hmm...don't know why that is...hmmm
....and now (....drum roll....) - here is the result from the current (preliminary) "de-beating" implementation - at the moment - just removes the amplitude envelope: www.rs-met.com/sounds/resynthesis/DeBeat2SinesOutput.wav the amp-envelope is now dead-flat - as it should be (this is the part of the problem, i consider more or less solved) - but we can clearly hear these clickery clackery phase jumps where previously were minima in the amp-envelope. they are not as harsh as in the theoretical prediction above - because the system switches the phase not within a single sample-instant but within one cycle (because that's the spacing of the analysis datapoints)
the amp-env sounds a bit less smooth than in the original, though...hmm...don't know why that is...hmmm
actually, it sounds a bit like a strongly attenuated (and maybe lowpassed?) version of the clickery - which makes a lot of sense, now that i think about it. the phase discontinuities are there in this signal too - they just happen to be attenuated by the amp env. so maybe it will go away when i solve the phase inversion problem
let's have a look at the analysis data of the instantaneous phase. but plotting raw phase-data makes not much sense - it would be hard to interpret. first, we need to unwrap it (add an appropriate multiple of 2*pi to each raw datapoint (y-value) to make the whole function continuous) - that will give a function that is some sort of straight upward line, the slope of which is determined by the frequency. in such a plot, the details that we are interested in would be totally buried and masked by that linear trend. so, to take out the linear trend, we can take the (numerical) derivative - this will give (something proportional to) the measured instantaneous frequency. this has been done in this plot:
another thing we may do is to remove the linear trend by applying linear regression and subtracting the regression line. this has been done here:
in both cases, we see "discontinuities" - sharp spikes in the first, vertical edges in the second plot. these are the phase discontinuities that we need to get rid of. the theory actually predicted a square-wave but what we see is a sawtooth. this is probably an artifact of the analysis. i have some gut feeling that something like that should be expected but i didn't yet think much about that. anyway - sawtooth or square - we need to get rid of these discontinuities.....
...one obvious thing to do would be to apply a (bi-directional) lowpass filter to the (differentiated or de-trended) phase data. one could also just replace the whole sawtooth by its average value - but that would probably not be such a good idea, if - in addition to these phase-jumps due to beating - there is a legitimate frequency modulation (vibrato) going on as well. we don't want to remove such a legitimate vibrato, if present, so lowpass it is
but we cannot use a regular lowpass here, because our data is not - in general - sampled equidistantly. the sampling interval of the analysis datapoints is subject to some sort of "jitter" determined by the cycle-marks of the input signal (we have one datapoint between each pair of cycle-marks). the really correct way to handle that is to use a filter for non-uniformly sampled data. ...i guess, as a first approximation, we can just ignore this and treat the data as if it were uniformly sampled - if that seems to be a promising route, the non-uniform filter can come later as refinement. i already did some preliminary research into this subject the last few days - but this topic deserves it's own thread....
i just discovered an inetersting new service/feature - i can write ipython notebooks and publish them in my repo and others can view them - even with interactive elements:
..it takes a while until it's ready, but when it is ready, one can click the "run" button. then a plot should appear at the bottom with some widgets to tweak parameters (sometimes it appears only on tweaking the slider). you can tweak the initial phase difference and switch on/off certain signals in the plot. it's not exactly responsive but better than nothing (with a local python/notebook/jupyter installation, the responsiveness is unfortunately just as bad). a static, non-interactive view is also available on github itself:
https://github.com/RobinSchmidt/RS-MET-Research/blob/master/Notebooks/SineBeating.ipynb
...i'll probably use such things more often in the future for R'n'D and demonstrations
ok - this is what i get when i take the de-trended phase and apply a lowpass to it (treating it, as if it were uniformly sampled):
www.rs-met.com/sounds/resynthesis/DeBeat2SinesOutput2.wav
...this is already quite convincing aside from the pesky artifact at the beginning - which, i think, may be due to treating the signal like a uniformly sampled one (the distance between the first and second datapoint happens to be much shorter than the distance between all others which may well have messed up the filtering process). i could throw away that datapoint, treating it as spurious - but i hope, a non-uniform filter will handle this situation just fine
Wow, like you said, apart from that artifact in the beginning, this is really promising!
oookay...i updated the link: www.rs-met.com/sounds/resynthesis/DeBeat2SinesOutput2.wav the artifact is gone. now i have a lot of messy prototype code that needs to be cleaned up, made conveniently available form client code and i will have to implement a bunch of generalizations and optimizations for the non-uniform filters - but the general idea seems to work! ...and the non-uniform filters will probably become an important new tool in my toolbox to process the sinusoidal model data.
the class rsPartialBeatingRemover can already be used - but it's still in a very preliminary state. and i really need to do some experiments on real-world data next....
What's your status on this? Should I use it in its current state?
i just added some user options to the class rsPartialBeatingRemover (the setPhaseSmoothingParameters
function). before, these settings were hard wired. as it is now, i think, it should be reasonable to use it, when you restrict yourself to using a filter order of 1 (and possibly using a higher "numPasses" value). the reason is that the current implementation of the bidirectional filter is a butterworth filter and higher order butterworths may have overshoot - which we probably don't want in this context. see also my comments in harmonicDeBeating1
in the "TestsRosicAndRapt" project
i want to switch to using Bessel- or Gaussian filters later - or make it a user option. i need to fix some cutoff normalization problem with the gaussian filter - when done, i can also finally integrate it into EngineersFilter in ToolChain - something i wanted to do since quite a while already
it's interesting to note that, if you set up the cutoff frequency of the smoothing filter too high, you will end up with an output signal that features a vibrato. that's not surprising because we are smoothing out a discontinuous phase-modulation with a rectangular wave (or sawtooth - still not quite sure why the analysis results differ from the prediction) - and what remains is a sinusoidal phase modulation, if the filter lets the fundamental of the phase-modulator pass. only when the cutoff is low enough to block even the fundamental of the phase-modulator, we will see a sine with a more or less stable frequency
the DeBeat2SinesOutput2.wav is a little dirty, you might need to clean up the results with a filter.
Can you write some instructions on how to use the test project to debeat a sine wave wav file? This is confusing:
I expect to see something like:
load input data function set parameters function get output function
i have added a demo function to the test repo. https://github.com/RobinSchmidt/RS-MET-Tests it loads a wavefile, de-beats it and writes the result out to a wavefile
void testDeBeating(const std::string& name, std::vector<double>& x, double fs, double f0)
{
int N = (int) x.size(); // x is the input signal
// create and set up analyzer and obtain sinusoidal model:
RAPT::rsHarmonicAnalyzer<double> analyzer;
analyzer.setSampleRate(fs);
analyzer.setSpectralOversampling(4);
analyzer.setNumCyclesPerBlock(4);
analyzer.setWindowType(stringToWindowType("hm")); // options: rc,hn,hm,bm,bh
analyzer.getCycleFinder().setFundamental(f0);
RAPT::rsSinusoidalModel<double> mdl = analyzer.analyze(&x[0], N);
// mdl now contains the sinusoidal model for the sound. Now, we set up a rsPartialBeatingRemover
// and apply the de-beating to model data:
rsPartialBeatingRemover<double> deBeater;
deBeater.setPhaseSmoothingParameters(5.0, 1, 4); // cutoff = 10 leaves a vibrato
deBeater.processModel(mdl);
// mdl now contains the modified, de-beated model data - synthesize the signal from the model and
// write it into a wave file (for convenient reference, also write the input next to it):
std::vector<double> y = synthesizeSinusoidal(mdl, fs);
rosic::writeToMonoWaveFile("DeBeatOutput.wav", &y[0], (int)y.size(), (int)fs);
rosic::writeToMonoWaveFile("DeBeatInput.wav", &x[0], (int)x.size(), (int)fs);
// ...we should probably use the passed name somehow for naming the files
}
the DeBeat2SinesOutput2.wav is a little dirty, you might need to clean up the results with a filter.
hmm - interesting. i need to think about what may be the reason. the harmonics seems to get reduced, when lowering the cutoff of the phase-smoothing filter. this is done in
deBeater.setPhaseSmoothingParameters(5.0, 1, 4);
the first parameter is the cutoff, the second is the filter order and the third is the number of (bidirectional) passes - so you'll notice that i'm currently just setting it to a first order filter (butterworth - but for 1st order, it doesn't matter - they are all the same - the differences appear at higher orders). i'll try to get the gaussian design to work tomorrow
...btw. - if you like, we may meet on gmail soon
Well, the debeating is promising. I think we should move forward. The next step is to allow me to select a range of frequencies to debeat, and leaveas much of the original content untouched.
As you can see, there's many harmonics that were missed. We need more sensitive settings, and it needs to grab the harmonics only, not the surrounding noise content (doesn't have to perfect). In this example, the pitch was also an octave higher at the start, then went down to the desierd pitch. Going to retry with a specified frequency.
Edit: I tried specifying a frequency, same behavior.
whoa - the analysis results look really bad. but i would suspect this to be more like a problem with the analysis, not the de-beating. did you try to resynthesize it without modification? and/or tweak the analyzer settings? should i try this myself with the problematic sample?
Its the Elan/Rhodes_F2 sample. I dont see what settings i would adjust. I expect to see a sensitivity setting or threshold setting
Its the Elan/Rhodes_F2 sample
where? in my tests repo, i only have an Elan/Rhodes_F3.wav
That's what I meamt
We've been using the C3 middle C standard for our naming, but these Rhodes samples are based on C4 as middle C. Any chance that might have something to do with this issue?
ok - it's definitely the analysis going berzerk. i'm investigating. i also noticed that my non-uniform filters take quite a long time to compute. i have an idea for a different implementation that might be more efficient - but first let's see what's wrong with the analysis here
We've been using the C3 middle C standard for our naming, but these Rhodes samples are based on C4 as middle C. Any chance that might have something to do with this issue?
yes, probably, if you have renamed the sample after i've included it into my test data set
ok - it seems, i have to refine my envelope-of-envelope extraction. simply connecting local maxima of the 1st order envelope doesn't cut it. here is the amp-env of the 6th partial of the rhodes (black: original, blue: after de-beating): in the left, attack section, it does what it should but toward the right, there are not enough local maxima - there's only one at the far right which gets connected to the left section via a straight line. maybe i need some kind of minimum density of amplitude data-points in order to take data-points in the middle, even if there aren't any local maxima
in the left, attack section, it does what it should
...or - well - one could also perhaps want an envelope that doesn't connect the peaks but instead somehow tracks the local average
Hey Robin, can you create a function so that I can remove a range of frequencies spectrally?
I need to split audio files from 0 to 150hz and 150hz to the upper limit (half samplerate), spectrally (I'm bringing them back together after a process, so combining them needs to produce the original signal)
Need this as soon as possible, I'm sure you already have the facilities, perhaps you just need to tell me what functions to use.
as soon as you have the rsSinusoidalModel object in hand (from analyzing an audio file), you can remove partials via the member functions removePartial, keepOnly, removePartialsAbove, removePartialsWithMeanFreqAbove
. if you need additional functions to remove/keep partials by other cirteria, just tell me, what criteria you need, and i can add functions for that - or you loop through the partials yourself and investigate them one by one (via getPartial
) and in your loop, build up a list of partials to keep and then use that list with keepOnly ...oh - i guess, i should probably make a function removePartials that takes a list of indices to remove
getPartial(size_t index)
returns you a reference to the respective partial and on that returned reference, you can inquire infos about the partial - for example with functions like getMinFreq, getMaxFreq, getMeanFreq
....i think, i'll add a pair functions getLowpassPart(splitFreq), getHighpassPart(splitFreq)
ok - done - but i added the functions a static functions to the class rsSinusoidalProcessor. they can be used like this:
double splitFreq = 1000;
RAPT::rsSinusoidalModel<double> lp = rsSinusoidalProcessor<double>::extractLowpassPart(mdl, splitFreq);
RAPT::rsSinusoidalModel<double> hp = rsSinusoidalProcessor<double>::extractHighpassPart(mdl, splitFreq);
where mdl
is the rsSinusoidalModel object
ok, this is just creating a highpass/lowpass filter? This latest request isn't about harmonic extraction. What is a sinusoidalmodel?
the rsSinusoidalModel class is the main data-structure that is used in sinusoidal analysis/resynthesis framework. it's the sinusoidal representation of the sound. i thought, you wanted to split the audio on the level of that data-structure. but if this is not the case and you just need a pair of lowpass/highpass filters with prefect reconstruction properties, you can just use a bidirectional lowpass filter and obtain the highpass part by time-domain subtraction
Surely you have functions to do this in the frequency domain? I want to get used to using frequency domain functions. Better start with this easy one.
you mean, you want to zero out frequency bins in the spectrogram representation? yes, sure, this should be easy to do. i guess, it would make sense, if i add a class rsSpectrogramProcessor (analogous to rsSinusoidalProcessor) to collect functions that manipulate spectrograms
Giving me easy access to manipulate bin volumes could be the start of creating denoisers and other spectral processes that could eventually lead to a sellable product.
First step: simple highpass/lowpass filter (zero-out bins above or below a frequency).
for the most basic denoiser:
to make the denoiser better:
rsSpectrogram has now functions lowpass
and highpass
. the spectrogram is just a matrix of complex numbers, so accessing a particular value is just a matter of writing s(i, j)
where i is the frame-index and j is the bin index (and s is the spectrogram matrix itself). there's also a little helper function frequencyToBinIndex
to help you convert between between physical frequencies and bin indices
Could I have a usage example starting with an audio buffer or sample array? then back to a sample array.
search for spectrogramSine in the TestsRosicAndRapt project
I'm getting lots of command windows opening and closing. you have GNU plotting code in the debeating functions. Could that be the cause? The windows seem to open and close forever, not sure if it will ever conclude.
this line specifically: deBeater.processModel(mdl);
Edit: I'm implementing into SALT so I can use the scripting engine to control it.
ok i commented out the gnu plotting stuff. Results are not horrible!
Main difference is the more linear decay of the harmonics (should be more exponential), and alsoooo the harmonics end too abruptly.
Also, I need to choose a range of frequencies to debeat, I don't want the entire thing to be affected.
Another thing, the transient/attack is gone. The hammery sound is gone.
Actually, the upper half of the harmonics look good.
I can basically fix the transient by doing a little crossfade with the original:
I can take the bottom half of ORIGINAL frequencies + the upper half of DEBEAT frequencies and get this:
http://www.elanhickler.com/transfer/Rhodes_Original.wav http://www.elanhickler.com/transfer/Rhodes_DebeatAndEdited.wav
This algorithm takes an insane amount of time to do one file. Can you improve it?
How could this happen? Causing crash:
I can't get through even a few files without a crash. You'll have to run the samples for yourself and fix the issue.
Frequencies: F#3: 369.99442271163439955 F3: 349.22823143300399806 G#3: 415.30469757994524116 G3: 391.99543598174932413
Samples: http://www.elanhickler.com/transfer/Rhodes_NeedDebeat_samples.zip (20 mins to upload)
whoa! why so many samples? one that exposes the problem would be sufficient. i (randomly) picked this one "Rhodes Tuned F3 V12TX -16.4 10-17-16.wav" ...yes...it exposes the crash. somehow, the cycle-mark finder seems to mess up and find cycles of 1 sample length. hmmm...
I sent them to you so that once you fix the issue for one sample, you have more samples to verify that there isn't more issues.
i think, i should fix this other problem first: when the envelope shows beating in the initial section and no beating in the final section - which is typical for the rhodes samples, it messes up: ...it doesn't find maxima to connect in the right half anymore. i need some strategy to create datapoints regardless of whether there are maxima or not. i currently just connect the peaks - which works only, if there actually are peaks but messes up in sections where there are no peaks anymore. seems like it goes into linear extrapolation mode then. i didn't think of that situation. just connecting the peaks is not good enough
you still working on solving the crash?
yeah - i didn't get much done recently because i'm grappling with a weird health issue. a tingly numbness in both legs, from the hips downward. like this weird feeling when you hit the elbow in a bad way. ...lots of doctor appointments and stuff. tomorrow, i'll get my results from a blood-test and MRI-scan
as everybody knows, when we add two sinusoids of frequencies that are close to each other, we get a periodic pattern of constructive and destructive interference between the two sines and the end result is a signal that looks like a sinewave with a sinusoidal amplitude modulation imposed on it. here are two sinewaves of frequencies f1 = 0.95 (black) and f2 = 1.05 (blue) and their sum (green):
and we see the typical beating pattern in the sum at the difference frequency f2-f1. via the trigonometric identity
sin(a) + sin(b) = 2 * sin((a+b)/2) * cos((a-b)/2)
, this sum-of-sines can be re-expressed as a product-of-sines - one at the average frequency (f1+f2)/2 (half-the-sum, = 1 in the example), seen as "carrier" (shown below in black) and one at half-the-difference frequency (f1-f2)/2, seen as modulator (blue):in this case, their product is shown in green and this green signal is exactly the same as the green signal in the first plot. ...tbc...