RobinSchmidt / RS-MET

Codebase for RS-MET products (Robin Schmidt's Music Engineering Tools)

Need amplitude matching algorithm ASAP #293

Open elanhickler opened 4 years ago

elanhickler commented 4 years ago

Given waveform A and waveform B, what is the negative time offset for B in which A and B share common amplitude?

waveform A and B image

example time offset for match image

Add it to my tab, can you get this done within 7 days? Thank you! Edit: Actually I will pay for this work right away.

RobinSchmidt commented 4 years ago

> how do I get the two envelopes for the inputs? Can I leave that blank?

blank? how are you supposed to compute an output value without input data? no, you need to input the two envelopes. to get them, you may use any of our envelope-detector classes (how did you do that so far? this aspect has not changed). i added the decimation feature. the function is now called rsEnvelopeMatchOffset. there are still some things, that i need to improve and clean up, though

edit: i added some documentation. as for the envelope follower: i think, a simple, low-quality one should be sufficient for this. due to the decimation, there's no point in using our super-smooth HQ mastering-compressor-grade envelope detector for this. i'm still using naive decimation without pre-filtering. i'll improve that tomorrow
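
The difference between naive decimation and decimation with a crude pre-filter can be sketched as follows. This is a minimal illustration, not the RAPT code: `decimateNaive` and `decimateViaMean` are hypothetical helper names, and block averaging is only one simple choice of pre-filter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the RAPT implementation): naive decimation keeps
// every D-th sample and discards the rest, with no filtering beforehand.
std::vector<double> decimateNaive(const std::vector<double>& x, int D)
{
  std::vector<double> y;
  if (D <= 0) return y;
  for (std::size_t n = 0; n < x.size(); n += static_cast<std::size_t>(D))
    y.push_back(x[n]);
  return y;
}

// Hypothetical helper (not the RAPT implementation): replace each complete
// block of D samples by its mean. The averaging acts as a crude lowpass before
// the rate reduction. Trailing samples that do not fill a block are dropped.
std::vector<double> decimateViaMean(const std::vector<double>& x, int D)
{
  std::vector<double> y;
  if (D <= 0) return y;
  const std::size_t blockSize = static_cast<std::size_t>(D);
  const std::size_t numBlocks = x.size() / blockSize;
  y.reserve(numBlocks);
  for (std::size_t b = 0; b < numBlocks; ++b) {
    double sum = 0.0;
    for (std::size_t k = 0; k < blockSize; ++k)
      sum += x[b * blockSize + k];
    y.push_back(sum / static_cast<double>(blockSize));
  }
  return y;
}
```

With the input 1, 3, 5, 7, 9 and D = 2, the naive version keeps 1, 5, 9, while the averaging version yields 2, 6.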

elanhickler commented 4 years ago

bah getting bad results. Please demonstrate that this is actually working with real world samples. I could be doing something wrong.

RobinSchmidt commented 4 years ago

ok - first real world test over here - again with 42/31 pair (here, the raw amplitude is plotted - no decibels): image

zoomed in:

image

and with the formerly problematic 42/34 pair (somewhat zoomed in):

image

RobinSchmidt commented 4 years ago

the second one with the two bumps in the shiftee could perhaps be a bit more to the left...i have some additional ideas how to make it better.....but i think, it's not that bad for a first real-world test.

elanhickler commented 4 years ago

Could you post your code? Here's mine.

auto& take1 = ItemList[i].getActiveTake();
auto& take2 = ItemList[i + 1].getActiveTake();

AUDIOPROCESS::loadTake(take1);
AUDIOPROCESS::loadTake(take2);

double sampleRate = take1.getSampleRate();
double frequency = MIDI(take1.getTag("n")).getFrequency();
double cycleTimeMS = 1.0 / frequency * 1000.0;
double samplesPerCycle = 1.0 / frequency * sampleRate;

vector<double> envelopeData1;
envelopeData1.reserve(take1.getNumFrames());
vector<double> envelopeData2;
envelopeData2.reserve(take2.getNumFrames());

RAPT::rsEnvelopeFollower<double, double> enveloper;
enveloper.setSampleRate(sampleRate);
enveloper.setAttackTime(cycleTimeMS);
enveloper.setReleaseTime(cycleTimeMS);

for (int i = 0; i < take1.getNumFrames(); ++i)
    envelopeData1.push_back(enveloper.getSample(take1.getAudioChannel(take1.getFirstChannel())[i]));

enveloper.reset();

for (int i = 0; i < take2.getNumFrames(); ++i)
    envelopeData2.push_back(enveloper.getSample(take2.getAudioChannel(take2.getFirstChannel())[i]));

// match the envelopes computed above (not the raw audio), decimating by one cycle
double offset = RAPT::rsEnvelopeMatchOffset<double>(
    envelopeData1.data(), (int)envelopeData1.size(),
    envelopeData2.data(), (int)envelopeData2.size(),
    (int)samplesPerCycle);
offset /= sampleRate;

ITEM(take1.getMediaItemPtr()).setPosition(ITEM(take2.getMediaItemPtr()).getStart() + offset);

RobinSchmidt commented 4 years ago

void testEnvelopeMatching2(std::vector<double>& x1, std::vector<double>& x2)
{
  typedef std::vector<double> Vec;

  // extract envelopes:
  rsEnvelopeFollower<double, double> ef;
  ef.setSampleRate(44100);  // make this a function parameter
  ef.setAttackTime(0.0);    // in ms?
  ef.setReleaseTime(200.0);
  Vec e1(x1.size()), e2(x2.size());
  int n;
  for(n = 0; n < (int )x1.size(); n++) e1[n] = ef.getSample(x1[n]);
  ef.reset();
  for(n = 0; n < (int )x2.size(); n++) e2[n] = ef.getSample(x2[n]);

  // find match offset:
  double dt = 0;
  int D = 30;  // decimation factor
  dt = rsEnvelopeMatchOffset(&e1[0], (int) e1.size(), &e2[0], (int) e2.size(), D);

  // create the two time axes and decimated envelopes for plotting (using the same decimation
  // factor as for matching):
  Vec e1d = rsDecimateViaMean(e1, D);
  Vec e2d = rsDecimateViaMean(e2, D);
  Vec t1d(e1d.size()), t2d(e2d.size());
  for(n = 0; n < (int) t1d.size(); n++)  t1d[n] = n * D;
  for(n = 0; n < (int) t2d.size(); n++)  t2d[n] = n * D + dt;  // shift the second time axis by the match offset

  // plot:
  GNUPlotter plt;
  plt.addDataArrays((int) t1d.size(), &t1d[0], &e1d[0]);
  plt.addDataArrays((int) t2d.size(), &t2d[0], &e2d[0]);
  plt.plot();
}

elanhickler commented 4 years ago

soooo does my code look good? I notice you are using 0 attack. Also, I need you to test multiple examples. I need proof that your algorithm is reaaallllyyy working. Edit: oh you got two there.

RobinSchmidt commented 4 years ago

> soooo does my code look good?

looks ok to me. at least, i can't see any obvious errors. but yeah - maybe try to tweak the envelope follower time constants. ...but when thinking about it, a cycle for attack and release is probably ok. having equal values for attack and release basically turns the envelope follower into lowpass-of-abs (not necessarily a bad thing, just for info).
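
To illustrate the lowpass-of-abs remark, here is a minimal sketch of a textbook one-pole envelope follower. It is an assumption for illustration only and not the `RAPT::rsEnvelopeFollower` code; the struct name, the `setup` method and the coefficient formula are hypothetical.

```cpp
#include <cmath>

// Hypothetical textbook envelope follower (not RAPT::rsEnvelopeFollower):
// rectify the input, then smooth it with a one-pole filter whose coefficient
// depends on whether the envelope is currently rising or falling.
struct SimpleEnvelopeFollower
{
  double attackCoeff  = 0.0;  // smoothing coefficient while the envelope rises
  double releaseCoeff = 0.0;  // smoothing coefficient while the envelope falls
  double state        = 0.0;  // current envelope value

  // Convert a time constant in milliseconds into a one-pole coefficient.
  // A time of zero gives a coefficient of zero, i.e. instantaneous tracking.
  static double coeffFromMs(double timeMs, double sampleRate)
  {
    if (timeMs <= 0.0) return 0.0;
    return std::exp(-1.0 / (0.001 * timeMs * sampleRate));
  }

  void setup(double attackMs, double releaseMs, double sampleRate)
  {
    attackCoeff  = coeffFromMs(attackMs,  sampleRate);
    releaseCoeff = coeffFromMs(releaseMs, sampleRate);
  }

  void reset() { state = 0.0; }

  double getSample(double in)
  {
    const double rect = std::fabs(in);
    const double c = (rect > state) ? attackCoeff : releaseCoeff;
    state = rect + c * (state - rect);  // same as (1-c)*rect + c*state
    return state;
  }
};
```

When both coefficients are equal, the branch in `getSample` makes no difference and the update reduces to a plain one-pole lowpass applied to `|in|`. With zero attack, the envelope jumps to each new peak immediately and only the release is smoothed.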

maybe try less decimation? (like half a cycle, quarter, ..)...but this is just guessing. i actually think, one full cycle should be optimal. (my fixed value of 30 is actually less than a cycle - a cycle is roughly 100 samples long in these examples - but i tried with 100 and got similar results)
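
The relation between pitch and decimation factor mentioned here can be sketched as a small helper. The function name and the `cycles` parameter are hypothetical; the only fact taken from the thread is that one cycle is `sampleRate / frequency` samples long.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: decimation factor corresponding to a given number of
// cycles of a tone (1.0 = full cycle, 0.5 = half a cycle, ...), rounded to the
// nearest whole sample and clamped to at least 1.
int decimationFromPitch(double sampleRate, double frequencyHz, double cycles = 1.0)
{
  if (sampleRate <= 0.0 || frequencyHz <= 0.0) return 1;
  const double samplesPerCycle = sampleRate / frequencyHz;
  return std::max(1, static_cast<int>(std::lround(cycles * samplesPerCycle)));
}
```

For a 441 Hz tone at 44100 Hz, a full cycle gives a factor of 100, half a cycle 50 and a quarter cycle 25.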

elanhickler commented 4 years ago

please test 36, 37, 39 with 42. It's not working. I'm getting good results with the ones you tested.

RobinSchmidt commented 4 years ago

36: image

37: image

39: image

RobinSchmidt commented 4 years ago

...decimation was set to 100, in these cases

elanhickler commented 4 years ago

not sure if those are actually correct, need to see the full waveform. I'm coding something where I can see the envelope and waveform for myself...

elanhickler commented 4 years ago

with attack the result is a lot less accurate in terms of envelope matching the original

image

without attack:

image

but this does not seem to affect results with the files I was using. The results are bad (but it looks like the algorithm is working as intended):

image

this is what I want:

image

elanhickler commented 4 years ago

Here are more examples, actual result vs desired result

The first example is weird because it seems to be more arbitrary than the other examples. If your algorithm messes up 10% of the time that's ok. The rest of the examples seem to follow a logic of "where it overlaps the most"

image

image

image

image

image

image

RobinSchmidt commented 4 years ago

> with attack the result is a lot less accurate

you mean, when you use a nonzero attack setting for the envelope follower, or when the original envelope actually has an attack?

...anyway, i think, i should probably merge the algorithm into the old class, so we may use all these "ignore" facilities there, too. ...such that the algorithm may ignore the attack section in the matching process

oh - i guess, i see now what you mean - you mean the systematic under-estimation of the actual envelope, when there's a non-zero attack?

> but this does not seem to affect results

oh yes - this sort of systematic underestimation is the same everywhere (given that you use the same settings for both signals)

RobinSchmidt commented 4 years ago

> Here are more examples, actual result vs desired result

hmm - the last three, 42/36, 42/37 and 42/39 are the pairs that i tried and posted myself above - and i seem to get different results. what numbers do you get for the shifts with these?

edit: i get: 33781.446011870466, 28849.226104615831, 27732.343100411363 (attack: 0, release: 200, decimation: 100)

RobinSchmidt commented 4 years ago

> rest of the examples seem to follow a logic of "where it overlaps the most"

the logic is: where the sum of the absolute differences attains a minimum

edit: this "sum-of-absolute-differences" is what i call the "dissimilarity function" in this context (it's a function of the shift). we can try different dissimilarity functions - or similarity functions (such as cross-correlation) and then look for the maximum. but that's the basic idea: define a suitable (dis)similarity function -> compute it for the two envelopes at hand -> find the minimum (or maximum)
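
The basic idea (define a dissimilarity function of the shift, evaluate it, pick the minimum) can be sketched as below. This is a simplified illustration under stated assumptions, not the `rsEnvelopeMatchOffset` implementation: the function names are hypothetical, only integer shifts are searched, and the sum of absolute differences is divided by the overlap length (my assumption) so that shifts with a short overlap are not favored.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical dissimilarity function: mean absolute difference between
// envelope a and envelope b after moving b to the right by `shift` samples
// (negative shift = move left), evaluated over the overlapping region only.
double meanAbsDifference(const std::vector<double>& a,
                         const std::vector<double>& b, int shift)
{
  const int start = std::max(0, shift);
  const int end   = std::min((int)a.size(), (int)b.size() + shift);
  if (end <= start) return std::numeric_limits<double>::infinity();  // no overlap
  double sum = 0.0;
  for (int n = start; n < end; ++n)
    sum += std::fabs(a[n] - b[n - shift]);
  return sum / (end - start);
}

// Hypothetical brute-force search: try every integer shift in a range chosen
// to exclude very short overlaps (assuming both inputs are at least minOverlap
// samples long) and return the shift with the smallest dissimilarity.
int findBestShift(const std::vector<double>& a,
                  const std::vector<double>& b, int minOverlap)
{
  const int na = (int)a.size();
  const int nb = (int)b.size();
  int bestShift = 0;
  double bestValue = std::numeric_limits<double>::infinity();
  for (int s = minOverlap - nb; s <= na - minOverlap; ++s) {
    const double d = meanAbsDifference(a, b, s);
    if (d < bestValue) { bestValue = d; bestShift = s; }
  }
  return bestShift;
}
```

As a usage example, if `b` is a copy of an exponentially decaying envelope `a` with its first 30 samples cut off, the dissimilarity is exactly zero at a shift of 30 and `findBestShift` returns that value. Swapping in a similarity function such as cross-correlation would mean replacing `meanAbsDifference` and searching for the maximum instead.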

elanhickler commented 4 years ago

I'm going to try the new debeating algorithm. In the meantime could you try one more attempt at the envelope matching? I could opt to do everything by hand if your next attempt doesn't work.

RobinSchmidt commented 4 years ago

yes - but i somehow need to reproduce your bad results to see where it goes wrong. as said - my results seem to be different - as in "better". as said, for the 42/39 pair, where you got this result:

image

my result was this:

image

which looks actually ok. .....soo, it seems we must be using different settings or something

elanhickler commented 4 years ago

I don't know what to do from here. I'm using the same settings you are.

RobinSchmidt commented 4 years ago

hmm - maybe instead of me trying to reproduce your results, you could try to reproduce my results. grab the latest version of the test repo (together with the latest version of the main rs-met repo) and run this test:

image

you should get the same plot, i posted above (but zoomed out). you could then use the debugger to figure out, where the differences in our code paths are

RobinSchmidt commented 4 years ago

btw. - if you didn't already install gnuplot by itself, you don't have to, because it also comes packaged with gnu octave (assuming you did not uninstall it again). you would just need to set:

gnuplotPath = "C:/Octave/Octave-5.1.0.0/mingw64/bin/gnuplot.exe";

in

...\RS-MET\Libraries\RobsJuceModules\rapt\Basics\GNUPlotter.cpp

...right in the constructor of the GNUPlotter class. i assume that you installed octave into the default installation path - otherwise, change the path string appropriately

elanhickler commented 4 years ago

do you get different results whether you put 42/39 as x1/x2 vs x2/x1?

RobinSchmidt commented 4 years ago

ooh - yes - when switching the inputs, i get a garbage result:

image

...the process is not symmetric/commutative (edit: hmm....maybe it should be?).

btw. i just added the samples to the repo. so far i just had them locally, but forgot to add them to the repo (sorry for the inconvenience - i thought they were already in the repo). if there are any issues regarding copyright stuff with adding the samples to the repo, just tell me and i will remove them. it's just convenient to have them in the test repo