trummerschlunk / master_me

automatic mastering plugin for live streaming, podcasts and internet radio.
GNU General Public License v3.0
546 stars 22 forks source link

whole code only works with compiler option -double #4

Closed trummerschlunk closed 2 years ago

trummerschlunk commented 2 years ago

I am not 100% sure, but I guess it's a problem wit ma.EPSILON..? Does the -double option make the code more cpu-heavy?

With the need for '-double' I can't use the faustide.grame.fr which I find quite comfortable.

Which IDEs do you guys use? I went for ATOM and Faustlive. Not too slick...

jkbd commented 2 years ago

I use Emacs in a Terminal. I run a makefile that compiles the DSP for JACK. I also run faust2svg a lot, to look at the process graph with Firefox. Starting the DSP and connecting JACK-ports is not very slick, as well. Plus, I destroyed one speaker with loud white noise accidentially, due to auto-connection. Don't try this at home...

x42 commented 2 years ago

If this is really just about machine epsilon, and not some biquad coefficinets that are close to 1.0, you could just hardcode it to 2^-23 ~= 1.19e-07 -- that's -138 dB.

jkbd commented 2 years ago

I guess that this z = x / max(ma.EPSILON, y), we are talking about, is a trick to avoid division by zero. But isn't there another way?

magnetophon commented 2 years ago

There is, like @x42 just said: use a bigger value.

x42 commented 2 years ago

A common trick is to simply add a small value (also avoid denormals)

x42 commented 2 years ago

What happens if you compile this without --double-precision-floats ?

I've used faust2lv2 with the current code. That compiles, and basic listening tests are just fine.

trummerschlunk commented 2 years ago

I remember, z = x / max(ma.EPSILON, y) was to avoid division by zero.

trummerschlunk commented 2 years ago

I can confirm in faustide, without the -double option, the lufs meter falls to -69.9 at levels below -61db. With -double it's accurate in that range.

I remember, in faustide it was even below -41 or so.

trummerschlunk commented 2 years ago

lufs_meter.dsp now has a tone generator as input for verification.

trummerschlunk commented 2 years ago

I would just swap ma.EPSILON for 2^-23 ? (it is used six times)

x42 commented 2 years ago

When you look at the C++ code that FAUST generates, there are 3 problems

 FAUSTFLOAT(10.0f * std::log10(std::max<float>(1.1920929e-07f, fConst3 * (fRec2[IOTA0 & 1048575] - fRec2[(IOTA0 - iConst20) & 1048575]))) + -0.690999985f);

Ideally one would write this like

float integrated_signal  = [Sum all fRec2] / n_elements;
float level = 10.f * log10f (integrated_signal);

We cannot address the an.ms_envelope_rect which is implemented by a delayline FIFO. This will accumulate errors over time. We can however improve upon the signal power calculation with a simple trick log_base(x) = ln (x) / ln (base)

10 * log10(x)  = 10 * ln(x) / ln (10) = 4.342944819 * ln (x)

With this change, the LUFS meter is accurate down to about -100 LUFS using single-float maths. In theory this should even provide for a larger range, but the limiting factor is the ms_envelope_rect implementation

x42 commented 2 years ago

Sadly I was wrong. While this does improve the range a bit, the actual limitation is indeed by the ms_envelope_rect implementation. Say you add a value of 1.0 while at the end of the delayline (3s later) a value of 1e-7 pops out. 1.0 - .00000001 is still 1.0. Over time an error of FLT_EPSION will accumulate which corresponds to ~ -69.2dB

jkbd commented 2 years ago

FAUST an.ms_envelope_rect(Tg) uses ba.slidingMean, which indeed has a warning attached:

It will eventually run into numerical trouble when there is a persistent dc component. If that matters in your application, use the more CPU-intensive ba.slidingRMSp.

...which should refer to ba.slidingMeanp, I guess.

x42 commented 2 years ago

I'm just looking at (ba).slidingSump but it turns out you already found a proper ready to use implementation

jkbd commented 2 years ago

should refer to ba.slidingMeanp

I created grame-cncm/faustlibraries#142.

x42 commented 2 years ago

I did a quick check

my_sump(n) = ba.slidingSump(n, 576000)/n;
my_envelope(period, x) = x * x :  my_sump(rint(period * ma.SR));

and later zi = my_envelope(Tg);

I'm sure you can turn that into more elegant FAUST code though.

x42 commented 2 years ago

I was going to suggest to use a tree structure to subgroup partial sums and take the max (like x42 ebu meter does), but looking at the FAUST generated C++ code that is already done! The generated code does however not do this in a vectorizable loop.

Some performance measurements:

Old code using using an.ms_envelope_rect (1024 spp @ 48kHz):

image

with new code using slidingSump the worst case performance is 50% worse, but probably still reasonable.

image

in particular since in the big picture, looking at soundgood.dsp, other parts are more in need of optimization:

image

--

vs. handcrafted C code in x42 ebur128 :

image
x42 commented 2 years ago

@jkbd or @trummerschlunk here's my current diff for the LUFS meter. Your FAUST-fu is a lot better than mine. Could you please cleanly integrate this into soundgood and/or lib/ebur128.dsp using some declare and without a my_ prefix?

Note the constant 576000 = 3 seconds 48kHz 4 (max SR = 192000 kHz) -- perhaps declare this explicitly or set it depending on SR at instantiation time.

diff --git a/lufs_meter.dsp b/lufs_meter.dsp
index 903c19d..996ad1c 100644
--- a/lufs_meter.dsp
+++ b/lufs_meter.dsp
@@ -32,19 +32,23 @@ process =

 // tone_generator
 tone_generator = os.osc(f) * g <: _,_ with{
-  g = vslider("tone_gen_gain",-50, -100,0,1):ba.db2linear;
+  g = vslider("tone_gen_gain",-50, -120,0,1):ba.db2linear;
   f = vslider("tone_gen_freq[unit:Hz] [scale:log]",1000,20,20000,1);
 };

+my_sump(n) = ba.slidingSump(n, 576000)/n;
+
+my_envelope(period, x) = x * x :  my_sump(rint(period * ma.SR));
+
 // +++++++++++++++++++++++++ LUFS METER +++++++++++++++++++++++++

-lk2 = par(i,2,kfilter : zi) :> 4.342944819 * log(max(1e-10)) : -(0.691) with {
+lk2 = par(i,2,kfilter : zi) :> 4.342944819 * log(max(1e-12)) : -(0.691) with {
   //Tg = 0.4; // 3 second window for 'short-term' measurement
   Tg = 3;
-  zi = an.ms_envelope_rect(Tg); // mean square: average power = energy/Tg = integral of squared signal / Tg
+  zi = my_envelope(Tg); // mean square: average power = energy/Tg = integral of squared signal / Tg

   kfilter = ebu.prefilter;
 };

-lufs_meter(l,r) = l,r <: l, attach(r, (lk2 : vbargraph("[unit:dB]out-lufs-s",-100,0))) : _,_;
+lufs_meter(l,r) = l,r <: l, attach(r, (lk2 : vbargraph("[unit:dB]out-lufs-s",-120,0))) : _,_;
magnetophon commented 2 years ago

I was going to suggest to use a tree structure to subgroup partial sums and take the max (like x42 ebu meter does), but looking at the FAUST generated C++ code that is already done!

Good to see that a function I added to the library is useful here. I wrote it because I needed a highly performant sliding minimum with variable block size, for a limiter. It was quite a challenge for me to make a binary tree and line up the different parts of the tree correctly for a blocksize that can change at runtime. I was stumped to see that unlike a sliding min, a sliding sum can be implemented with not much more than a simple delay line.

The generated code does however not do this in a vectorizable loop.

Did you try compiling with -vec or --vectorize? It's supposed to generate easier to vectorize code. I'd try it out myself if I was able to recognize vectorizable C++.

magnetophon commented 2 years ago

Could you please cleanly integrate this into soundgood and/or lib/ebur128.dsp using some declare and without a my_ prefix?

@x42 How's this?

Note the constant 576000 = 3 seconds 48kHz 4 (max SR = 192000 kHz) -- perhaps declare this explicitly or set it depending on SR at instantiation time.

Faust need to know this value at compile time.

trummerschlunk commented 2 years ago

I guess, there is no need for -double anymore...? ;)

magnetophon commented 2 years ago

I guess, there is no need for -double anymore...? ;)

Looks like it! :rocket:

x42 commented 2 years ago

Are there any other places that also require double precision or benefit from that?

sletz commented 2 years ago

Does this helps: https://faustdoc.grame.fr/tutorials/summation/ ?