grame-cncm / faust

Functional programming language for signal processing and sound synthesis
http://faust.grame.fr
Other
2.58k stars 322 forks source link

dlang backend: misuse of the static keyword #530

Closed jpcima closed 3 years ago

jpcima commented 3 years ago

In D, the equivalent keyword to static for variables is __gshared. The variables marked with static keyword are thread-locals.

References:

static is generated here: https://github.com/grame-cncm/faust/blob/c1f62de986bc91cf6f66188e53734a391c416150/compiler/generator/dlang/dlang_instructions.hh#L201-L203

sletz commented 3 years ago

So we should generate this ?

if (inst->fAddress->getAccess() & Address::kStaticStruct) { 
     *fOut << "__gshared "; 
} 
jpcima commented 3 years ago

Most likely, I would rather let the D experts comment on this one.

abaga129 commented 3 years ago

Ah yes I think gshared is prefered here but there are some caveats when using it on some OSs though. I believe in linux gshared variables get shared between instances.

p0nce commented 3 years ago

I wrote this a while ago: https://p0nce.github.io/d-idioms/#Four-ways-to-use-the-static-keyword-you-may-not-know-about

p0nce commented 3 years ago

Anyone has an example of D code output by Faust to check correctness?

p0nce commented 3 years ago

Two caveats indeed: __gshared will be shared across two dynlib invocation under POSIX / macOS. This typically cause issues in plugins. Also TLS variables in D are a "good idea" that is often used unwillingly, creating problems with callbacks, and also require the D runtime to have been initialized. They should be avoided at all cost with the -vtsl D compiler switch to be compatible with Dplug. So... it's better if there is no global variables there and instead allocate a chunk of memory at initialization time.

sletz commented 3 years ago

faust -lang dlang osc.dsp

/+ dub.sdl:
    name "mydsp"
    dependency "dplug:core" version="*"
+/
/* ------------------------------------------------------------
author: "Grame"
copyright: "(c)GRAME 2009"
license: "BSD"
name: "osc"
version: "1.0"
Code generated with Faust 2.30.2 (https://faust.grame.fr)
Compilation options: -lang dlang -es 1 -scal -ftz 0
------------------------------------------------------------ */
module mydsp;
import std.math;
import std.algorithm : min, max;
import dplug.core.nogc: mallocNew, mallocSlice, destroyFree, assumeNothrowNoGC;

struct mydspSIG0 {
nothrow:
@nogc:

  private:

    int[] iRec0 = new int[2];

  public:

    int getNumInputsmydspSIG0() nothrow @nogc {
        return 0;
    }
    int getNumOutputsmydspSIG0() nothrow @nogc {
        return 1;
    }
    int getInputRatemydspSIG0(int channel) nothrow @nogc {
        int rate;
        switch ((channel)) {
            default: {
                rate = -1;
                break;
            }
        }
        return rate;
    }
    int getOutputRatemydspSIG0(int channel) nothrow @nogc {
        int rate;
        switch ((channel)) {
            case 0: {
                rate = 0;
                break;
            }
            default: {
                rate = -1;
                break;
            }
        }
        return rate;
    }

    void instanceInitmydspSIG0(int sample_rate) nothrow @nogc {
        for (int l0 = 0; (l0 < 2); l0 = (l0 + 1)) {
            iRec0[l0] = 0;
        }
    }

    void fillmydspSIG0(int count, float[] table) nothrow @nogc {
        for (int i = 0; (i < count); i = (i + 1)) {
            iRec0[0] = (iRec0[1] + 1);
            table[i] = sin((9.58738019e-05 * cast(float)((iRec0[0] + -1))));
            iRec0[1] = iRec0[0];
        }
    }

};

mydspSIG0* newmydspSIG0() nothrow @nogc { return assumeNothrowNoGC(&mallocNew!(mydspSIG0))(); }
void deletemydspSIG0(mydspSIG0* dsp) nothrow @nogc { destroyFree(dsp); }
alias FAUSTCLASS = mydsp;

__gshared float[65536] ftbl0mydspSIG0;

class mydsp : dsp
{
nothrow:
@nogc:

 private:

    float fHslider0;
    int fSampleRate;
    float fConst0;
    float fHslider1;
    float[] fRec1 = new float[2];
    float fHslider2;
    float[] fRec2 = new float[2];

 public:

    void metadata(Meta* m) { 
        m.declare("author", "Grame");
        m.declare("basics.lib/name", "Faust Basic Element Library");
        m.declare("basics.lib/version", "0.1");
        m.declare("compile_options", "-lang dlang -es 1 -scal -ftz 0");
        m.declare("copyright", "(c)GRAME 2009");
        m.declare("filename", "osc.dsp");
        m.declare("license", "BSD");
        m.declare("maths.lib/author", "GRAME");
        m.declare("maths.lib/copyright", "GRAME");
        m.declare("maths.lib/license", "LGPL with exception");
        m.declare("maths.lib/name", "Faust Math Library");
        m.declare("maths.lib/version", "2.3");
        m.declare("name", "osc");
        m.declare("oscillators.lib/name", "Faust Oscillator Library");
        m.declare("oscillators.lib/version", "0.1");
        m.declare("platform.lib/name", "Generic Platform Library");
        m.declare("platform.lib/version", "0.1");
        m.declare("version", "1.0");
    }

    int getNumInputs() nothrow @nogc {
        return 0;
    }
    int getNumOutputs() nothrow @nogc {
        return 2;
    }
    int getInputRate(int channel) nothrow @nogc {
        int rate;
        switch ((channel)) {
            default: {
                rate = -1;
                break;
            }
        }
        return rate;
    }
    int getOutputRate(int channel) nothrow @nogc {
        int rate;
        switch ((channel)) {
            case 0: {
                rate = 1;
                break;
            }
            case 1: {
                rate = 1;
                break;
            }
            default: {
                rate = -1;
                break;
            }
        }
        return rate;
    }

    void classInit(int sample_rate) {
        mydspSIG0* sig0 = newmydspSIG0();
        sig0.instanceInitmydspSIG0(sample_rate);
        sig0.fillmydspSIG0(65536, ftbl0mydspSIG0);
        deletemydspSIG0(sig0);
    }

    void instanceConstants(int sample_rate) {
        fSampleRate = sample_rate;
        fConst0 = (1.0 / fmin(192000.0, fmax(1.0, cast(float)(fSampleRate))));
    }

    void instanceResetUserInterface() {
        fHslider0 = cast(float)(0.5);
        fHslider1 = cast(float)(500.0);
        fHslider2 = cast(float)(600.0);
    }

    void instanceClear() {
        for (int l1 = 0; (l1 < 2); l1 = (l1 + 1)) {
            fRec1[l1] = 0.0;
        }
        for (int l2 = 0; (l2 < 2); l2 = (l2 + 1)) {
            fRec2[l2] = 0.0;
        }
    }

    void initialize(int sample_rate) {
        classInit(sample_rate);
        instanceInit(sample_rate);
    }
    void instanceInit(int sample_rate) {
        instanceConstants(sample_rate);
        instanceResetUserInterface();
        instanceClear();
    }

    mydsp clone() {
        return (mallocNew!mydsp());
    }

    int getSampleRate() nothrow @nogc {
        return fSampleRate;
    }

    void buildUserInterface(UI* uiInterface) {
        uiInterface.openVerticalBox("osc");
        uiInterface.declare(&fHslider1, "unit", "Hz");
        uiInterface.addHorizontalSlider("freq1", &fHslider1, 500.0, 20.0, 8000.0, 1.0);
        uiInterface.declare(&fHslider2, "unit", "Hz");
        uiInterface.addHorizontalSlider("freq2", &fHslider2, 600.0, 20.0, 8000.0, 1.0);
        uiInterface.addHorizontalSlider("volume", &fHslider0, 0.5, 0.0, 1.0, 0.00999999978);
        uiInterface.closeBox();
    }

    void compute(int count, FAUSTFLOAT*[] inputs, FAUSTFLOAT*[] outputs) {
        float* output0 = outputs[0];
        float* output1 = outputs[1];
        float fSlow0 = cast(float)(fHslider0);
        float fSlow1 = (fConst0 * cast(float)(fHslider1));
        float fSlow2 = (fConst0 * cast(float)(fHslider2));
        for (int i = 0; (i < count); i = (i + 1)) {
            fRec1[0] = (fSlow1 + (fRec1[1] - floor((fSlow1 + fRec1[1]))));
            output0[i] = cast(float)((fSlow0 * ftbl0mydspSIG0[cast(int)((65536.0 * fRec1[0]))]));
            fRec2[0] = (fSlow2 + (fRec2[1] - floor((fSlow2 + fRec2[1]))));
            output1[i] = cast(float)((fSlow0 * ftbl0mydspSIG0[cast(int)((65536.0 * fRec2[0]))]));
            fRec1[1] = fRec1[0];
            fRec2[1] = fRec2[0];
        }
    }

}
p0nce commented 3 years ago

For ftbl0mydspSIG0:

But I think there is a problem with lines like for:

float[] fRec1 = new float[2];

what happens is that new float[2] is evaluated at compile-time with CTFE, and will point to (I believe) the same initializer for all instances. This is the reason while this line pass the @nogc attribute check.

Buffers like fRec1 should be created on a per-instance basis, using typically malloc/free or better dplug.core.vec.reallocBuffer. (EDIT: Faust can do better even by merging allocations of instance buffers I guess, this merged alloc strategy makes stuff very fast in my experience)

sletz commented 3 years ago
  • it seems it is the same data for every mydsp instance, in this case you can use __gshared. Seems correct to me. There is a race in case of multiple initialization at once, I haven't been able to find a workaround for this in years.

OK.

float[] fRec1 = new float[2];

what happens is that new float[2] is evaluated at compile-time with CTFE, and will point to (I believe) the same initializer for all instances. This is the reason while this line pass the @nogc attribute check.

Buffers like fRec1 should be created on a per-instance basis, using typically malloc/free or better dplug.core.vec.reallocBuffer.

Yes fRec1 is indeed on a per-instance basis. I'll check dplug.core.vec.reallocBuffer.

(EDIT: Faust can do better even by merging allocations of instance buffers I guess, this merged alloc strategy makes stuff very fast in my experience)

What do you mean by " Faust can do better even by merging allocations of instance buffers I guess" ?

jpcima commented 3 years ago

another problem noticed: classInit in D is not marked with the static keyword, that C++ has.

p0nce commented 3 years ago

What do you mean by " Faust can do better even by merging allocations of instance buffers I guess" ?

What I mean is that for example, an instance needs a buffer A of 1024 samples, a buffer B of 2048 samples, and a buffer C of 32 samples.

Then you can do a single malloc at once which will help with cache locality. (you probably already know about this).

When I was in the video encoding space, rearranging buffers like this were the only time we got a 11% speed-up on the whole application!

sletz commented 3 years ago

Well we tried different DSP struct fields (scalar and buffers) allocation strategies, but it seems the best (and natural in some sense) is just allocate fields in the order of the generated instructions that access them.

abaga129 commented 3 years ago

Sorry I've been away for quite some time. I've built faust with the latest on master-dev and ran faust2dplug on the pitchShifter example and tested it. I opened two instances of the plug-in in Reaper on linux and it seems to work exactly as expected. The two instances do not interfere with each other: at least not in any audible way that I can tell. I would say this issue looks resolved :smiley:

sletz commented 3 years ago

Thanks, closing.