grame-cncm / faust

Functional programming language for signal processing and sound synthesis
http://faust.grame.fr

Crackling when compiling with -vec -vc 16, but not without the -vc 16 #186

Open eulervoid opened 6 years ago

eulervoid commented 6 years ago

Hello, I've somehow managed to write a filter that crackles when compiled with -vec -vc 16, but runs fine without the -vc 16 or without vectorisation at all. Could this be a bug on my part? Does anyone have an idea what could cause this? I basically ported the Moog LadderFilter from the juce::dsp module to Faust.

sletz commented 6 years ago

Which audio platform are you testing on? Is the Faust DSP code available somewhere?

eulervoid commented 6 years ago

I'm running the DSP module inside a JUCE program with a custom wrapper on macOS. The code looks like this:

declare name "LadderFilter";
import("stdfaust.lib");

mode  = int(hslider("filter/mode", 0, 0, 1, 1));
slope = int(hslider("filter/slope", 0, 0, 1, 1));
type = mode+slope+(mode*(1+slope));

A(i) = ((0, 0, 1, 1) : ba.selectn(4, type)),
       ((0, 0,-2,-4) : ba.selectn(4, type)),
       ((1, 0, 1, 6) : ba.selectn(4, type)),
       ((0, 0, 0,-4) : ba.selectn(4, type)),
       ((0, 1, 0, 1) : ba.selectn(4, type)) :
       ba.selectn(5, i) : *(outputGain) with { outputGain = .9; };

comp = .5, 1 : select2(mode);

cutoffFreqHz = hslider("filter/freq[map:log]", .8, 20, 20000, 0.0001) : si.smoo;
cutoffFreqScaler = (-2.0 * ma.PI) / ma.SR;
cutoffTransformValue = exp (cutoffFreqHz * cutoffFreqScaler);

resonance = hslider("filter/res", .5, .1, 1, 0.0001) : si.smoo;

drive = hslider("filter/drive", 1, 1, 10, 0.0001) : si.smoo;
gain = pow(drive, -2.642)   * 0.6103 + 0.3903;
drive2 = drive              * 0.04   + 0.96;
gain2 = pow(drive2, -2.642) * 0.6103 + 0.3903;

mix = hslider("filter/mix", .5, 0, 1, 0.0001) : si.smoo;

core(s0, s1, s2, s3, s4, in) = a, b, c, d, e, (a*A(0) + b*A(1) + c*A(2) + d*A(3) + e*A(4))
with {
  a1 = cutoffTransformValue;
  g =  a1 * (-1) + 1;
  b0 = g * 0.76923076923;
  b1 = g * 0.23076923076;

  dx = gain * ma.tanh (drive * in);
  a = dx + resonance * (-4) * (gain2 * ma.tanh (drive2 * s4) - dx * comp);

  b = b1 * s0 + a1 * s1 + b0 * a;
  c = b1 * s1 + a1 * s2 + b0 * b;
  d = b1 * s2 + a1 * s3 + b0 * c;
  e = b1 * s3 + a1 * s4 + b0 * d;
};

process = _, _ : *(1-mix), *(mix) :> (core ~ (par(i, 5, _), !) : (!, !, !, !, !, _));

sletz commented 6 years ago

It does not compile: "30 : ERROR : undefined symbol : aA"

Have you tried with another audio architecture?

sletz commented 6 years ago

-vc: you mean -vs, right?

eulervoid commented 6 years ago

Sorry, and thanks for the effort in trying to figure this out! The code was not displayed as code, so a*A(0) somehow got converted to aA. I edited the post and now it compiles. And yes, I mean -vs. I'm calling faust like this: faust $file -vec -vs 16 -lv 0 -ftz 2

sletz commented 6 years ago

I don't hear any crackles testing with faust2jaqt here, with -vec -vs 16.

Why do you add the -ftz 2 mode? This mode is only useful when deploying on an architecture without hardware FTZ (Flush To Zero), like WebAssembly.

I would advise using faust2plot or faust2octave to print the actual output samples:

https://github.com/grame-cncm/faust/tree/master-dev/tools/faust2appls
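
To make the FTZ remark above concrete, here is a minimal C++ sketch of what a software flush-to-zero on a recursive signal can look like. The names `flushToZero` and `decayWithFlush` are hypothetical, and this thread does not show how the Faust compiler implements its -ftz modes, so treat it purely as an illustration of the idea.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Hypothetical helper: replace subnormal values (magnitude below the
// smallest normal float) with zero, leaving everything else untouched.
inline float flushToZero(float x) {
    return (std::fabs(x) < FLT_MIN) ? 0.0f : x;
}

// A decaying recursion y = 0.5 * y, flushed after every step.
// Without the flush, y would spend many steps in the subnormal range,
// which is slow on hardware that does not flush to zero itself.
inline float decayWithFlush(int steps) {
    assert(steps >= 0);
    float y = 1.0f;
    for (int i = 0; i < steps; ++i) {
        y = flushToZero(0.5f * y);
    }
    return y;
}
```

On a CPU with hardware FTZ enabled this extra check is redundant, which matches the advice above to drop -ftz 2 on such platforms.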

josmithiii commented 6 years ago

It sounds like the problem is simple underrun. The -vec option can make certain programs slower. As far as I know, you simply have to try it and see.



eulervoid commented 6 years ago

Hm, a simple underrun doesn't seem too likely to me, although I found out that it works when I give my synth only one voice, which could support that theory. But my CPU is far from overloaded, and the crackling still occurs even at very large buffer sizes. I ran faustbench on the filter alone and on the whole synth voice, and for both processors it says that -vec -lv 0 -vs 16 is the best setting.

So the bug is probably in my wrapping code, since it didn't crackle for sletz with the same settings. Still, I'm very curious why it would crackle with some settings and not with others. Are parameters like hslider etc. updated in a different manner depending on these settings? Thanks for the responses so far, I really appreciate it!

PS: Is this also the place for asking questions, or is there a forum or something similar? This is kind of off topic now, but my question would be: is Faust multirate by now? It seems so since FFT is implemented, but I can't really find anything about that in the docs. I'm particularly interested in implementing oversampling, which would probably make the filter sound better.

sletz commented 6 years ago

1) I was testing LadderFilter without any "synth", basically sending a sound into the faust2jaqt-compiled LadderFilter DSP (using JACK).
2) How do you build the "synth"? Using the Faust polyphonic architecture code, or the JUCE one?
3) Control parameter (slider, etc.) values are "sampled" once per block and held for the entire block. So the buffer size of the running audio layer may change this aspect, but it does not depend on whether the code was compiled with -vec (since the samples produced inside a given block are supposed to be exactly the same).
4) Faust is not yet multirate.
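
As a rough illustration of point 3, the following C++ sketch uses a toy gain processor that reads its control value once at the top of each compute call. `BlockRateGain` and `renderWithChange` are hypothetical names, not Faust-generated code; the sketch only shows why the host buffer size decides when a slider change becomes audible.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy processor: a gain whose control value is read once
// per compute() call and held for the whole block.
struct BlockRateGain {
    float slider = 1.0f;  // stands in for a UI zone written by the host

    void compute(int count, const float* in, float* out) const {
        const float g = slider;  // sampled once per block
        for (int i = 0; i < count; ++i) out[i] = g * in[i];
    }
};

// Render `total` samples of a constant 1.0 input in blocks of `blockSize`.
// A slider change to `newValue` is requested at sample index `changeAt`;
// the processor only sees it at the first block starting at or after that.
std::vector<float> renderWithChange(int total, int blockSize, int changeAt, float newValue) {
    assert(total >= 0 && blockSize > 0);
    BlockRateGain dsp;
    std::vector<float> in(total, 1.0f), out(total, 0.0f);
    for (int start = 0; start < total; start += blockSize) {
        if (start >= changeAt) dsp.slider = newValue;
        const int count = std::min(blockSize, total - start);
        dsp.compute(count, in.data() + start, out.data() + start);
    }
    return out;
}
```

With a change requested at sample 3, a block size of 1 applies it at sample 3, a block size of 2 applies it at sample 4, and a block size of 8 never applies it within the first 8 samples.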

eulervoid commented 6 years ago

I think I'm on to something. I'm using the JUCE MPESynthesiser class to manage polyphony, and each voice contains an instance of the Faust dsp object. The issue seems to be related to the way JUCE handles the voices. Whenever a voice starts to play, the voice's renderNextBlock method gets called with startSample and numSamples, which describe the buffer region that should be rendered. I then shift the pointer for Faust's compute method to the startSample position and call it with numSamples as the block size. I'm not sure why, but this causes issues when I use the vectorisation mode. With vector size 16, it crackles all the time. Size 32 is fine, but it clicks whenever I start or release a voice. Does the vector size maybe need to be related to the block size somehow? It is still curious that this only occurs with this filter and with none of my other processors.

sletz commented 6 years ago

eulervoid commented 6 years ago

numSamples would usually be blockSize-startSample, or in rare cases it could also be something else, I think. It depends on the MIDI events that are coming in.

MPESynthesiser is a standard JUCE class that I am using as is:
https://docs.juce.com/master/classMPESynthesiser.html
https://github.com/WeAreROLI/JUCE/tree/master/modules/juce_audio_basics/mpe

The voice's render function looks like this:

void renderNextBlock(AudioBuffer<float>& buffer, int startSample, int numSamples) override {
    // Only stereo processing is supported.
    jassert(buffer.getNumChannels() == 2);

    dsp::AudioBlock<float> block(buffer);
    // What is called `dsp` here is the Faust DSP wrapper object.
    dsp.process(startSample, numSamples);
    // Mix the freshly rendered region of the wrapper's output into the host buffer.
    block.getSubBlock(startSample, numSamples).add(dsp.getOutputBuffer().getSubBlock(startSample, numSamples));
}

Basically it renders the desired region and adds it to the buffer. My Faust wrapper's render method looks like this:

void setFaustPointers(std::vector<FAUSTFLOAT*>& faustPointers, int startSample, dsp::AudioBlock<FAUSTFLOAT> &buffer) {
    jassert(startSample < buffer.getNumSamples());
    jassert(faustPointers.size() >= buffer.getNumChannels());
    dsp::AudioBlock<FAUSTFLOAT> section = buffer.getSubBlock(startSample);
    for (int c = 0; c < section.getNumChannels(); ++c)
        faustPointers[c] = section.getChannelPointer (c);
}

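virtual dsp::AudioBlock<FAUSTFLOAT> process(int startSample, int numSamples) {
    jassert(dsp->getNumInputs() == 0); // this is the function for no-input processors
    // Point each Faust output channel at startSample within the output buffer.
    setFaustPointers(faustOutputPointers, startSample, this->outputBuffer);
    dsp->compute(numSamples, faustInputPointers.data(), faustOutputPointers.data());
    // Assumption: the posted snippet has no return statement; handing back the
    // output buffer matches how the caller reads it via getOutputBuffer().
    return this->outputBuffer;
}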

setFaustPointers() shifts the starting position in the buffer to startSample and fills in Faust-compatible channel pointers from the dsp::AudioBlock.

eulervoid commented 6 years ago

I think I probably have to make sure that rendering always starts and ends at a multiple of the vector size, always rendering blocks that can be wholly divided into N vector operations. There is a method called setMinimumRenderingSubdivisionSize() in the JUCE synthesiser class, which didn't help, but maybe that's because it only limits the size of the chunks, not the possible starting and ending positions.

sletz commented 6 years ago

sletz commented 6 years ago

"i think i probably have to make sure that rendering always starts and ends at a multiple of the vector size": I'm not sure this is the issue, please first test what I'm suggesting.

eulervoid commented 6 years ago

OK, I've tried your suggestions. -vec -lv 1 -vs 16 doesn't really make a difference. And yes, numSamples goes below 16. Here is a recorded example (blockSize is 256):

startSample: 0
numSamples: 9
startSample: 9
numSamples: 247

I think this only occurs when notes are being started and stopped, though. Otherwise it's (0, 256). But even with synth.setMinimumRenderingSubdivisionSize(16, true);, numSamples sometimes drops below 16, which should not normally happen.
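
To see how the recorded regions relate to a vector size of 16, here is a small C++ sketch that splits a render region into an unaligned head, a body made of whole vectors, and a tail. `RenderSplit` and `splitAtVectorBoundaries` are hypothetical names, not part of JUCE or Faust, and nothing in this thread confirms that alignment is the actual cause of the crackling.

```cpp
#include <cassert>

// Hypothetical helper type: how a render region decomposes relative to
// vector boundaries (multiples of vectorSize within the host buffer).
struct RenderSplit {
    int head;  // samples before the first vector boundary
    int body;  // samples covered by whole vectors
    int tail;  // leftover samples after the last whole vector
};

RenderSplit splitAtVectorBoundaries(int startSample, int numSamples, int vectorSize) {
    assert(startSample >= 0 && numSamples >= 0 && vectorSize > 0);
    const int misalign = startSample % vectorSize;
    int head = (misalign == 0) ? 0 : vectorSize - misalign;
    if (head > numSamples) head = numSamples;  // region ends before the boundary
    const int remaining = numSamples - head;
    const int body = (remaining / vectorSize) * vectorSize;
    return {head, body, remaining - body};
}
```

For the example above, the region (0, 9) contains no whole vector at all (head 0, body 0, tail 9), while (9, 247) starts 7 samples before a boundary and then covers exactly 240 samples, i.e. 15 whole vectors.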

sletz commented 6 years ago

OK, it seems like a code generation bug... Then I'll have to improve our code generation testing tools to check arbitrary buffer size computation.

eulervoid commented 6 years ago

Ok, i'll stick to -scal for now then. Thanks for your help!

sletz commented 6 years ago

I've tested all our test DSP examples, calling the compute method by slices to reproduce your use case. I cannot reproduce a similar problem: the generated samples have the same values.
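
A check of this kind can be sketched in C++ as follows. This is not the actual Faust test tool: `OnePole` is a hypothetical stand-in for a generated DSP class, and `renderInSlices` and `slicedMatchesWhole` are illustrative names. The idea is to render the same input once in a single compute call and once in slices, then compare the samples for exact equality.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a generated DSP: a one-pole lowpass that keeps
// its state between compute() calls, as any recursive filter must.
struct OnePole {
    float state = 0.0f;
    void compute(int count, const float* in, float* out) {
        for (int i = 0; i < count; ++i) {
            state = 0.9f * state + 0.1f * in[i];
            out[i] = state;
        }
    }
};

// Render `input` with a fresh DSP, calling compute() on consecutive slices.
// The slice sizes are cycled until the whole input is consumed.
std::vector<float> renderInSlices(const std::vector<float>& input,
                                  const std::vector<int>& sliceSizes) {
    assert(!sliceSizes.empty());
    OnePole dsp;
    std::vector<float> output(input.size(), 0.0f);
    std::size_t pos = 0, k = 0;
    while (pos < input.size()) {
        const int size = sliceSizes[k % sliceSizes.size()];
        assert(size > 0);
        const std::size_t count =
            std::min(static_cast<std::size_t>(size), input.size() - pos);
        dsp.compute(static_cast<int>(count), input.data() + pos, output.data() + pos);
        pos += count;
        ++k;
    }
    return output;
}

// True if sliced rendering yields exactly the same samples as one big block.
bool slicedMatchesWhole(const std::vector<float>& input,
                        const std::vector<int>& sliceSizes) {
    if (input.empty()) return true;
    const std::vector<float> whole =
        renderInSlices(input, {static_cast<int>(input.size())});
    return whole == renderInSlices(input, sliceSizes);
}
```

Feeding it a 256-sample impulse with slice sizes {9, 247} reproduces the call pattern reported earlier in the thread; a DSP whose state handling depends on the block size would fail this comparison.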

AFAICS the only way to move on is to get a reduced case that shows the problem. Can you possibly prepare that? Thanks.

eulervoid commented 6 years ago

Hey, sorry, this problem was not at the top of my priority list since it works with -scal for now. But I will try to prepare a reduced case as soon as I find the time!

sletz commented 5 years ago

This may be fixed with 2.15.11.