grame-cncm / faust

Functional programming language for signal processing and sound synthesis
http://faust.grame.fr

Interpreter: Bad performance for this instrument #988

Closed kmatheussen closed 10 months ago

kmatheussen commented 10 months ago

Tested Faust version: 2.54.9

Sorry for the noise if this has already been fixed in a newer version of Faust.

Steps to reproduce:

  1. Download and run latest demo version of Radium: https://users.notam02.no/~kjetism/radium/downloaddemo_withoutsubscribing.php
  2. OpenProjects -> Demo Songs -> Poptikus
  3. Click on "10: Chorus 1" on the right side of the screen, i.e. open block 10.
  4. Move the editor cursor to track 7: "Voice"
  5. Play block (right shift)
  6. Open the "Instrument" tab at the bottom of the screen.
  7. Click the "Interp." button to switch from the LLVM backend to the Interpreter backend.
  8. Notice CPU usage going up a lot. On my laptop from 2013, this instrument is now using around 150% CPU.

Ref: https://github.com/kmatheussen/radium/pull/1410

Or: Try this instrument:

import("stdfaust.lib");

// MIDI parameters 
freq = nentry("freq",200,40,2000,0.01);
bend = nentry("bend",1,0,10,0.01);
gain = nentry("gain",1,0,1,0.01);
gate = button("gate");

// Envelope
env = en.adsr(0.1,0.2,0.4,0.4,gate);

// Vibrato
vibratoFreq = 5;
vibratoGain = 0.03;
vibrato = os.osc(vibratoFreq)*vibratoGain + 1;

// Vowel Change
vowelFreqL = 0.4;
vowelGainL = 0.4;
vowelL = os.osc(vowelFreqL)*vowelGainL + 0.5;
vowelFreqR = 0.5;
vowelGainR = 0.5;
vowelR = os.osc(vowelFreqR)*vowelGainR + 0.5;

// Generator
genL = pm.SFFormantModelBP(0,vowelL,0,freq*bend*vibrato,gain);
genR = pm.SFFormantModelBP(0,vowelR,0,freq*bend*vibrato+0.5,gain);

process = genL*env, genR*env;

sletz commented 10 months ago

Yes indeed, heavy code even with LLVM.

kmatheussen commented 10 months ago

I guess I should have written how much worse it is. On an M3 MacBook, this instrument uses around 2% CPU with LLVM and around 80% CPU with the interpreter. That is a lot more than usual: the interpreter is normally maybe 3-4x slower, but for this instrument it is around 40x slower.

kmatheussen commented 10 months ago

Maybe the interpreter is getting exponentially slower compared to LLVM the bigger the instruments are?

kmatheussen commented 10 months ago

I guess a more probable reason is that various fixed overheads make the CPU difference between LLVM and the interpreter look small for lighter instruments, while for heavier instruments the difference in performance is much more noticeable.
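
To illustrate that reasoning, here is a minimal sketch in Python. It assumes the CPU reading for an instrument is a fixed host-side baseline plus the DSP cost, and that the interpreter multiplies only the DSP part. All numbers and the `observed_ratio` helper are hypothetical and chosen for illustration; they are not measurements from Radium.

```python
def observed_ratio(llvm_dsp_cpu, slowdown, baseline_cpu):
    """Ratio of total CPU readings (interpreter / LLVM) when the same
    fixed baseline is added to both backends."""
    llvm_total = baseline_cpu + llvm_dsp_cpu
    interp_total = baseline_cpu + llvm_dsp_cpu * slowdown
    return interp_total / llvm_total

# Hypothetical values: the interpreter's DSP code is 45x slower in both cases,
# and the host adds 1% CPU per instrument regardless of backend.
light = observed_ratio(llvm_dsp_cpu=0.05, slowdown=45, baseline_cpu=1.0)
heavy = observed_ratio(llvm_dsp_cpu=2.0, slowdown=45, baseline_cpu=1.0)

print(f"light instrument: {light:.1f}x, heavy instrument: {heavy:.1f}x")
```

Under these assumptions the light instrument appears only about 3x slower while the heavy one appears about 30x slower, even though the underlying DSP slowdown is identical.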

sletz commented 10 months ago

On a 2.2 GHz Intel Core i7:

    faustbench-llvm-interp radium.dsp
    Libfaust version : 2.70.3 (LLVM 16.0.2)
    DSP inputs = 0 outputs = 2
    Duration 0.566683
    Duration 26.618201
    Result LLVM : 14.1065 Interpreter : 0.311295 ratio : 45.3155

On M1:

    Libfaust version : 2.70.3 (LLVM 18.0.0git)
    DSP inputs = 0 outputs = 2
    Duration 0.257206
    Duration 12.439307
    Result LLVM : 31.5661 Interpreter : 0.655039 ratio : 48.1896
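
As a quick sanity check, the reported ratios can be recomputed from the numbers above. This Python sketch assumes that the two `Result` values are throughput figures where higher is better, and that the first `Duration` belongs to LLVM and the second to the interpreter. That reading is inferred from the output, not from the tool's documentation.

```python
# Figures copied from the benchmark output above.
runs = {
    "Intel Core i7": {"llvm": 14.1065, "interp": 0.311295,
                      "dur_llvm": 0.566683, "dur_interp": 26.618201},
    "M1": {"llvm": 31.5661, "interp": 0.655039,
           "dur_llvm": 0.257206, "dur_interp": 12.439307},
}

ratios = {}
for machine, r in runs.items():
    result_ratio = r["llvm"] / r["interp"]            # LLVM result / interpreter result
    duration_ratio = r["dur_interp"] / r["dur_llvm"]  # interpreter time / LLVM time
    ratios[machine] = (result_ratio, duration_ratio)
    print(f"{machine}: result ratio {result_ratio:.2f}, duration ratio {duration_ratio:.2f}")
```

Both ways of computing the ratio land in the 45x to 48x range on both machines.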

kmatheussen commented 10 months ago

Maybe overall cache usage can explain the bigger difference I saw in Radium. Anyway, I guess this kind of performance difference is common. I had just gotten the wrong impression of the interpreter's efficiency because I hadn't thought about the baseline overhead when measuring CPU usage for instruments in Radium.

sletz commented 10 months ago

A possible complementary explanation is that LLVM auto-vectorisation (that is, SIMD code generation) seems to have improved in recent versions, which raises the LLVM/interpreter ratio.