grame-cncm / faust

Functional programming language for signal processing and sound synthesis
http://faust.grame.fr
Other
2.58k stars 322 forks source link

supercollider UGen based on recent faust master uses 3-times more CPU than version from 2011 #32

Closed LFSaw closed 7 years ago

LFSaw commented 7 years ago

( still on the JPverb issue... (see #31 ) )

As stated by several people, the current .dsp>>.cpp compilation of JPVerb for supercollider results in an increase of factor 3 for cpu usage. Being formerly in the range of 10%, this means a rather substantial 30% now... We do not have any idea on how to fix this, maybe someone here has an idea?

The files to compare are here: https://github.com/supercollider/sc3-plugins/pull/143/files

Fixes I made in JPVerbRaw.dsp and GreyholeRaw.dsp are:

compiled with current faust master a302793 instead of the version from 2011. Unfortunately, I did not know about submodules back then (and still don't), so all I know is that the faust includes in https://github.com/supercollider/sc3-plugins/tree/master/source/DEINDUGens/include/faust were the ones originally used back then. Maybe we can find out which faust commit it was based on?


related sc3-plugins thread: https://github.com/supercollider/sc3-plugins/pull/143

LFSaw commented 7 years ago

I am quite sure 9ca14ea was already in it...

sletz commented 7 years ago

The generated cpp code is quite long and complex... not too surprised of the CPU use of the resulting binary then. It is a bit difficult for me to understand where to get the correct versions of original .dsp and .cpp, as well as current .dsp and .cpp. Can you send me privately all those files? Thanks.

LFSaw commented 7 years ago

cpu use is fine (and understandable) for the 10%, what we are wondering about is, why the increase to 30% happens with a recent build. the files are here:

https://gist.github.com/LFSaw/17a6125b113cd420eaed104a10653b20

(you can download/clone that gist...)

LFSaw commented 7 years ago

oh, and thanks for looking into this!

sletz commented 7 years ago

Can you possibly add a version of the new dsp code with the jprev.h based model?

Thanks.

LFSaw commented 7 years ago

done (for JPVerb)

sletz commented 7 years ago

The size of the generated cpp is significantly different between older and newest version of the compiler for those DSP examples, which can explain the CPU difference. We are investigating the reason for that... stay tuned !

sletz commented 7 years ago

https://github.com/grame-cncm/faust/commit/23413d6ce18a8b408ed512270b4a179252e87833 shoud solve the issue. Can you test and report?

Note that the new version using waveform is a bit slower than "jprev.h" based one, but in a more limited way. We'll have to work on that at a later time.

LFSaw commented 7 years ago

tested and can confirm to being back to 10% CPU. Thanks for this, once again :) I also reverted the inline definition to use jprev.h again (see https://github.com/supercollider/sc3-plugins/pull/143/commits/e61d83fd632eba011443e08bd38d5a64d9a72f01).

I would hereby consider this issue closed.

LFSaw commented 7 years ago

it seems the issue got resolved by using CMAKE_BUILD_TYPE=Release (in SuperCollider flags)

sletz commented 7 years ago

Checking again the generated c++ code , I see that it has some differences with the previous one, due to differences in fdelay4 implementation in the new Faust libraries. This may explain some CPU difference, but not this high. Well'll do some more comparaison/measurement between old and new version of fdelay4.

rmichon commented 7 years ago

Mmmmh, in theory, the implementation of fdelay4 hasn't changed (I just did a quick check to be sure)... The only significant difference comes from the implementation of the sine wave oscillator which is now based on a resonant filter instead of a wave table, but the difference of CPU cost between these 2 methods should be really small.

On Thu, May 11, 2017 at 8:42 PM, Stéphane Letz notifications@github.com wrote:

Checking again the generated c++ code , I see that it has some differences with the previous one, due to differences in fdelay4 implementation in the new Faust libraries. This may explain of of CPU difference, but not this high. Well'll do some more comparaison/measurement between old and new version of *fdelay4**.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/grame-cncm/faust/issues/32#issuecomment-300881023, or mute the thread https://github.com/notifications/unsubscribe-auth/ADDaEGbXyZos_1Fv60w-xtt8Wvnd_vv9ks5r41argaJpZM4NU7v6 .

--

Romain Michon (+1)(650)646-8917http://ccrma.stanford.edu/~rmichon

sletz commented 7 years ago

Well the fdelay4 version we speak about was the one used with faust 0.9.62. It changed later on, before the "Faust libraries big rework" of summer 2016 ((-;

rmichon commented 7 years ago

Yes, but the source code of fdelay4 hasn't changed. In both cases it is based on fdelayltv(4) which is also the same:

The "new" implementation is here: https://github.com/grame-cncm/faust/blob/master-dev/libraries/delays.lib

And the old one (just do a search fdelay4): https://github.com/grame-cncm/faust/blob/master-dev/libraries/old/filter.lib

On Fri, May 12, 2017 at 6:03 PM, Stéphane Letz notifications@github.com wrote:

Well the fdelay4 version we speak about was the one used with faust 0.9.62. It changed later on, before the "Faust libraries big rework" of summer 2016 ((-;

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/grame-cncm/faust/issues/32#issuecomment-301117462, or mute the thread https://github.com/notifications/unsubscribe-auth/ADDaEMPJ5_ZrNfrDhwqRJ8Gv5GFwvk92ks5r5ILMgaJpZM4NU7v6 .

--

Romain Michon (+1)(650)646-8917http://ccrma.stanford.edu/~rmichon

sletz commented 7 years ago

The fdelay4 in Faust 0.9.65 was the following : // fourth-order (quartic) case, delay in [1.5,2.5] // delay d should be at least 1.5 fdelay4(n,d,x) = delay(n,id,x) fdm1fdm2fdm3fdm4/24

rmichon commented 7 years ago

Aaaaah interesting... So I think Julius might have changed the implementation of fdelay4 before we made the new library system which is why the "new" version of fdelay4 is the same both in the "old deprecated" libs and the new ones. I cc'ed Julius in case he's not looking at this thread.

On Sat, May 13, 2017 at 9:35 AM, Stéphane Letz notifications@github.com wrote:

The fdelay4 in Faust 0.9.65 was the following : // fourth-order (quartic) case, delay in [1.5,2.5] // delay d should be at least 1.5 fdelay4(n,d,x) = delay(n,id,x) fdm1fdm2fdm3 fdm4/24 + delay(n,id+1,x) (0-fdfdm2fdm3fdm4)/6

  • delay(n,id+2,x) fdfdm1fdm3 fdm4/4 + delay(n,id+3,x) (0-fdfdm1fdm2fdm4)/6
  • delay(n,id+4,x) fdfdm1fdm2fdm3/24 with { //v1: o = 1; o = 1.49999; dmo = d - o; // assumed nonnegative id = int(dmo); fd = o + frac(dmo); fdm1 = fd-1; fdm2 = fd-2; fdm3 = fd-3; fdm4 = fd-4; };

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/grame-cncm/faust/issues/32#issuecomment-301231706, or mute the thread https://github.com/notifications/unsubscribe-auth/ADDaEBD51Tnc-PQswbrDN4jh1lXY5Ni7ks5r5V0rgaJpZM4NU7v6 .

--

Romain Michon (+1)(650)646-8917http://ccrma.stanford.edu/~rmichon

sletz commented 7 years ago

Yes exactly ((-; If Julius could explain the reason this would be perfect !

josmithiii commented 7 years ago

On Sat, May 13, 2017 at 12:42 AM, Stéphane Letz notifications@github.com wrote:

Yes exactly ((-; If Julius could explain the reason this would be perfect !

Hey what's up?

As I recall, I changed the default Lagrange fdelays at some point to work properly when the delay is rapidly changing, which is more expensive (fdelayltv). If the delay is sitting still or changing slowly, you can get by with the less expensive fdelaylti.

So, "ltv" means "linear time-varying" while "lti" means "linear time-invariant".

I think the default should be "ltv" so that things always work. Of course, if the compiler is willing to figure out how fast a parameter can change, then the best version could be chosen based on what can actually happen (sounds hard).

For this thread, I'd try myfdelay4 = fdelaylti(4) and see if it solves the problem without getting into trouble.

--

Julius O. Smith III jos@ccrma.stanford.edu Professor of Music and, by courtesy, Electrical Engineering CCRMA, Stanford University http://ccrma.stanford.edu/~jos/

sletz commented 7 years ago

I did not remembered the details, thanks a lot Julius for explaining again!

"if the compiler is willing to figure out how fast a parameter can change", I don't think the complier can really do that...

To summarize this thread: it started when JPverb developers discovered a huge CPU increase. It appears to be due to a change we did in the Faust compiler itself... So we reverted back. After that, we saw this little change in the generated c++ code, that was caused by the library fdelay4 change. Now as you said myfdelay4 = fdelaylti(4) should be possibly tried.

rmichon commented 7 years ago

I think the default should be "ltv" so that things always work.

Yep, agreed... :)