mixxxdj / mixxx

Mixxx is Free DJ software that gives you everything you need to perform live mixes.
http://mixxx.org

link against SSE enabled rubberband / soundtouch #7771

Open mixxxbot opened 2 years ago

mixxxbot commented 2 years ago

Reported by: daschuer Date: 2014-12-28T17:07:07Z Status: Confirmed Importance: Wishlist Launchpad Issue: lp1406117


It turns out that Ubuntu's rubberband and soundtouch are compiled for i386 without taking advantage of the fast SSE registers. We should consider providing an SSE-enabled version of these libraries, since they are major CPU eaters in the audio callback.

soundtouch uses just -O3 in Ubuntu Trusty.
rubberband uses just -O2 in Ubuntu Trusty.

Adding -msse -mfpmath=sse will do the job.
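
To confirm that these flags actually reach a given build, one option is to look at the compiler's predefined macros. The sketch below assumes GCC or Clang, which define `__SSE__` when SSE instructions are enabled and `__SSE_MATH__` when scalar floating-point math is routed through SSE registers. The function names are hypothetical and only serve the illustration. Note that this reports on the translation unit it is compiled in, not on the prebuilt Ubuntu binaries.

```c
#include <stdio.h>

/* Returns 1 if the compiler may emit SSE instructions (e.g. -msse). */
int sse_instructions_enabled(void) {
#if defined(__SSE__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if scalar float math uses SSE registers (e.g. -mfpmath=sse).
 * Assumption: __SSE_MATH__ is a GCC/Clang macro; other compilers may
 * not define it even when they do use SSE. */
int sse_math_enabled(void) {
#if defined(__SSE_MATH__)
    return 1;
#else
    return 0;
#endif
}

/* Human-readable summary of the floating-point backend for this build. */
const char *fp_backend_name(void) {
    if (sse_math_enabled()) {
        return "SSE registers";
    }
    if (sse_instructions_enabled()) {
        return "x87 FPU (SSE available, but not used for scalar math)";
    }
    return "x87 FPU or non-x86 target";
}

/* Prints the summary; call this from main() in a test program. */
void report_fp_backend(void) {
    printf("floating-point backend: %s\n", fp_backend_name());
}
```

Compiling a small program that calls `report_fp_backend()` once with the distribution's flags and once with `-msse -mfpmath=sse` added should show whether the flags make a difference on a 32-bit target.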

mixxxbot commented 2 years ago

Commented by: ywwg Date: 2014-12-28T17:27:29Z


Note this comment block in rubberband:

"/* evaluation of 4 sines at onces, using only SSE1+MMX intrinsics so it runs also on old athlons XPs and the pentium III of your grand mother.

   The code is the exact rewriting of the cephes sinf function.
   Precision is excellent as long as x < 8192 (I did not bother to
   take into account the special handling they have for greater values
   -- it does not return garbage for arguments over 8192, though, but
   the extra precision is missing).

Note that it is such that sinf((float)M_PI) = 8.74e-8, which is the surprising but correct result.

Performance is also surprisingly good, 1.33 times faster than the macos vsinf SSE2 function, and 1.5 times faster than the __vrs4_sinf of amd's ACML (which is only available in 64 bits). Not too bad for an SSE1 function (with no special tuning) ! However the latter libraries probably have a much better handling of NaN, Inf, denormalized and other special arguments..

On my core 1 duo, the execution of this function takes approximately 95 cycles.

From what I have observed on the experiments with Intel AMath lib, switching to an SSE2 version would improve the perf by only 10%.

Since it is based on SSE intrinsics, it has to be compiled at -O2 to deliver full speed. */ "
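
The 8.74e-8 figure in the quoted comment can be reproduced by hand. Rounding pi to single precision gives a value slightly larger than pi, and for a small offset e, sin(pi + e) is approximately -e, so a correct sinf at that argument should have a magnitude equal to the rounding error. The sketch below only computes that rounding error. It does not call rubberband's SSE routine, and the function name is hypothetical.

```c
/* Returns how far single-precision pi lies above double-precision pi.
 * Since sin(pi + e) is approximately -e for small e, this is also the
 * expected magnitude of sinf((float)M_PI). */
double float_pi_rounding_error(void) {
    const double pi = 3.14159265358979323846;
    const float pi_f = (float)pi;   /* nearest float to pi */
    return (double)pi_f - pi;       /* about 8.74e-8 */
}
```

A result near zero but not exactly zero is therefore what an accurate single-precision sine should return for this input.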

mixxxbot commented 2 years ago

Commented by: daschuer Date: 2014-12-28T17:41:10Z


To me it looks like Ubuntu's rubberband build does not have SSE1+MMX enabled, so it falls back to the slow i386 (x87) floating-point unit. Any idea how we can verify this?

mixxxbot commented 2 years ago

Commented by: rryan Date: 2015-02-26T02:05:00Z


This isn't a problem for Windows or Mac.

We don't control our Debian package so we can't actually fix this in Debian. Our Debian maintainer will strip out any bundling we attempt to do with RubberBand or SoundTouch.

Our PPA is the only place we can do something for this on Linux. If somebody wants to give me a shell script that produces a source package for librubberband or libsoundtouch then I can publish this to our PPA.

mixxxbot commented 2 years ago

Commented by: daschuer Date: 2015-02-26T07:19:05Z


Who is our Debian maintainer? We could discuss a solution with him, something like providing an alternative package version. I think all Rubberband / SoundTouch dependants would benefit from it.