Add Denormal prevention in engine code

mixxxbot commented 1 year ago

Reported by: daschuer Date: 2014-12-19T22:52:12Z Status: Fix Released Importance: Medium Launchpad Issue: lp1404401 Attachments: ftz_stats, ftz.patch

It looks like we need to add it, because processing denormals may cost 100 times more CPU.

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-19T22:53:13Z

Links from IRC: http://musicdsp.org/files/denormal.pdf http://ldesoras.free.fr/doc/articles/denormal-en.pdf

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-19T23:14:37Z

Possible solution: https://www.qt.gitorious.org/qt/qtwebengine-chromium/source/b68d7ce5e7e3466409c942c514064d45b31ee666:chromium/third_party/WebKit/Source/platform/audio/DenormalDisabler.h

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-20T00:29:13Z

I removed our denormal code earlier this year because it caused horrible audible spikes in the EQ filters as the wave approached zero, due to waveform discontinuity. Partially this was because our denormal code kicked in way outside the range where this CPU penalty actually applies. When RJ and I looked closer, we found that we already use a gcc flag to disable ultra-small values anyway, so the compiler is already handling denormals for us.

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-20T00:32:34Z

If you do want to pursue this change, I would require a battery of tests that show that none of our filters are thrown off by the change in the sound wave. But I would urge you to confirm that it's actually a problem by demonstrating the CPU impact first before writing a bunch of new code.

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-20T08:37:29Z

Can you recall which gcc Flag it is? I cannot find it.

http://frozenfractal.com/blog/2010/3/11/optimization-story/:

" Luckily, there is an instruction to change the CPU’s behaviour: instead of storing denormalized values, these can simply be flushed to 0. Unfortunately, there is no standard library function for this. On Visual C++, we can do this:

_controlfp(_MCW_DN, _DN_FLUSH);

On gcc, we need some inline assembly. This was my first x86 assembly ever:

int mxcsr;
__asm__("stmxcsr %0" : "=m"(mxcsr) : :);
mxcsr |= (1 << 15); // set bit 15: flush-to-zero mode
__asm__("ldmxcsr %0" : : "m"(mxcsr) :);
"

mixxxbot commented 1 year ago

Commented by: rryan Date: 2014-12-20T14:09:42Z

Hm, I had forgotten about that, Owen.

The flag is -ffast-math -- according to the GCC docs here it enables flush-to-zero on some platforms though it isn't specific about which ones: https://gcc.gnu.org/wiki/FloatingPointMath
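Since the effect of -ffast-math depends on the platform, the flush-to-zero bit can also be set explicitly. The following is a minimal sketch, not Mixxx code: it uses the SSE intrinsics from `<xmmintrin.h>` instead of inline assembly, and the names `enableDenormalFlushing`, `halveSmallestNormal`, and `HAVE_SSE_CSR` are made up for illustration. It assumes an x86 build with SSE math; on other targets the helper simply reports that nothing was changed.

```cpp
#include <limits>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE_CSR 1  // hypothetical macro: the SSE control register is available
#else
#define HAVE_SSE_CSR 0
#endif

// Hypothetical helper. Sets flush-to-zero (MXCSR bit 15) for the calling
// thread and, if requested, denormals-are-zero (MXCSR bit 6).
// Returns false when this build has no SSE control register to set.
bool enableDenormalFlushing(bool alsoSetDaz) {
#if HAVE_SSE_CSR
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    if (alsoSetDaz) {
        // Caution: only safe on CPUs that implement DAZ.
        _mm_setcsr(_mm_getcsr() | 0x0040u);
    }
    return true;
#else
    (void)alsoSetDaz;
    return false;
#endif
}

// Halves the smallest normal float at run time. The volatile qualifiers keep
// the compiler from folding the multiplication into a constant.
float halveSmallestNormal() {
    volatile float smallest = std::numeric_limits<float>::min();
    volatile float half = 0.5f;
    volatile float result = smallest * half;
    return result;
}
```

MXCSR is per-thread state, so a call like this would have to run in the thread that does the audio processing. With flush-to-zero active, `halveSmallestNormal()` is expected to return zero instead of a subnormal value.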

mixxxbot commented 1 year ago

Commented by: rryan Date: 2014-12-20T14:49:24Z Attachments: ftz_stats

On my MBP (x86_64) adding FTZ in an experiment didn't have much effect.

1) Turn off waveforms
2) Load song, wait for analysis to complete
3) Adjust EQs to non-neutral
4) Play song
5) Wait
6) Record 40 seconds of base
7) Record 40 seconds of experiment

mixxxbot commented 1 year ago

Commented by: rryan Date: 2014-12-20T14:49:44Z Attachments: ftz.patch

Patch for the above experiment.

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-20T16:06:58Z

Here's the other documentation I was using -- ffast-math + sse flags: http://carlh.net/plugins/denormals.php

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-22T21:38:17Z

The filter code does suffer from denormals. I have checked this by adding:

    if (!std::isnormal(buf[3])) {
        qDebug() << "denormal";
    }
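One caveat with this check: `std::isnormal` also returns false for zero, infinity, and NaN, so it reports a buffer that has simply gone silent as well. A stricter variant, sketched here with the hypothetical helper names `isSubnormal` and `countSubnormals`, uses `std::fpclassify` to match subnormal values only.

```cpp
#include <cmath>
#include <cstddef>

// True only for nonzero values below the smallest normal magnitude.
bool isSubnormal(float value) {
    return std::fpclassify(value) == FP_SUBNORMAL;
}

// Counts the subnormal samples in a buffer (hypothetical helper).
std::size_t countSubnormals(const float* buffer, std::size_t size) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (isSubnormal(buffer[i])) {
            ++count;
        }
    }
    return count;
}
```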

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-22T21:48:23Z

@RJ: Did you use a sse3 build? It might be possible that your code does nothing on default builds see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21408

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-22T21:50:53Z

The last time someone tried to "fix" denormals, it caused major audio artifacts. If we're going to try to do this again, we're going to need:

proof that denormals are causing detectable CPU hits
proof that the fix does not cause audio artifacts.

This work should wait for post-release.

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-22T22:43:07Z

@Owen: What did you do last time to flush denormals?

I have just read that denormals are flushed by default on Mac Os audio callback. So it can't be that bad.
@RJ, can you verify that?

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-22T22:58:41Z

The previous fix involved checking whether the value was within abs(.000001), or some similarly insufficiently small number, and just setting the value to 0 if so. It was a really bad fix that was not tested well.
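To put that threshold in perspective: a single-precision float only becomes subnormal below roughly 1.18e-38, while 1e-6 corresponds to about -120 dBFS. The sketch below is a hypothetical reconstruction of such a gate, not the removed Mixxx code (which is not shown in this thread), and both function names are made up.

```cpp
#include <cmath>
#include <limits>

// Hypothetical reconstruction of a threshold gate: anything smaller in
// magnitude than `threshold` is replaced by zero.
float hardGate(float sample, float threshold) {
    return (std::fabs(sample) < threshold) ? 0.0f : sample;
}

// Decimal orders of magnitude between a threshold and the smallest
// normal float, i.e. how far above the denormal range the gate cuts.
double ordersAboveSmallestNormal(float threshold) {
    return std::log10(static_cast<double>(threshold) /
                      static_cast<double>(std::numeric_limits<float>::min()));
}
```

A gate at 1e-6 zeroes perfectly normal samples such as 5e-7, which creates the kind of discontinuity described above. A gate at `std::numeric_limits<float>::min()` would leave them untouched.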

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-22T23:05:00Z

It looks like we can rely on this: http://carlh.net/plugins/denormals.php

@RJ, what does it mean for the default Mixxx optimization flags?

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-22T23:27:37Z

My results:

Thread model: posix
gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
model name : Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz
model name : Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz
model name : Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz
model name : Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz
subnormal 12.0642 times slower.
-ffast-math enabled
subnormal 1.14059 times slower.
SSE enabled FTZ=0 DAZ=0
subnormal 12.0405 times slower.
SSE enabled FTZ=0 DAZ=1
subnormal 0.999144 times slower.
SSE enabled FTZ=1 DAZ=0
subnormal 12.0839 times slower.
SSE enabled FTZ=0 DAZ=0 -ffast-math enabled
subnormal 1.00153 times slower.
SSE enabled FTZ=0 DAZ=1 -ffast-math enabled
subnormal 1.01593 times slower.
SSE enabled FTZ=1 DAZ=0 -ffast-math enabled
subnormal 1.0017 times slower.
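The program that produced these numbers is not included in the thread. As a rough idea of how a "times slower" figure can be measured, here is a minimal sketch under my own assumptions: it times the same multiplication loop once with a normal operand and once with a subnormal one. The function names and the loop body are hypothetical, and the result will vary with compiler flags, CPU, and the FTZ/DAZ state of the calling thread.

```cpp
#include <chrono>
#include <limits>

// Times `iterations` float multiplications whose operand and result stay
// at the magnitude of `seed`. Returns the elapsed time in seconds.
double timeMultiplies(float seed, int iterations) {
    volatile float operand = seed;  // volatile forces a real load each pass
    volatile float sink = 0.0f;     // volatile forces a real store each pass
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = operand * 0.5f;
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Ratio of subnormal to normal run time for the same loop.
double subnormalSlowdown(int iterations) {
    const float normalSeed = 1.0f;
    // A quarter of the smallest normal float is a subnormal value.
    const float subnormalSeed = std::numeric_limits<float>::min() * 0.25f;
    timeMultiplies(normalSeed, iterations);  // warm-up run, result discarded
    const double normalTime = timeMultiplies(normalSeed, iterations);
    const double subnormalTime = timeMultiplies(subnormalSeed, iterations);
    return subnormalTime / normalTime;
}
```

A ratio near 1 suggests that subnormals carry no penalty in the current configuration, while a ratio well above 1 suggests they are handled on a slow path.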

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-23T06:59:07Z

Thread model: posix
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
model name : Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz
model name : Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz
model name : Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz
model name : Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz
subnormal 12.0694 times slower.
-ffast-math enabled
subnormal 1.01273 times slower.
SSE enabled FTZ=0 DAZ=0
subnormal 11.8485 times slower.
SSE enabled FTZ=0 DAZ=1
subnormal 0.991461 times slower.
SSE enabled FTZ=1 DAZ=0
subnormal 13.0207 times slower.
SSE enabled FTZ=0 DAZ=0 -ffast-math enabled
subnormal 0.96152 times slower.
SSE enabled FTZ=0 DAZ=1 -ffast-math enabled
subnormal 1.00095 times slower.
SSE enabled FTZ=1 DAZ=0 -ffast-math enabled
subnormal 1.04673 times slower.

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-23T13:45:18Z

On the same device, but a Virtual 32 bit OS:

Thread model: posix
gcc version 4.3.1 20080507 (prerelease) [gcc-4_3-branch revision 135036] (SUSE Linux)
model name : Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz
subnormal 23.8905 times slower.
-ffast-math enabled
subnormal 24.4369 times slower.
SSE enabled FTZ=0 DAZ=0
subnormal 10.5012 times slower.
SSE enabled FTZ=0 DAZ=1
subnormal 0.958384 times slower.
SSE enabled FTZ=1 DAZ=0
subnormal 11.4803 times slower.
SSE enabled FTZ=0 DAZ=0 -ffast-math enabled
subnormal 0.998909 times slower.
SSE enabled FTZ=0 DAZ=1 -ffast-math enabled
subnormal 0.990018 times slower.
SSE enabled FTZ=1 DAZ=0 -ffast-math enabled
subnormal 1.00707 times slower.

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-23T13:54:04Z

I'm not quite sure what all the acronyms are, would you suggest a change in our build flags?

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-23T14:20:52Z

Probably yes. I am still missing a test on my 32 bit Atom Netbook to prove it. But I think we already have enough data for a conclusion.

1.) Since our filters have an infinite impulse response, they will produce denormals. I have proved this with a test.
2.) Because of the -ffast-math flag, 64 bit Mixxx builds have no penalty from denormals. I have proved this on my devices, and the column at http://carlh.net/plugins/denormals.php is green for 64 bit CPUs with -ffast-math only.
3.) There is a performance penalty on 32 bit Mixxx builds even though the -ffast-math flag is set. We need to enable SSE and set the DAZ flag to get the same benefit as the 64 bit build.

I do not know how big the share of the entire CPU time in the audio callback is (I will test it later), but since we have a solution for 64 bit, we should solve the issue for 32 bit as well.
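The first point can be illustrated with a toy recursive filter. This is a minimal sketch, not one of the Mixxx filters: after the input goes silent, the only thing left in a one-pole feedback loop is the state multiplied by the feedback coefficient again and again, so it decays through the subnormal range. The function name and the coefficient are chosen for illustration, and the count assumes default IEEE behaviour (no FTZ/DAZ, no -ffast-math).

```cpp
#include <cmath>

// Feeds silence into a one-pole feedback loop that starts with a state of
// 1.0 and counts how many of the following states are subnormal.
int countSubnormalStates(float feedback, int steps) {
    float state = 1.0f;  // filter memory left over from earlier audio
    int subnormalCount = 0;
    for (int i = 0; i < steps; ++i) {
        state = feedback * state;  // input is zero, only feedback remains
        if (std::fpclassify(state) == FP_SUBNORMAL) {
            ++subnormalCount;
        }
    }
    return subnormalCount;
}
```

With a feedback of 0.99 the state is still normal after 100 steps, but it has entered the subnormal range well before 20000 steps. That matches the situation where a deck keeps running with the signal faded to silence.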

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-23T19:09:08Z

Thread model: posix
gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
model name : Intel(R) Atom(TM) CPU N270 @ 1.60GHz
model name : Intel(R) Atom(TM) CPU N270 @ 1.60GHz
subnormal 45.0023 times slower.
-ffast-math enabled
subnormal 37.7372 times slower.
SSE enabled FTZ=0 DAZ=0
subnormal 19.1181 times slower.
SSE enabled FTZ=0 DAZ=1
subnormal 0.998239 times slower.
SSE enabled FTZ=1 DAZ=0
subnormal 18.8671 times slower.
SSE enabled FTZ=0 DAZ=0 -ffast-math enabled
subnormal 0.999303 times slower.
SSE enabled FTZ=0 DAZ=1 -ffast-math enabled
subnormal 0.999885 times slower.
SSE enabled FTZ=1 DAZ=0 -ffast-math enabled
subnormal 0.998566 times slower.

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-24T10:05:31Z

Test results from my Atom Notebook

Build with default settings

Debug [Main]: Stat("LinkwitzRiley8EQEffect","count=2393,sum=7.27838e+09ns,average=3.04153e+06ns,min=910102ns,max=5.17381e+07ns,variance=5.35149e+13ns^2,stddev=7.31539e+06ns")
Debug [Main]: Stat("EngineMaster::process_duration","count=2678,average=3.44708e+06ns,min=207010ns,max=6.8616e+07ns,variance=5.4347e+13ns^2,stddev=7.37204e+06ns")

Build with scons -j2 optimize=2

Debug [Main]: Stat("LinkwitzRiley8EQEffect","count=2200,sum=3.39283e+09ns,average=1.5422e+06ns,min=699111ns,max=2.511e+07ns,variance=7.1921e+12ns^2,stddev=2.68181e+06ns")
Debug [Main]: Stat("EngineMaster::process_duration","count=2448,average=2.35673e+06ns,min=188362ns,max=2.64581e+07ns,variance=8.66118e+12ns^2,stddev=2.94299e+06ns")

The average time for the EQ is nearly half that of the non-SSE build. Interestingly, the max value also differs only by a factor of about 2, not by the x20 we might expect from the denormal calculations.

Conclusion: There is a BIG benefit of SSE 32 bit builds. This should be the default for source builds.

For binary distributions, we should strongly consider dropping Pentium 3 support, or offering both SSE and non-SSE builds.

It might be a problem for the Linux distros to drop Pentium 3 :-/

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-24T15:20:02Z

I would have no problem with dropping pentium 3. Even "old" netbooks are still going to have an Atom or Celeron or more modern CPU than a pentium 3.

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-24T15:20:12Z

Thanks for doing this research, Daniel!

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-25T18:50:41Z

It looks like DAZ is standard on armhf builds ...

https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html

" If the selected floating-point hardware includes the NEON extension (e.g. -mfpu=‘neon’), note that floating-point operations are not generated by GCC's auto-vectorization pass unless -funsafe-math-optimizations is also specified. This is because NEON hardware does not fully implement the IEEE 754 standard for floating-point arithmetic (in particular denormal values are treated as zero), so the use of NEON instructions may lead to a loss of precision. "

http://stackoverflow.com/questions/7346521/subnormal-ieee-754-floating-point-numbers-support-on-ios-arm-devices-iphone-4

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-26T00:16:50Z

This is the result on the same hardware as above using "optimize=2" and just -O2

Debug [Main]: Stat("LinkwitzRiley8EQEffect","count=4418,sum=2.54514e+10ns,average=5.76084e+06ns,min=743041ns,max=3.95725e+07ns,variance=8.41122e+13ns^2,stddev=9.17127e+06ns")
Debug [Main]: Stat("EngineMaster::process_duration","count=4727,average=6.27221e+06ns,min=201701ns,max=4.46743e+07ns,variance=8.15258e+13ns^2,stddev=9.02916e+06ns")

The filter and engine code takes ~3 times longer. Conclusion: it is a good idea to use -O3 + -funroll-loops.

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-26T01:19:33Z

And yes, we need RJ's patch.

I get heavy load if I play a track in one deck using the Linkwitz-Riley EQ and turn the gain to zero. With the patch, there is no load change when turning it to zero. I can see similar results on an i5 notebook (x64) with small audio buffers.

Enabling DAZ helps. I do not have a clue why this happens on an SSE2 build. According to the test above this should not happen ... So there seems to be another issue.

Debug [Main]: ===================================== 
Debug [Main]: BASE STATS 
Debug [Main]: ===================================== 
Debug [Main]: Stat("AnalyserQueue process","count=1")
Debug [Main]: Stat("CachingReaderWorker [Channel1]","count=574")
Debug [Main]: Stat("CachingReaderWorker [Channel2]","count=1")
Debug [Main]: Stat("CachingReaderWorker [PreviewDeck1]","count=1")
Debug [Main]: Stat("CachingReaderWorker [Sampler1]","count=1")
Debug [Main]: Stat("CachingReaderWorker [Sampler2]","count=1")
Debug [Main]: Stat("CachingReaderWorker [Sampler3]","count=1")
Debug [Main]: Stat("CachingReaderWorker [Sampler4]","count=1")
Debug [Main]: Stat("EngineBuffer::process_pauselock","count=2481,sum=1.97225e+08ns,average=79494.2ns,min=30311ns,max=1.09239e+06ns,variance=3.04507e+09ns^2,stddev=55182.2ns")
Debug [Main]: Stat("EngineMaster::mixChannels_0active","count=7443,sum=4.85571e+07ns,average=6523.86ns,min=3282ns,max=730540ns,variance=1.4778e+08ns^2,stddev=12156.5ns")
Debug [Main]: Stat("EngineMaster::mixChannels_1active","count=2481,sum=3.467e+07ns,average=13974.2ns,min=7333ns,max=833346ns,variance=6.11274e+08ns^2,stddev=24724ns")
Debug [Main]: Stat("EngineMaster::process","count=4962")
Debug [Main]: Stat("EngineMaster::processChannels","count=2480,sum=2.2165e+10ns,average=8.9375e+06ns,min=960248ns,max=5.28434e+07ns,variance=1.24014e+14ns^2,stddev=1.11361e+07ns")
Debug [Main]: Stat("EngineSideChain","count=191")
Debug [Main]: Stat("EngineSideChain::process","count=192")
Debug [Main]: Stat("EngineSideChain::writeSamples","count=4962")
Debug [Main]: Stat("EngineSideChain::writeSamples wake up","count=190")
Debug [Main]: Stat("EngineWorkerScheduler","count=572")
Debug [Main]: Stat("LinkwitzRiley8EQEffect","count=2480,sum=2.11762e+10ns,average=8.5388e+06ns,min=743949ns,max=5.25722e+07ns,variance=1.24491e+14ns^2,stddev=1.11575e+07ns")
Debug [Main]: Stat("MixxxMainWindow::~MixxxMainWindow","count=1,sum=1.13623e+09ns,average=1.13623e+09ns,min=1.13623e+09ns,max=1.13623e+09ns,variance=0ns^2,stddev=0ns")
Debug [Main]: Stat("SoundDevicePortAudio::callbackProcess output 0, HDA Intel: ALC269 Analog (hw:0,0)","count=2481,sum=1.89776e+08ns,average=76491.7ns,min=22628ns,max=1.06907e+07ns,variance=2.21524e+11ns^2,stddev=470664ns")
Debug [Main]: Stat("SoundDevicePortAudio::callbackProcess prepare 0, HDA Intel: ALC269 Analog (hw:0,0)","count=2480,sum=2.34364e+10ns,average=9.45015e+06ns,min=1.14386e+06ns,max=5.31427e+07ns,variance=1.24151e+14ns^2,stddev=1.11423e+07ns")
Debug [Main]: Stat("SoundDevicePortAudio::callbackProcessClkRef 0, HDA Intel: ALC269 Analog (hw:0,0)","count=4962")
Debug [Main]: Stat("VsyncThread real time error","count=17,sum=17,average=1,min=1,max=1,variance=0^2,stddev=0")
Debug [Main]: Stat("VsyncThread usleep for VSync","count=3990")
Debug [Main]: Stat("VsyncThread vsync render","count=4024")
Debug [Main]: Stat("VsyncThread vsync swap","count=4025")
Debug [Main]: Stat("WOverview::paintEvent","count=92,sum=4.11161e+07ns,average=446914ns,min=22768ns,max=885378ns,variance=3.54065e+10ns^2,stddev=188166ns")
Debug [Main]: Stat("WVuMeter::paintEvent","count=3565,sum=2.34213e+08ns,average=65697.9ns,min=37365ns,max=1.74631e+06ns,variance=1.82733e+09ns^2,stddev=42747.3ns")
Debug [Main]: Stat("WaveformWidgetFactory::render() 2waveforms","count=2012,sum=2.01004e+10ns,average=9.99028e+06ns,min=5.18285e+06ns,max=4.00311e+07ns,variance=8.70301e+12ns^2,stddev=2.95009e+06ns")
Debug [Main]: Stat("WaveformWidgetFactory::swap() 2waveforms","count=2012,sum=2.95026e+09ns,average=1.46633e+06ns,min=880349ns,max=1.21566e+07ns,variance=5.5925e+11ns^2,stddev=747830ns")
Debug [Main]: ===================================== 
Debug [Main]: EXPERIMENT STATS 
Debug [Main]: ===================================== 
Debug [Main]: Stat("CachingReaderWorker [Channel1]","count=191")
Debug [Main]: Stat("EngineBuffer::process_pauselock","count=762,sum=6.66809e+07ns,average=87507.8ns,min=50635ns,max=1.92692e+06ns,variance=8.42527e+09ns^2,stddev=91789.2ns")
Debug [Main]: Stat("EngineMaster::mixChannels_0active","count=2289,sum=1.52183e+07ns,average=6648.44ns,min=3841ns,max=331048ns,variance=9.07444e+07ns^2,stddev=9525.98ns")
Debug [Main]: Stat("EngineMaster::mixChannels_1active","count=763,sum=1.0052e+07ns,average=13174.3ns,min=7333ns,max=181657ns,variance=5.79651e+07ns^2,stddev=7613.48ns")
Debug [Main]: Stat("EngineMaster::process","count=1525")
Debug [Main]: Stat("EngineMaster::processChannels","count=762,sum=1.24482e+09ns,average=1.63363e+06ns,min=954591ns,max=5.73739e+06ns,variance=4.66768e+11ns^2,stddev=683204ns")
Debug [Main]: Stat("EngineSideChain","count=58")
Debug [Main]: Stat("EngineSideChain::process","count=58")
Debug [Main]: Stat("EngineSideChain::writeSamples","count=1526")
Debug [Main]: Stat("EngineSideChain::writeSamples wake up","count=58")
Debug [Main]: Stat("EngineWorkerScheduler","count=192")
Debug [Main]: Stat("LinkwitzRiley8EQEffect","count=762,sum=9.44911e+08ns,average=1.24004e+06ns,min=744438ns,max=4.48667e+06ns,variance=3.10929e+11ns^2,stddev=557610ns")
Debug [Main]: Stat("SoundDevicePortAudio::callbackProcess output 0, HDA Intel: ALC269 Analog (hw:0,0)","count=763,sum=9.56296e+07ns,average=125334ns,min=22769ns,max=9.20683e+06ns,variance=4.19177e+11ns^2,stddev=647439ns")
Debug [Main]: Stat("SoundDevicePortAudio::callbackProcess prepare 0, HDA Intel: ALC269 Analog (hw:0,0)","count=762,sum=1.61254e+09ns,average=2.11619e+06ns,min=1.15643e+06ns,max=1.59874e+07ns,variance=1.52761e+12ns^2,stddev=1.23596e+06ns")
Debug [Main]: Stat("SoundDevicePortAudio::callbackProcessClkRef 0, HDA Intel: ALC269 Analog (hw:0,0)","count=1525")
Debug [Main]: Stat("VsyncThread real time error","count=3,sum=3,average=1,min=1,max=1,variance=0^2,stddev=0")
Debug [Main]: Stat("VsyncThread usleep for VSync","count=1056")
Debug [Main]: Stat("VsyncThread vsync render","count=1062")
Debug [Main]: Stat("VsyncThread vsync swap","count=1062")
Debug [Main]: Stat("WOverview::paintEvent","count=28,sum=1.42161e+07ns,average=507719ns,min=21930ns,max=1.01961e+06ns,variance=6.31043e+10ns^2,stddev=251206ns")
Debug [Main]: Stat("WVuMeter::paintEvent","count=698,sum=4.3623e+07ns,average=62497.1ns,min=38273ns,max=285162ns,variance=5.41059e+08ns^2,stddev=23260.7ns")
Debug [Main]: Stat("WaveformWidgetFactory::render() 2waveforms","count=531,sum=5.20469e+09ns,average=9.80168e+06ns,min=5.27043e+06ns,max=2.03392e+07ns,variance=5.95697e+12ns^2,stddev=2.44069e+06ns")
Debug [Main]: Stat("WaveformWidgetFactory::swap() 2waveforms","count=531,sum=7.55607e+08ns,average=1.42299e+06ns,min=884540ns,max=7.48203e+06ns,variance=4.76785e+11ns^2,stddev=690496ns")
Debug [Main]: ===================================== 
Debug [Main]: Mixxx shutdown complete with code 0

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-26T15:02:06Z

There seems to be a mess around SSE2 / SSE3 and Pentium 4.

See:

http://sourceforge.net/p/lmms/mailman/message/32988535/ "

Some (not sure how many) 32-bit CPUs with SSE2 don't have the DAZ flag and will crash the program trying to set it. "

https://software.intel.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz " Initial steppings of Pentium® 4 processors did not support DAZ "

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-26T15:31:59Z

Are there any pentium 4 laptops out there? Realistically, how many people might this affect? I would venture to guess ~zero

mixxxbot commented 1 year ago

Commented by: ywwg Date: 2014-12-26T16:02:30Z

I couldn't find a good hardware survey that showed breakdown by processor model. I also googled around to find out which steppings do or do not support DAZ. The pentium4 was in production from 2000-2008, but I'd guess that it's the really really old models that would have trouble with this mode.

This document does show how to detect if the mode is supported, if it comes to that: http://datasheets.chipdb.org/Intel/x86/CPUID/24161817.pdf

We should have at least one build of mixxx that is super-safe 32bit no special flags, just in case. But I'm still leaning toward the default build using DAZ mode.
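As far as I understand the procedure Intel documents for this, DAZ support is reported in the MXCSR_MASK field that FXSAVE writes at byte offset 28 of its save area: a mask of zero stands for the default 0x0000FFBF (no DAZ), and bit 6 set means DAZ may be enabled. The following is a minimal sketch of that idea under those assumptions. The function and type names are hypothetical, the inline assembly is GCC/Clang syntax only, and a real implementation would first confirm via CPUID that FXSAVE and SSE exist at all.

```cpp
#include <cstdint>
#include <cstring>

constexpr std::uint32_t kDefaultMxcsrMask = 0x0000FFBFu;  // used when the field reads 0
constexpr std::uint32_t kDazBit = 1u << 6;                // MXCSR bit 6: denormals-are-zero

// Pure decision logic: does this MXCSR_MASK value allow setting DAZ?
bool maskAllowsDaz(std::uint32_t mxcsrMask) {
    if (mxcsrMask == 0) {
        mxcsrMask = kDefaultMxcsrMask;
    }
    return (mxcsrMask & kDazBit) != 0;
}

// FXSAVE needs a 512-byte, 16-byte-aligned save area.
struct alignas(16) FxsaveArea {
    unsigned char bytes[512];
};

// Reads MXCSR_MASK via FXSAVE. Returns 0 on builds where this sketch has
// no way to execute the instruction. Assumes the CPU supports FXSAVE.
std::uint32_t readMxcsrMask() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    FxsaveArea area;
    std::memset(&area, 0, sizeof(area));
    __asm__ __volatile__("fxsave %0" : "=m"(area));
    std::uint32_t mask = 0;
    std::memcpy(&mask, area.bytes + 28, sizeof(mask));
    return mask;
#else
    return 0;
#endif
}
```

A caller could then guard the DAZ write with `maskAllowsDaz(readMxcsrMask())` and fall back to FTZ only when it returns false.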

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2014-12-26T22:36:49Z

Cool, this doc verifies the DAZ issue :-/ It could be a lot of fun to port this dazdetect.asm to gcc and MSVC. But I doubt that this is worth the time.

For now we have the "portable" build for SSE2 CPUs that either support the DAZ flag or at least do not crash when it is enabled, and the "legacy" build for all older CPUs.

mixxxbot commented 1 year ago

Commented by: daschuer Date: 2015-02-03T10:07:22Z

https://github.com/mixxxdj/mixxx/pull/438

mixxxbot commented 1 year ago

Issue closed with status Fix Released.

mixxxdj / mixxx

Add Denormal prevention in engine code #7747