mackron / dr_libs

Audio decoding libraries for C/C++, each in a single source file.
Other
1.24k stars 205 forks source link

dr_mp3 should not assume SSE2 support for WIN32 i386 #237

Closed glebm closed 2 years ago

glebm commented 2 years ago

I'm trying to port a game to the original Xbox using the open-source NXDK.

One issue I've encountered is that dr_mp3 simply assumes SSE2 support here:

https://github.com/mackron/dr_libs/blob/15f37e3ab01654c1a3bc98cff2a9ca64e8296fa9/dr_mp3.h#L604

The original Xbox is __i386__ but does not support SSE2.

Compilation fails with:

dr_libs/dr_mp3.h:2370:41: error: always_inline function '_mm_cvtps_epi32' requires target feature 'sse2', but would be inlined into function 'drmp3dec_f32_to_s16' that is compiled without support for 'sse2'
                                        _mm_cvtps_epi32(_mm_max_ps(_mm_min_ps(b, s16max), s16min)));
mackron commented 2 years ago

The defined(__SSE2__) part of that line is supposed to guard this. It sounds to me that the __SSE2__ option is incorrectly being defined at compile time. What's the compiler? On the surface this feels like a compiler-specific issue. I just don't see why the compiler will be throwing that error, but still defining __SSE2__.

glebm commented 2 years ago

The defined(SSE2) part of that line is supposed to guard this.

It doesn't, here is the formatted conditional:

(
   (defined(_MSC_VER) && _MSC_VER >= 1400) && 
   (defined(_M_IX86) || defined(_M_X64))
) || (
   (defined(__i386__) || defined(__x86_64__)) && defined(__SSE2__)
)

So if _M_IX86 is defined, it does not check __SSE2__

glebm commented 2 years ago

Perhaps this should just be:

- #if ((defined(_MSC_VER) && _MSC_VER >= 1400) && (defined(_M_IX86) || defined(_M_X64))) || ((defined(__i386__) || defined(__x86_64__)) && defined(__SSE2__)) 
+ #if defined(_MSC_VER) && _MSC_VER >= 1400 && defined(__SSE2__)
mackron commented 2 years ago

You're absolutely right, I missed that. That _M_IX86 check is in the wrong part of that line. It's currently in the x64 check, but needs to be moved to the x86+SSE2 check.

This line is actually from the minimp3 part of dr_mp3 (dr_mp3 is a wrapper around minimp3). I'll fix it in dr_mp3 in a bit. Thanks for the report!

@lieff - I think this change needs to be applied to minimp3?

mackron commented 2 years ago

I've pushed a fix to the dev branch. Are you able to give that a try?

Just reviewing that line it seems a quite confusing. I might refactor that later.

glebm commented 2 years ago

Verified that it now compiles!

sezero commented 2 years ago

Are you guys sure that MSVC really defines __SSE2__? As far as I remember, it does not (but I'd like to be corrected), therefore the _M_IX86 + __SSE2__ combination is useless there.

AJenbo commented 2 years ago

Should be _M_IX86_FP == 2, see https://docs.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-170

mackron commented 2 years ago

In that line, the __SSE2__ is intended to be used for checking support for non-MSVC compilers. But you're right, it's not defined by MSVC. It's from the minimp3 project which I didn't write and it's a different style to the way I do it. I usually just check the version of MSVC to determine if the intrinsics are available at compile time, and then I do a runtime check to determine whether or not it's safe to execute those instructions.

I've added a check for _M_IX86_FP == 2 to the dev branch. Thanks for the heads up on that one. Hopefully that hasn't broken the fix I did in the previous commit 🤞.

AJenbo commented 2 years ago

Can we close this now? @glebm Did you update after the latest change?

glebm commented 2 years ago

The issue on the xbox is fixed, I haven't updated SDL_audiolib since the latest change but the change looks OK