microsoft / DirectXMath

DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
https://walbourn.github.io/introducing-directxmath/
MIT License
1.51k stars 231 forks source link

Fix for MINGW cpuid.h being called instead of intrin.h. #176

Closed JPeterMugaas closed 5 months ago

JPeterMugaas commented 6 months ago

Fixes issue described at: https://github.com/microsoft/DirectXTex/pull/433

walbourn commented 6 months ago

Hmm... _WIN32 is a Windows.h only symbol, not something defined by the compiler and I by design do not require the Windows.h header for DirectXMath.

The _MSV_VER symbol means it's MSVC or clang-cl. It should not be defined for MinGW or direct clang scenarios.

Again, what specific scenario is the problem?

JPeterMugaas commented 6 months ago

Hmm... _WIN32 is a Windows.h only symbol, not something defined by the compiler and I by design do not require the Windows.h header for DirectXMath.

The _MSV_VER symbol means it's MSVC or clang-cl. It should not be defined for MinGW or direct clang scenarios.

Again, what specific scenario is the problem?

I was getting errors when trying to compile DirectxTex with OpenEXR support and got this compiler error when using both GNU GCC and CLang MINGW compiler toolchains:

In file included from C:/msys64/mingw64/include/intrin.h:41,
                 from C:/msys64/mingw64/include/Imath/half.h:183,
                 from C:/msys64/mingw64/include/OpenEXR/ImfXdr.h:79,
                 from C:/msys64/mingw64/include/OpenEXR/ImfAttribute.h:18,
                 from C:/msys64/mingw64/include/OpenEXR/ImfHeader.h:25,
                 from C:/msys64/mingw64/include/OpenEXR/ImfRgbaFile.h:22,
                 from C:/msys64/home/jpmugaas/directxtex-PKGBUILD/src/directxtex-dec2023/Auxiliary/DirectXTexEXR.cpp:46:
C:/msys64/mingw64/include/psdk_inc/intrin-impl.h:2013:42: error: macro "__cpuid" requires 5 arguments, but only 2 given
 2013 | void __cpuid(int CPUInfo[4], int InfoType);
      |                                          ^
In file included from C:/msys64/mingw64/include/directxmath/DirectXMath.h:122,
                 from C:/msys64/mingw64/include/directxmath/DirectXPackedVector.h:12,
                 from C:/msys64/home/jpmugaas/directxtex-PKGBUILD/src/directxtex-dec2023/DirectXTex/DirectXTexP.h:155,
                 from C:/msys64/home/jpmugaas/directxtex-PKGBUILD/src/directxtex-dec2023/Auxiliary/DirectXTexEXR.cpp:10:
C:/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/13.2.0/include/cpuid.h:235: note: macro "__cpuid" defined here
  235 | #define __cpuid(level, a, b, c, d)                                      \
      | 
C:/msys64/mingw64/include/psdk_inc/intrin-impl.h:2016:42: error: macro "__cpuid" requires 5 arguments, but only 2 given
 2016 | void __cpuid(int CPUInfo[4], int InfoType) {
      |                                          ^
C:/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/13.2.0/include/cpuid.h:235: note: macro "__cpuid" defined here
  235 | #define __cpuid(level, a, b, c, d)                                      \
      | 
C:/msys64/mingw64/include/psdk_inc/intrin-impl.h:2013:6: error: variable or field '__cpuid' declared void
 2013 | void __cpuid(int CPUInfo[4], int InfoType);
      |      ^~~~~~~
C:/msys64/mingw64/include/psdk_inc/intrin-impl.h:2016:6: error: variable or field '__cpuid' declared void
 2016 | void __cpuid(int CPUInfo[4], int InfoType) {
      |      ^~~~~~~
C:/msys64/mingw64/include/psdk_inc/intrin-impl.h:2017:4: error: expected primary-expression before '__asm__'
 2017 |    __asm__ __volatile__ (
      |    ^~~~~~~
C:/msys64/mingw64/include/psdk_inc/intrin-impl.h:2017:4: error: expected '}' before '__asm__'
C:/msys64/mingw64/include/psdk_inc/intrin-impl.h:2016:44: note: to match this '{'
 2016 | void __cpuid(int CPUInfo[4], int InfoType) {
      |                                            ^
C:/msys64/mingw64/include/psdk_inc/intrin-impl.h:2280:1: error: expected declaration before '}' token
 2280 | }
      | ^

The incorrect __cpuid intrinsic is being used. For Windows, we want to use the one in intrin.h and not the one from cpuid.h. Otherwise, you get the conflict you see above because OpenEXR chose to use the one from intrin.h.

My understanding about defines is this:

_WIN32 - defined for the Windows platform through Windows.h. MINGW and MS Visual C++ support this define since they have Windows.h. _MSV_VER - defined for MS Visual C++ and clang-cl. MINGW does not support this define. __clang__ - CLang compiler. This is defined regardless of platform. Thus, Clang for Windows and Linux both define it. __GNUC__ - GNU GCC compiler. This is defined regardless of platform. Thus, GCC for Windows and Linux both define it. __x86_64__ - x64 64-bit chip architecture. This is defined regardless of platform. __i386__ - Intel 386 32-bit chip architecture. This is defined regardless of platform.

There is a reference for defines at: https://sourceforge.net/p/predef/wiki/Compilers/ .

The fix I propose will select the cpuid intrinsic in intrin.h instead of the one in cpuid.h if compiling for the Windows platform in both Visual C++ and MINGW (both GNU GCC and CLang). The cpuid in cpuid.h will be used if compiling DirectXMath in Linux.

walbourn commented 6 months ago

The fix the other person proposed for DirectXMath was supposed to already address this. Hmm.

JPeterMugaas commented 6 months ago

We must have missed it earlier. I did go ahead and change _WIN32 to __MINGW32__ . Hopefully, that is more in line of what you intended.

nirbheek commented 6 months ago

Hm, I am not entirely sure I understand how this is happening.

GCC's cpuid.h does a #define __cpuid(level, a, b, c, d) — the 5-argument cpuid, and invokes assembly to call cpuid

MinGW's intrin.h is trying to emulate MSVC's intrin.h, so it does a #include <cpuid.h> then #undef __cpuid and then #include <psdk_inc/intrin-impl.h> which declares the 2-argument __cpuid (not as a pre-processor define), and uses the built-in __cpuid if you're using a new-enough GCC.

So the problem was that depending on which header(s) your project includes, a different cpuid will be available to DirectXMath. The fix was to detect which one was available by checking for defined(__cpuid), which tells us the 5-argument cpuid is being used.

I think there's something funky happening with OpenEXR's header includes here. Can you point me to the specific OpenEXR source file that is triggering this error? I can look through it and figure that out.

JPeterMugaas commented 6 months ago

I think it is in include/Imath/half.h , and the relevant block (around line 180) says:

#ifdef __CUDA_ARCH__
// do not include intrinsics headers on Cuda
#elif defined(_WIN32)
#    include <intrin.h>
#elif defined(__x86_64__)
#    include <x86intrin.h>
#elif defined(__F16C__)
#    include <immintrin.h>
#endif

include/Imath/half.h is in mingw-w64's Imath package. The choice for Imath seems to be deliberate. OpenEXR 3.2.1 depends upon Imath 3.1.9 on my system. The Imath URL is https://www.openexr.com/ and the GitHub link is https://github.com/AcademySoftwareFoundation/Imath .

I think your patch combined along with the change I posted will help the situation. I only realized what was happening when walbourn pointed out the defines used in DirectXMath ( see: https://github.com/microsoft/DirectXTex/pull/433 ).

There is an interesting question, what if the DirectXMath user does not chose a cpuid implementation at all. The code in DirectXMath would break. That's probably why the authors of Imath chose the intrin.h cpuid especially since developers often use Microsoft's Win32 implementation. I guess it became a "de facto" standard in Windows.

JPeterMugaas commented 5 months ago

Perhaps, I haven't been making clear in this PR. Technically, in MINGW, you have a choice between using cpuid.h with its 5 parameter cpuid or using intrin.h with its 2 parameter cpuid. In actual fact, you probably do not have such a choice since cpuid.h implementation conflicts with the standard 2 parameter intrin.h __cpuid following a standard documented ( https://learn.microsoft.com/en-us/cpp/intrinsics/cpuid-cpuidex?view=msvc-170 ) practice in Win32 programming.

The PR simply chooses the 2 parameters __cpuid call in MINGW. The problem was the incorrect _cpuid from cpuid.h was being selected because MINGW compilers were Clang and GNU GCC.

nirbheek commented 5 months ago

Revisiting this, I think I have to agree that it's a futile endeavour to try and support the old MinGW-GCC 5-parameter __cpuid and we should just require the 2-parameter __cpuid by explicitly doing #include <intrin.h> and #include <cpuid.h>.

So I think this PR LGTM.

walbourn commented 5 months ago

/azp run DirectXMath-GitHub,DirectXMath-GitHub-Dev17,DirectXMath-GitHub-WSL,DirectXMath-GitHub-WSL-11,DirectXMath-GitHub-MinGW,DirectXMath-GitHub-CMake-Dev17

azure-pipelines[bot] commented 5 months ago
Azure Pipelines successfully started running 3 pipeline(s).
JPeterMugaas commented 5 months ago

There is an alternative to this patch that might work on some older MINGW versions, but that approach requires a "configuration" approach in DirectXMath. I have started on that at https://github.com/microsoft/DirectXMath/pull/177 .

walbourn commented 5 months ago

/azp run DirectXMath-GitHub,DirectXMath-GitHub-Dev17,DirectXMath-GitHub-WSL,DirectXMath-GitHub-WSL-11,DirectXMath-GitHub-MinGW,DirectXMath-GitHub-CMake-Dev17

azure-pipelines[bot] commented 5 months ago
Azure Pipelines successfully started running 3 pipeline(s).