Open aaronfranke opened 2 months ago
@freibold This PR is needed to fix Arm64 Windows builds with MSVC. Let me know if I can do anything to push this forward.
Hi @aaronfranke! Thanks for this PR. I had time to look at this end of last week. However, I had issues getting this to work using Visual Studio 2022 "ARM edition". It was compiling but the linker complained that it can't find "__mm_load_ps", "mm_store_ps", and other SSE intrinsic. I was hoping that adding the missing USE_SOFT_INTRINSICS in common/sys/thread.cpp would fix the issue but no luck so far. Did you add any additional CMake or CXX flags on your side?
On a more high-level I don't think the software intrinsic are the way to go. I was trying to enable ARM NEON via the SSE2NEON "library" like we do for Apple ARM but I hit a roadblock there, too. However, I think ARM NEON is the right for performance reasons. Anyhow, if you can help me out with getting your USE_SOFT_INTRINSICS solution to work, I'm happy to merge that in as a short-term Windows ARM solution.
@freibold Actually, that was where I got stuck too, it fails at the linker. https://github.com/godotengine/godot/pull/91360#issuecomment-2088580217
At the time I submitted this PR (and at the time I wrote the ping to you) I had only tested compiling before the linker.
You are likely correct that the USE_SOFT_INTRINSICS
stuff is completely wrong, but I believe the bsf
, bscf
, and bsr
changes are correct, and the #if UINTPTR_MAX == UINT64_MAX
change too. Hopefully this PR would be a good base for building a solution using Arm Neon. I'm completely lost when it comes to this, however. If you want I can remove the USE_SOFT_INTRINSICS
from this PR so you can merge in the rest of the changes.
This PR fixes 3 issues when compiling for Arm64 Windows with MSVC.
defined(__X86_64__) || defined(__aarch64__)
check with a platform-agnosticUINTPTR_MAX == UINT64_MAX
that relies onstdint.h
(please let me know if any of Embree's targets do not havestdint.h
, my assumption is that this is fine to depend on, but I know that MSVC versions from before 2010 don't have it). MSVC does not define__aarch64__
, but it does define_M_ARM64
. Adding|| defined(_M_ARM64)
would also fix the problem, but it's best to make this platform-agnostic.bsf
,bscf
, andbsr
. Without it, this function was not being defined on MSVC. Same as the above point, MSVC does not define__aarch64__
, but it does define_M_ARM64
. Adding|| defined(_M_ARM64)
would also fix the problem, but it's best to make this platform-agnostic. Usingsize_t
with#if
guards for 64-bit is the same as just usinguint64_t
. Note that functions may be automatically removed by the compiler if unused, so there is no reason to explicitly wrap it in#if
guards.#include <immintrin.h>
, check if_M_ARM64
is defined (meaning Arm64 Windows MSVC) and if so defineUSE_SOFT_INTRINSICS
. MSVC providesimmintrin.h
with software support on Arm64, but this has to be enabled explicitly. The MSVC-providedimmintrin.h
contains this code: