RenderKit / embree

Embree ray tracing kernels repository.
Apache License 2.0
2.32k stars 383 forks source link

Fix compiling for Arm64 Windows with MSVC #483

Open aaronfranke opened 2 months ago

aaronfranke commented 2 months ago

This PR fixes 3 issues when compiling for Arm64 Windows with MSVC.

#if !defined(_M_IX86) && !defined(_M_X64) && !(defined(_M_ARM64) && defined(USE_SOFT_INTRINSICS))
#error This header is specific to X86, X64, ARM64, and ARM64EC targets
#endif
aaronfranke commented 2 months ago

@freibold This PR is needed to fix Arm64 Windows builds with MSVC. Let me know if I can do anything to push this forward.

freibold commented 2 months ago

Hi @aaronfranke! Thanks for this PR. I had time to look at this end of last week. However, I had issues getting this to work using Visual Studio 2022 "ARM edition". It was compiling but the linker complained that it can't find "__mm_load_ps", "mm_store_ps", and other SSE intrinsic. I was hoping that adding the missing USE_SOFT_INTRINSICS in common/sys/thread.cpp would fix the issue but no luck so far. Did you add any additional CMake or CXX flags on your side?

On a more high-level I don't think the software intrinsic are the way to go. I was trying to enable ARM NEON via the SSE2NEON "library" like we do for Apple ARM but I hit a roadblock there, too. However, I think ARM NEON is the right for performance reasons. Anyhow, if you can help me out with getting your USE_SOFT_INTRINSICS solution to work, I'm happy to merge that in as a short-term Windows ARM solution.

aaronfranke commented 2 months ago

@freibold Actually, that was where I got stuck too, it fails at the linker. https://github.com/godotengine/godot/pull/91360#issuecomment-2088580217

At the time I submitted this PR (and at the time I wrote the ping to you) I had only tested compiling before the linker.

You are likely correct that the USE_SOFT_INTRINSICS stuff is completely wrong, but I believe the bsf, bscf, and bsr changes are correct, and the #if UINTPTR_MAX == UINT64_MAX change too. Hopefully this PR would be a good base for building a solution using Arm Neon. I'm completely lost when it comes to this, however. If you want I can remove the USE_SOFT_INTRINSICS from this PR so you can merge in the rest of the changes.