jrouwe / JoltPhysics

A multi core friendly rigid body physics and collision detection library. Written in C++. Suitable for games and VR applications. Used by Horizon Forbidden West.
MIT License
6.42k stars 414 forks source link

Misaligned ABI when using JPH_DOUBLE_PRECISION causes a problem on the VirtualCharacter #1071

Closed AndreaCatania closed 4 months ago

AndreaCatania commented 4 months ago

Hello, I'm integrating Jolt into UE5 and while integrating the VirtualCharacter I've encountered a weird crash.

After a bunch of tests I've noticed that CharacterVirtual->GetPosition(); and CharacterVirtual->GetRotation(); returned a different value than the one specified during the initialization so I added two checks to verify that.

JPH::CharacterVirtualSettings Settings;
[...]

JPH::CharacterVirtual *CharacterVirtual = new JPH::CharacterVirtual(
    &Settings,
    Position,
    Quat,
    JWorld->World_GetPhysicsSystem());

const JPH::RVec3 FetchPosition = CharacterVirtual->GetPosition();
const JPH::Quat FetchQuat = CharacterVirtual->GetRotation();

check(FetchPosition == Position);
check(FetchQuat == Quat);

The above checks are not satisfied. I suspected an ABI misalignment. I've then found that compiling the library with double precision disabled (-DDOUBLE_PRECISION=OFF) fixed the issue.

Consider that on the app side I've defined JPH_DOUBLE_PRECISION using PublicDefinitions.Add("JPH_DOUBLE_PRECISION=1"); so I'm not sure what is causing this issue. Do you have any idea?

jrouwe commented 4 months ago

The only thing I can think of is that you have a mismatch in the JPH_DOUBLE_PRECISION define between the two projects. The CharacterVirtual constructor literally just puts the position and rotation as a member and GetPosition/Rotation returns that member. By stepping through the disassembly you can see if both sides use 256 bit registers and if the offsets where it stores/gets the values are the same.

AndreaCatania commented 4 months ago

I wrote #ifdef JPH_DOUBLE_PRECISION to verify that on Unreal Engine side the macro is defined and indeed it's defined. Which is weird. Consider that everything else works just fine. It's just the VirtualCharacter affected. I'll try to verify what you suggest, but I'm not entirely sure how to do it properly. Though, isn't the above test (using #ifdef JPH_DOUBLE_PRECISION) just enough to verify that the macro is indeed defined properly?

AndreaCatania commented 4 months ago

image

Check this out, the inPosition has a different value when compared to mPosition (on the VirtualCharacter); both should be the same at that point.

jrouwe commented 4 months ago

I would say this is a typical symptom of a mismatch in preprocessor defines, but maybe there are other compiler flags that affect the class layout.

jrouwe commented 4 months ago

I wrote #ifdef JPH_DOUBLE_PRECISION to verify that on Unreal Engine side the macro is defined and indeed it's defined. Which is weird.

And have you done the same in the Jolt code to verify that it too is being compiled with JPH_DOUBLE_PRECISION?

AndreaCatania commented 4 months ago

I would say this is a typical symptom of a mismatch in preprocessor defines, but maybe there are other compiler flags that affect the class layout.

Yeah, I totally agree. I think JPH_DOUBLE_PRECISION is not the main reason.

And have you done the same in the Jolt code to verify that it too is being compiled with JPH_DOUBLE_PRECISION?

Yes, I just tested it again. I've added the following static function, inside the CharacterVirtual.cpp, and it returns true, when called from UnrealEngine.

image

This is the command I use to compile jolt:

cmd.exe /c cmake.exe  -S [...]\Jolt\LibraryJolt\JoltPhysics\Build  -B [...]\Jolt\LibraryJolt\JOLT_BINARY  -G "Visual Studio 17 2022"  -DCROSS_PLATFORM_DETERMINISTIC=ON  -DDOUBLE_PRECISION=ON  -DOBJECT_LAYER_BITS=16  -DINTERPROCEDURAL_OPTIMIZATION=ON  -DCMAKE_EXPORT_COMPILE_COMMANDS=ON  -DTARGET_SAMPLES=OFF  -DTARGET_HELLO_WORLD=OFF  -DTARGET_VIEWER=OFF  -DTARGET_UNIT_TESTS=OFF  -DTARGET_PERFORMANCE_TEST=OFF  -DUSE_AVX512=OFF  -DUSE_AVX2=OFF  -DUSE_AVX=OFF  -DUSE_SSE4_2=ON  -DUSE_SSE4_1=ON  -DUSE_LZCNT=ON  -DUSE_TZCNT=ON  -DUSE_F16C=ON  -DUSE_FMADD=ON  -DUSE_STATIC_MSVC_RUNTIME_LIBRARY=0

I wonder, is it possible to get the list of all the preprocessor-defines defined by CMake?

jrouwe commented 4 months ago

It looks like you're generating a visual studio project, so wouldn't right clicking the project, Properties, C++, Command Line tell you the exact command line that is used (including all preprocessor defines)?

AndreaCatania commented 4 months ago

This is the command line defines:

\Jolt\LibraryJolt\JOLT_BINARY\Debug\Jolt.pdb" /Zc:inline /fp:precise /D "_MBCS" /D "WIN32" /D "_WINDOWS" /D "_DEBUG" /D "JPH_PROFILE_ENABLED" /D "JPH_DEBUG_RENDERER" /D "JPH_FLOATING_POINT_EXCEPTIONS_ENABLED" /D "JPH_DOUBLE_PRECISION" /D "JPH_CROSS_PLATFORM_DETERMINISTIC" /D "JPH_OBJECT_LAYER_BITS=16" /D "JPH_USE_SSE4_1" /D "JPH_USE_SSE4_2" /D "JPH_USE_LZCNT" /D "JPH_USE_TZCNT" /D "JPH_USE_F16C" /D "CMAKE_INTDIR=\"Debug\"" /fp:except- /errorReport:prompt /WX /Zc:forScope /RTC1 /GR- /Gd /MDd /std:c++17 /FC /Fa"Jolt.dir\Debug\" /EHsc /nologo /Fo"Jolt.dir\Debug\" 

This is the check on the Unreal Engine side.

#ifndef JPH_PROFILE_ENABLED
    check(false);
#endif
#ifndef JPH_DEBUG_RENDERER
    check(false);
#endif
#ifndef JPH_FLOATING_POINT_EXCEPTIONS_ENABLED
    check(false);
#endif
#ifndef JPH_DOUBLE_PRECISION
    check(false);
#endif
#ifndef JPH_CROSS_PLATFORM_DETERMINISTIC
    check(false);
#endif
#ifndef JPH_OBJECT_LAYER_BITS
    check(false);
#endif
#ifndef JPH_USE_SSE4_1
    check(false);
#endif
#ifndef JPH_USE_SSE4_2
    check(false);
#endif
#ifndef JPH_USE_LZCNT
    check(false);
#endif
#ifndef JPH_USE_TZCNT
    check(false);
#endif
#ifndef JPH_USE_F16C
    check(false);
#endif

I assume these are the only "external" defines which are required to have on the Unreal Engine side too. Everything seems correct till this point. I wonder if the library linking is broken instead, somehow.

AndreaCatania commented 4 months ago

@jrouwe using this function: JPH::GetConfigurationString(); I've obtained a string containing all the preprocessor defines, from Unreal Engine.

Double precision x86 64-bit with instructions: SSE2 SSE4.1 SSE4.2 F16C LZCNT TZCNT FMADD (Cross Platform Deterministic) (FP Exceptions) 

Then, I've duplicated that function inside a jolt lib CPP file and compared the two strings:

Double precision x86 64-bit with instructions: SSE2 SSE4.1 SSE4.2 F16C LZCNT TZCNT (Cross Platform Deterministic) (FP Exceptions)

As you can notice, even if jolt is compiled using -DUSE_FMADD=ON JPH_FMADD is not defined inside the Jolt Lib. I've mentioned this problem here. We need a more reliable way to detect when these macros are defined or not.

Anyway, I've removed the extra and not used preprocessor define JPH_FMADD, but the issue is still there. I'm still investigating.

AndreaCatania commented 4 months ago

The CharacterVirtual alignment is 32 on both sides, while it has a different size. 416 byte on the Jolt Library vs 384 on the Unreal Engine side.

image

jrouwe commented 4 months ago

If I run cmake with your options (with -DTARGET_SAMPLES=ON so I have an executable to measure the size in) then if I add to the constructor of CharacterVirtual Trace("size=%d, align=%d", sizeof(CharacterVirtual), alignof(CharacterVirtual)); the result is size=448, align=32

I use a MSVC plugin called Struct Layout to display the layout of the class and this agrees:

image

However in MSVC 17.9 they added a tool called 'Memory Layout' which agrees with your measurement:

image

The difference is in the beginning of the class, Struct Layout aligns the base class RefTarget<...> to 16 bytes:

image

While Memory Layout does not:

image

If I add Trace("mSystem=%d", offsetof(CharacterVirtual, mSystem)); then the output is mSystem=40, which agrees with Struct Layout.

Switching to MSVC 2019 or a MSVC Clang-17 build gives me the same trace output.

Running the same code on Linux with clang however gives me size=384, align=32 and mSystem=16 just like Unreal.

It appears that MSVC aligns the first member of the class to the largest alignment of any of the members of the class which doesn't really make sense to me, but a simple experiment in godbolt confirms it. If you switch from MSVC to clang x64 18 and suppress a warning -Wno-invalid-offsetof then the static asserts trigger:

<source>:18:15: error: static assertion failed due to requirement 'sizeof(TestClass) == 48'
   18 | static_assert(sizeof(TestClass) == 48);
      |               ^~~~~~~~~~~~~~~~~~~~~~~
<source>:18:33: note: expression evaluates to '32 == 48'
   18 | static_assert(sizeof(TestClass) == 48);
      |               ~~~~~~~~~~~~~~~~~~^~~~~
<source>:19:15: error: static assertion failed due to requirement '__builtin_offsetof(TestClass, mValue) == 16'
   19 | static_assert(offsetof(TestClass, mValue) == 16);
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/clang-18.1.0/lib/clang/18/include/__stddef_offsetof.h:11:24: note: expanded from macro 'offsetof'
   11 | #define offsetof(t, d) __builtin_offsetof(t, d)
      |                        ^
<source>:19:43: note: expression evaluates to '8 == 16'
   19 | static_assert(offsetof(TestClass, mValue) == 16);
      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
<source>:20:15: error: static assertion failed due to requirement '__builtin_offsetof(TestClass, mValue2.mValue) == 32'
   20 | static_assert(offsetof(TestClass, mValue2.mValue) == 32);
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/clang-18.1.0/lib/clang/18/include/__stddef_offsetof.h:11:24: note: expanded from macro 'offsetof'
   11 | #define offsetof(t, d) __builtin_offsetof(t, d)
      |                        ^
<source>:20:51: note: expression evaluates to '16 == 32'
   20 | static_assert(offsetof(TestClass, mValue2.mValue) == 32);
      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
3 errors generated.
Compiler returned: 1

Running either clang++.exe or clang-cl.exe on Windows don't trigger this error, so it must be how the windows ABI is defined. I have no idea what Unreal does to negate this, but you could try the test code in Unreal to see what it does and you could try finding out what offsetof(CharacterVirtual, mSystem) is on both sides to see if this is indeed what the issue is.

AndreaCatania commented 4 months ago

Thanks a lot for the detailed answer. offsetof(CharacterVirtual, mSystem) returns 40 from both sides. I've verified that the offsets for all the properties of CharacterBase are correctly aligned between the two projects.

The misalignment starts from CharacterVirtual and indeed it's first property is misaligned:

JOLT offsetof(CharacterVirtual, mListener); 224
UNREAL offsetof(CharacterVirtual, mListener); 208

image

I think the problem can be fixed by using the same toolchain that unreal is using to compile Jolt. I just need to figure out how, now.

jrouwe commented 4 months ago

Ok, in the mean time I reported the problem with the Memory Layout tool to Microsoft here.

Converting this to a discussion as I don't think there's a bug in Jolt.