jacksondunstan / UnityNativeScripting

Unity Scripting in C++
https://jacksondunstan.com/articles/3938
MIT License
1.34k stars 135 forks source link

UnityEngineTimePropertyGetDeltaTime's return type is wrong #29

Closed nifh80s closed 5 years ago

nifh80s commented 5 years ago

Hi! When the NativeScript.dll is used, Unity always crashes because the C++ UnityEngineTimePropertyGetDeltaTime function pointer's return type is System::Single which is diffrent from the return type of the C# delegate UnityEngineTimePropertyGetDeltaTimeDelegateType. UnityEngineTimePropertyGetDeltaTime should return float instead of System::Single.

jacksondunstan commented 5 years ago

System.Single and float are the same type, so that wouldn't be the cause of the crash. Can you provide more detail about the crash, such as timing or a stack trace?

nifh80s commented 5 years ago

Here is the stack trace: Stack Trace of Crashed Thread 18852: ERROR: SymGetSymFromAddr64, GetLastError: '试图访问无效的地址。 ' (Address: 00007FFB88D8355F) 0x00007FFB88D8355F (NativeScript_temp) (function-name not available) ERROR: SymGetSymFromAddr64, GetLastError: '试图访问无效的地址。 ' (Address: 00007FFB88D81200) 0x00007FFB88D81200 (NativeScript_temp) (function-name not available) 0x00007FFB88D83DE3 (NativeScript_temp) MyGameAbstractBaseBallScriptUpdate 0x000000001C65B555 (mscorlib) System.Object.wrapper_native_00007FFB88D83DC0() 0x000000001C7832C8 (Assembly-CSharp) MyGame.BaseBallScript.Update() 0x000000001C74BB08 (mscorlib) System.Object.runtime_invoke_void__this__() 0x00007FFB373DBE7B (mono-2.0-bdwgc) [c:\buildslave\mono\build\mono\mini\mini-runtime.c:2809] mono_jit_runtime_invoke 0x00007FFB37361E32 (mono-2.0-bdwgc) [c:\buildslave\mono\build\mono\metadata\object.c:2919] do_runtime_invoke 0x00007FFB3736AE3F (mono-2.0-bdwgc) [c:\buildslave\mono\build\mono\metadata\object.c:2966] mono_runtime_invoke 0x0000000140C00BDA (Unity) scripting_method_invoke 0x0000000140BF0F2A (Unity) ScriptingInvocation::Invoke 0x0000000140BB9BF7 (Unity) MonoBehaviour::CallMethodIfAvailable 0x0000000140BBA311 (Unity) MonoBehaviour::CallUpdateMethod 0x00000001406EC66C (Unity) BaseBehaviourManager::CommonUpdate<BehaviourManager> 0x00000001406F2D56 (Unity) BehaviourManager::Update 0x000000014095CD53 (Unity)InitPlayerLoopCallbacks'::`2'::UpdateScriptRunBehaviourUpdateRegistrator::Forward 0x000000014095B9A7 (Unity) ExecutePlayerLoop 0x000000014095BA73 (Unity) ExecutePlayerLoop 0x000000014095ED31 (Unity) PlayerLoop 0x000000014133C64F (Unity) PlayerLoopController::UpdateScene 0x000000014132C073 (Unity) PlayerLoopController::EnterPlayMode 0x0000000141337FF3 (Unity) PlayerLoopController::SetIsPlaying 0x000000014133AED2 (Unity) Application::TickTimer 0x000000014149703B (Unity) MainMessageLoop 0x0000000141498CD6 (Unity) WinMain 0x00000001424831BA (Unity) __scrt_common_main_seh 0x00007FFBA8A94034 (KERNEL32) BaseThreadInitThunk 0x00007FFBAB133691 (ntdll) RtlUserThreadStart

Stacks for Running Threads:`

The Chinese words mean "Access Invalid Address". It crashed at the call of Plugin::UnityEngineTimePropertyGetDeltaTime() in UnityEngine::Time::GetDeltaTime, and in disassembly it just crased at the mov instruction which moves the return value(it happened to be a 0x0 pointer) to a register after the call instruction.

According to what happend, I don't think that the C# delegate return type float is the same with the c++ struct System::Single when the delegate function is declared with MonoPInvokeCallback, although in c# float and System.Single are same.

By the way, I work on Window 10 x64 system. I built the x64 native script all with VS2017 and I used Unity2018.3.12 to test it. After I changed the c++ Plugin::UnityEngineTimePropertyGetDeltaTime's return type to float, everything is ok now.

jacksondunstan commented 5 years ago

I tried to reproduce this using the same environment: Windows 10 64-bit, VS 2017 build, and Unity 2018.3.12. Unfortunately, it didn't crash for me. I'm using a clean checkout of the repository with no modifications to the example Unity project. Do you have some other changes that made it happen? Is the issue only intermittently occurring?

nifh80s commented 5 years ago

It's so weird. With the clean checkout, I only moved the built NativeScript.dll from Assets\Plugins\ to Assets\Plugins\Editor without any codes changed, or it would report dll missing. My Win10 OS version is 1803, not sure if it's related. Thanks any way, and feel free to close this issue if everything is fine with you:).

nifh80s commented 5 years ago

Hi again! I found a reasonable explanation at this page: https://coherent-labs.com/posts/passing-a-struct-from-c-to-c-gone-wrong. After I followed the instructions and removed the constructors of System::Single, it didn't crash finally.

jacksondunstan commented 5 years ago

Thanks for the link; it was very informative! I see how the scenario described is similar to this one and how removing the constructors would fix the crash you're seeing. Unfortunately, I still can't reproduce it despite making several builds and running the example included in the repo several time. I'm reluctant to remove a useful feature (the constructors) until I can do that, so any tips you can give on how I can reproduce would still be very nice to have. In the meantime, I'm glad you have a workaround for the issue.

Jrius commented 5 years ago

Hello there,

I'm pretty sure this is the same bug as #11. Which means returning the value through a helper function with an out parameter should be a valid workaround in this case too. It seems the reason you are not able to reproduce the bug is because of different versions of the VS compiler. Until yesterday, I could return floats without crashing but the C++ side received garbage instead of a correct value. Then I upgraded to the latest version of VS and the VS compiler, and now it outright crashes instead of returning trash. I guess it does RVO a bit differently than it used to.

As for a solution... I manually modified the bindings in order to pass deltaTime's value from C# to C++ through a parameter/pointer instead of returning it (without relying on a custom helper method), and it seems this does the trick so far. It would probably be safer for the bindings generator to always use this instead of returns, as different compilers will have different ways to implement RVO. This should be nearly identical performance-wise ?

This has been bugging me for a while... Now that we have a clear explanation I might try to implement this into the generator when I have some time. Not sure when that happens, but if you're interested I can make a pull request for it afterwards.

jacksondunstan commented 5 years ago

@Jrius, feel free to submit this as a pull request! :)

EricZinda commented 5 years ago

Hi Jackson, I'm hitting this same issue, Visual Studio 2019, 64 bit, windows 8.

If I create this simple class on the C# side: public class General { public static int Test(Int32 arg) { return 5; } }

The generator converts it to this: Plugin::ManagedUtilitiesGeneralMethodTestSystemInt32 = *(System::Int32 (**)(int32_t arg))curMemory;

And it crashes when used. If I manually change it to this (in all the right places): Plugin::ManagedUtilitiesGeneralMethodTestSystemInt32 = *(int32_t (**)(int32_t arg))curMemory;

It works fine.

I know I don't understand all the cases you're covering in the code, but shouldn't the base signature in CPP always use native types (like int32_t) as opposed to the wrappers (like System::Int32)? For both arguments and return values?

Jrius commented 5 years ago

int32_t and System::Int32 have the exact same memory layout (4 bytes), so you can use either one in the signature and it will "just work" as they are compatible. Using the C# name is more consistent with the rest of the bindings, but in the end it's mostly a matter of preference.

The crash you're seeing, however, is due to System::Int32 specifying a constructor. This tips the (overzealous) compiler into thinking, "oh, it has a constructor, so always returning a copy must be expensive. Let's optimize this call by moving the memory instead of copying it !" (this is called Return-Value Optimization, if you don't use C++ regularly you may not have heard of it). The problem is, RVO expects both the method's implementation and its call to be compiled differently. However, the method's implementation is on the C# side, which doesn't know anything about RVO... See where this is going ? The C++ side invokes the non-optimized C# method as if it were RVOized, so this simply crashes everything. This is actually a problem with several types besides just int, so fixing it is not as simple as changing one or two return types.

Unfortunately, for some reason C++ does not expose any way to selectively disable the overzealous RVO for some methods. As for C#, it simply cannot guess whether RVO is used or not because RVO is implemented differently across compilers. This is why this bug happens much more rarely on Mac. In my experience, this issue got worse recently due to either the .NET 4 runtime, or an upgrade to the VS compiler.

The workaround is to never use the return keyword, and instead return by pointer parameter (or out from the C# side). You can do this pretty easily by writing your own helper method (even though it's more annoying).

I have a modified version of the project that automatically returns by pointer arg, and so far it seems to be working really well on a fairly big project ! But it's based off a custom and outdated version of the scripts, so I'll have to do a bit of work to upgrade it to the current version.

jacksondunstan commented 5 years ago

I just pushed b508933669b0c6de493996bce9d52470ba5b699c to return primitive types instead of struct types (e.g. int32_t instead of System::Int32) to fix this issue. The broader issue (#11) that @Jrius points out is still a problem, but this should fix at least the case of primitives.

EricZinda commented 5 years ago

I picked up the fix and it fixes my issue. Thanks Jackson!