alliedmodders / metamod-source

Metamod:Source - C++ Plugin Environment and Detour Library for the Source Engine
http://www.metamodsource.net/
Other
387 stars 90 forks source link

Linux Dota 2 Metamod segfault on shutdown #42

Open stanriders opened 6 years ago

stanriders commented 6 years ago
Thread 1 "dota2" received signal SIGSEGV, Segmentation fault.
0x00007fffefc13de0 in ?? ()
(gdb) bt
#0  0x00007fffefc13de0 in ?? ()
#1  0x00007ffffcb59572 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libvstdlib.so
#2  0x00007ffff44de323 in ?? ()
   from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/../../dota/bin/linuxsteamrt64/libserver.so
#3  0x00007ffff44debdc in ?? ()
   from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/../../dota/bin/linuxsteamrt64/libserver.so
#4  0x00007ffff3e61968 in ?? ()
   from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/../../dota/bin/linuxsteamrt64/libserver.so
#5  0x00007ffff3e67e04 in ?? ()
   from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/../../dota/bin/linuxsteamrt64/libserver.so
#6  0x00007ffff320cfa3 in ?? ()
   from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/../../dota/bin/linuxsteamrt64/libserver.so
#7  0x00007ffffd4e29e2 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#8  0x00007ffffd4d81bb in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#9  0x00007ffff4668606 in ?? ()
   from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/../../dota/bin/linuxsteamrt64/libserver.so
#10 0x00007ffff359f083 in ?? ()
   from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/../../dota/bin/linuxsteamrt64/libserver.so
#11 0x00007ffff66e3a78 in ISource2Server::Shutdown (this=0x7ffff63e9de0)
    at /mnt/c/Projects/Stuff/metamod/loader/gamedll.cpp:439
#12 0x00007ffffd7d7cad in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#13 0x00007ffffd7d89df in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#14 0x00007ffffd59e481 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#15 0x00007ffffd4c8bab in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#16 0x00007ffffd4c9149 in Source2Main () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#17 0x000000000800118f in ?? ()
#18 0x00007ffffe870830 in __libc_start_main (main=0x80010c0, argc=8, argv=0x7ffffffde3c8, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffffffde3b8) at ../csu/libc-start.c:291
#19 0x0000000008001225 in _start ()
(gdb)

If I remove all MM plugins backtrace becomes smaller

Thread 1 "dota2" received signal SIGSEGV, Segmentation fault.
0x00007ffffcb61832 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libvstdlib.so
(gdb) bt
#0  0x00007ffffcb61832 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libvstdlib.so
#1  0x00007ffff0825f90 in ?? () from /mnt/c/Projects/Stuff/dota2/game/dota/bin/linuxsteamrt64/libhost.so
#2  0x00007ffffd7d7cad in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#3  0x00007ffffd7d89df in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#4  0x00007ffffd59e481 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#5  0x00007ffffd4c8bab in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#6  0x00007ffffd4c9149 in Source2Main () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libengine2.so
#7  0x000000000800118f in ?? ()
#8  0x00007ffffe870830 in __libc_start_main (main=0x80010c0, argc=8, argv=0x7ffffffde3c8, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffffffde3b8) at ../csu/libc-start.c:291
#9  0x0000000008001225 in _start ()
(gdb)

At that point I thought it's dota issue, but if I remove metamod completely dota starts and stops properly without segfaulting.

Tested on WSL Ubuntu and Debian 9 Minimal.

psychonic commented 6 years ago

This has been a long-standing issue since porting MM:S to Source 2. I've not been able to find the cause yet.

stanriders commented 6 years ago

Ah, that's sad. Alright, I'll try to find what causing that myself as well

psychonic commented 6 years ago

Good luck! Given where it crashes, my best guess is that it's due to some change (in MM:S for S2, or in vstdlib) with ConVar/ConCommand registration. I think that's the only real thing we use from there in vstdlib, and they do get cleaned up at shutdown.

stanriders commented 6 years ago

Fiddling around with shutdown sequence made me sure that you're right, there's something wrong with ConVar unregs. Best trace i had so far looks like this:

0x00007ffffcb5bae0 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libvstdlib.so
(gdb) bt
#0  0x00007ffffcb5bae0 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libvstdlib.so
#1  0x00007ffff02cfcdd in SMConVarAccessor::Unregister (this=0x7ffff0500f50 <g_SMConVarAccessor>,
    pCommand=0x7ffff0501030 <meta_local_cmd>) at /mnt/c/Projects/Stuff/metamod/core/provider/console.cpp:146
#2  0x00007ffff02cfc7f in SMConVarAccessor::RemoveMetamodCommands (this=0x7ffff0500f50 <g_SMConVarAccessor>)
    at /mnt/c/Projects/Stuff/metamod/core/provider/console.cpp:68
#3  0x00007ffff02d044b in BaseProvider::Notify_DLLShutdown_Pre (this=0x7ffff04ff938 <g_Ep1Provider>)
    at /mnt/c/Projects/Stuff/metamod/core/provider/provider_ep2.cpp:256
#4  0x00007ffff02c2a67 in mm_UnloadMetamod () at /mnt/c/Projects/Stuff/metamod/core/metamod.cpp:530
#5  0x00007ffff02de241 in GameDllBridge::Unload (this=0x7ffff04ff940 <mm16_gamedll_bridge>)
    at /mnt/c/Projects/Stuff/metamod/core/gamedll_bridge.cpp:84
#6  0x00007ffff66e3ab9 in ISource2Server::Shutdown (this=0x7ffff63e9de0)
    at /mnt/c/Projects/Stuff/metamod/loader/gamedll.cpp:427

I'll keep looking

stanriders commented 6 years ago

Looks like ICvar::UnregisterConCommand now has one more param, UnregisterConCommand( ConCommandBase *pCommandBase, int iUnknown ) = 0; works. Adding that and commenting out mm_UnloadLibrary(gamedll_lib); in ISource2Server::Shutdown() seems to fix segfaulting at least when plugins don't use some of ICVar functions.

I do understand why unregs might cause segfaults but I really don't understand why unloading gamedll_lib (and only that, unloading metamod doesn't affect shutdown seq) is failing. Have any ideas?

Meanwhile I'll keep reversing vstdlib

psychonic commented 6 years ago

Great catch! It's still not fully making sense since the issue was present back when the Mac binaries still had symbols, where the param types were visible. I also don't see a second param being used/present in the disassembly of the current bin. However, can't argue with results!

As for the mm_UnloadLibrary issue, I'll have to look into it.

stanriders commented 6 years ago

My disassembly looks something like this:

(0x00000001BA80)
int __fastcall CCvar::UnregisterConCommand(__int64 pCommandBase, __int64 iUnknown)
{
  __int64 v2; // rbx
  int result; // eax
  __int64 v4; // rdx
  unsigned __int8 *v5; // rax

  v2 = iUnknown;
  result = (*(__int64 (__fastcall **)(__int64))(*(_QWORD *)iUnknown + 80LL))(iUnknown);
  if ( (_BYTE)result )
  {
    pthread_mutex_lock((pthread_mutex_t *)(pCommandBase + 208));
    *(_BYTE *)(iUnknown + 16) = 0;
    v4 = *(_QWORD *)(pCommandBase + 64);
    if ( v4 )
    {
      if ( iUnknown == v4 )
      {
        *(_QWORD *)(pCommandBase + 64) = *(_QWORD *)(iUnknown + 8);
LABEL_10:
        *(_QWORD *)(iUnknown + 8) = 0LL;
        v5 = (unsigned __int8 *)sub_1AE80(pCommandBase + 72, iUnknown);
        if ( v5 != (unsigned __int8 *)-1LL )
          sub_1BA10((_QWORD *)(pCommandBase + 72), v5);
      }
      else
      {
        while ( 1 )
        {
          iUnknown = *(_QWORD *)(v4 + 8);
          if ( !iUnknown )
            break;
          if ( v2 == iUnknown )
          {
            *(_QWORD *)(v4 + 8) = *(_QWORD *)(iUnknown + 8);
            goto LABEL_10;
          }
          v4 = *(_QWORD *)(v4 + 8);
        }
      }
    }
    result = pthread_mutex_unlock((pthread_mutex_t *)(pCommandBase + 208));
  }
  return result;
}
0x00007fffefbf0321      217             g_pCVar->UnregisterConCommand(&test11_command, 0);
(gdb) si
0x00007ffffcb5ba80 in ?? () from /mnt/c/Projects/Stuff/dota2/game/bin/linuxsteamrt64/libvstdlib.so

vstdlib starting address is 0x7FFFFCB40000

psychonic commented 6 years ago

The first param passed should be the this ptr (g_pCVar). The second param is the ConCommandBase ptr.

In the disassembly there, near the start, you can see that if it was actually using 0/NULL as iUnknown, it would be segfaulting on this line. result = (*(*iUnknown + 80LL))(iUnknown);

stanriders commented 6 years ago

Hmm, you're right. Why does it work then?

psychonic commented 6 years ago

That's the mystery!

stanriders commented 6 years ago

Hm, I found something confusing. If you change everything related to convar flags in convar.cpp/.h (tier1) from int to int64 it also stops segfaults from happening. Although unloading gamedll_lib still fails.

psychonic commented 6 years ago

That's indeed confusing, but I'll have to try it out.

The main ABI points should be the virtual functions and the class members, all which do all use int64 for flags as far as I can tell. The rest should not matter, although it's probably still good idea to update them regardless for consistency.

psychonic commented 6 years ago

It still crashed for me after updating those. I did look further into the actual crash however, and it's in CCvar::RemoveSplitScreenConVars, suggesting that there still is an issue with flags somewhere (such as a ConVar that MM:S creates mistakenly getting the FCVAR_SS set on it, possibly due to uninitialized memory in flags.

psychonic commented 6 years ago

I updated the flags var in all usages I could find in the latest commit. As a sanity check, could you compare to your files to see if I missed anywhere? I'm still not yet convinced it's the fix for this issue, but needed to be done regardless.

stanriders commented 6 years ago

You've missed all ConVar_Register's and \tier1\convar.cpp#40, \tier1\convar.cpp#1150. At least that's how it is in my files. It's not a proper fix because it still fails if you keep unloading gamedll_lib on shutdown, although so far it worked well without unloading for me

psychonic commented 6 years ago

Ah ha! Thank you. That makes sense and does explain the crash perfectly.

I can confirm the other issue now as well and will look into that.

stanriders commented 6 years ago

Can you explain why that happens? I'm curious what exactly goes wrong there

psychonic commented 6 years ago

I'm not sure yet.

It's possible that in Source 2, some additional function from server.dll/so gets called after IAppSystem::Shutdown.

psychonic commented 6 years ago

I'm probably about to give up on this for the day, but the issue does seem like it could still be related to ConVars.

On Windows (using the win32 build), this is the stack trace I'm getting.

    23a20cd0()  Unknown
    [Frames below may be incorrect and/or missing]  
    server.dll!2230f9cd()   Unknown
    server.dll!223104b6()   Unknown
    server.dll!2231063f()   Unknown
    server.dll!21ddb7f7()   Unknown
    server.dll!21a0cf92()   Unknown
    server.dll!223291aa()   Unknown
    server.dll!21d16a18()   Unknown
    server.dll!21507636()   Unknown
    engine2.dll!0fb04108()  Unknown
    engine2.dll!0fafdd52()  Unknown
    server.dll!222484a4()   Unknown
    server.dll!218f89e6()   Unknown
>   server.dll!ISource2Server::Shutdown() Line 442  C++
    engine2.dll!0fb8ce51()  Unknown
    engine2.dll!0fb8a3c4()  Unknown
    engine2.dll!0faf7d0c()  Unknown
    engine2.dll!0fafaf71()  Unknown
    dota2.exe!00e61287()    Unknown
    dota2.exe!00e618d3()    Unknown
    [External Code] 

The game's server.dll is at 21480000 - 23438000, and the top address in the stack isn't in any module. The next function down is ConVar::ChangeStringValue. I'm not sure yet what to make of any of that, but thought I'd share the info I have so far.

Edit: looks like it's crashing on calling a change callback. Not sure if ConVar-local or Global ones.

psychonic commented 6 years ago

I realized that in one of the plugins I had loaded, there was a global ConVar change callback installed (and maybe you do too?). If I remove that, it's back to crashing in CCvar::RemoveSplitScreenConVars again.

>   vstdlib.dll!5afc89d7()  Unknown
    [Frames below may be incorrect and/or missing, no symbols loaded for vstdlib.dll]   
    host.dll!231bc0c1() Unknown
    engine2.dll!0f70ce51()  Unknown
    engine2.dll!0f70a3c4()  Unknown
    engine2.dll!0f677d0c()  Unknown
    engine2.dll!0f67af71()  Unknown
    dota2.exe!00e61287()    Unknown
    dota2.exe!00e618d3()    Unknown
    [External Code] 
stanriders commented 6 years ago

Yes, I had global change callback as well but I've disabled it while I was looking at cvar crashes.

psychonic commented 6 years ago

With the global change callback disabled, do you still get a crash in vstdlib at shutdown? (Presumably the same CCvar::RemoveSplitScreenConVars crash I see)

stanriders commented 6 years ago

Yes I do

stanriders commented 6 years ago

Any progress?

psychonic commented 6 years ago

Not really, sorry. Got sidetracked with other things. I'll try to get back to it later this week.