Scags / TF2Classic-Tools

Basic tools for TF2Classic dedicated server development

Segfault on startup with TF2C.smx #6

Closed vintagepc closed 3 years ago

vintagepc commented 3 years ago

A bit of a weird one here... and yes, I did a fresh pull of the .smx and gamedata before reporting this time :wink:

Deploying a new server and getting a segmentation fault at startup just after the map cycle file:

Using map cycle file 'mapcycle27015.txt'.  ('cfg/mapcycle27015.txt' was not found.)
Segmentation fault

Moving tf2c.smx out of the SourceMod plugins directory allows the server to start normally. Loading it after the server has started does not seem to cause any issues until a map change, at which point it crashes.

GDB coredump backtrace:

#0  0x0ab908e0 in ?? ()
#1  0xebd3547d in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#2  0xebd35d02 in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#3  0xebcacda5 in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#4  0xec1b3f32 in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#5  0xec273455 in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#6  0xec2711f7 in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#7  0xec259420 in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#8  0xebfc0ade in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#9  0xebf98bcf in ?? () from /home/gamecreate/hlserver/tf2classic/bin/server_srv.so
#10 0xe77de701 in __SourceHook_MFHCls_SGD_LevelInit::Func(char const*, char const*, char const*, char const*, bool, bool) ()
   from /home/gamecreate/hlserver/tf2classic/addons/metamod27015/bin/metamod.2.tf2.so

I've removed all nonstock plugins to confirm it's not an issue with one of those using a tf2c callback incorrectly. All that remains is:

```
sm version
SourceMod Version Information:
    SourceMod Version: 1.10.0.6460
    SourcePawn Engine: 1.10.0.6460, jit-x86 (build 1.10.0.6460)
    SourcePawn API: v1 = 5, v2 = 12
    Compiled on: Jan 7 2020 13:39:09
    Built from: https://github.com/alliedmodders/sourcemod/commit/2896435
    Build ID: 6460:2896435
    http://www.sourcemod.net/

meta version
Metamod:Source Version Information
    Metamod:Source version 1.11.0-dev+1128
    Plugin interface version: 16:14
    SourceHook version: 5:5
    Loaded As: Valve Server Plugin
    Compiled on: Mar 28 2019 17:00:56
    Built from: https://github.com/alliedmodders/metamod-source/commit/98d0c0f
    Build ID: 1128:98d0c0f
    http://www.metamodsource.net/

admin-flatfile.smx
adminhelp.smx
adminmenu.smx
antiflood.smx
basechat.smx
basecommands.smx
basecomm.smx
basetriggers.smx
basevotes.smx
clientprefs.smx
funcommands.smx
funvotes.smx
playercommands.smx
sounds.smx
tf2c.smx
```

The only non-stock extensions are curl, dhooks, and smbz2. Note that the entire mm/sm setup is a copy from other servers that do not seem to have this issue.

Scags commented 3 years ago

Strange. This is only happening with a single server right? Not any of the other ones?

At a glance it appears to be a map issue, but that doesn't explain why the server only crashes with the plugin installed. I can't do much with that trace since it doesn't provide an exact memory offset from the binary (i.e. it doesn't tell me where to look). The `__SourceHook_` frame most likely isn't affecting it either.
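For anyone following along, here is a minimal sketch of why the raw frame addresses are hard to act on: an absolute address only tells you where to look once the module's load base has been subtracted from it. The load base below is hypothetical (the real one would have to come from the crashed process's memory map, e.g. `/proc/<pid>/maps`); the frame address is frame #1 from the trace above.

```python
def module_offset(frame_addr: int, load_base: int) -> int:
    """Return the offset of an absolute address within its module."""
    if frame_addr < load_base:
        raise ValueError("address lies below the module's load base")
    return frame_addr - load_base


# Frame #1 from the backtrace in the report.
frame_addr = 0xEBD3547D
# HYPOTHETICAL load base for server_srv.so; this value is NOT from the report.
load_base = 0xEB700000

# The resulting offset is what could be looked up in a disassembler.
print(hex(module_offset(frame_addr, load_base)))
```

With a real load base, the resulting offset could be looked up directly in the binary, which is the information a symbol-less trace does not provide by itself.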

Try installing Accelerator and wait for the server to crash again, then I'll have a better idea as to what's really going on.

vintagepc commented 3 years ago

Yes, a single server that is (well, should be) configured identically to others that don't have this issue. I can provide the actual srcds core dump (though it's 360 MB) if that helps.

I'll give Accelerator a try when I have some time and the server's empty; as it is, the gamedata alone seems to restore most of the functionality we need that's generic and not TF2-specific.

vintagepc commented 3 years ago

OK - I've got a crash dump; 53KV-V4UM-SCJ3

Let me know which steam ID to share it with, if any.

Scags commented 3 years ago

Sorry it's taken me a bit to get back to you, been a bit busy. I looked at the crash dump and I have no idea what's wrong. It still dumbfounds me how this is only happening on a single server and that the Tools plugin is somehow responsible.

You could try a fresh install of the game I suppose.

vintagepc commented 3 years ago

FWIW still seeing this with latest and 2.0.2 patch... all servers on the VPS in question seem to be affected by this; but the game directory is copied from a functional server. Can confirm both have identical sourcemod/metamod versions.

This suggests it's an OS issue, but that is also strange as previously we had working servers on essentially the same OS and that has only diverged more recently with system upgrades.

vintagepc commented 3 years ago

Am also seeing this issue on a second new VPS that was independently configured from the first one having the issue. I wonder if this is somehow caused by an issue with CPU virtualization configurations by this provider.

vintagepc commented 3 years ago

I could certainly arrange some hands-on time for you with one of the server instances in question if you think it would be insightful.

vintagepc commented 3 years ago

@Scags I sat down with this today and did some crude print debugging. I've narrowed the segfault down to
DHookEnableDetour(hook, true, CTeamPlayRoundBasedRules_SetInWaitingForPlayersPost);

Commenting out this line stops the crash, and leaving it enabled with guard PrintToServer() calls in the actual callback shows that the callback does not appear to get executed. It seems to be the detour invocation itself causing the issue somehow (which explains the crash timing: server startup or map change).

Going to keep picking at this, time permitting, and I'll continue to follow up here with any new discoveries in the hopes it might trigger some insight or offer new avenues to explore.

Scags commented 3 years ago

Looks like the signatures were wrong, although they were still valid signatures. Fixed now.

vintagepc commented 3 years ago

Hah, I was just looking at some crashdumps and wondering why the crash location had awfully similar signatures but did not look like a proper function entrypoint. I'll give the updated files a try.
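To illustrate how a signature can be "valid" yet still wrong, here is a minimal sketch of wildcard byte-pattern scanning. It is not how SourceMod or DHooks implement their lookup; the image bytes, patterns, and the `find_matches` helper are all made up for illustration. The point is only that a loose pattern can match more than one place in a binary, so a scanner may resolve to an address that is not the intended function entry point.

```python
from typing import List, Optional, Sequence


def find_matches(image: bytes, pattern: Sequence[Optional[int]]) -> List[int]:
    """Return every offset where pattern matches; None is a wildcard byte."""
    hits = []
    for start in range(len(image) - len(pattern) + 1):
        if all(p is None or image[start + i] == p
               for i, p in enumerate(pattern)):
            hits.append(start)
    return hits


# MADE-UP bytes: two similar-looking sequences, at offsets 0 and 8.
image = bytes([0x55, 0x89, 0xE5, 0x83, 0xEC, 0x18, 0xC3, 0x90,
               0x55, 0x89, 0xE5, 0x83, 0xEC, 0x28, 0xC3, 0x90])

# A loose signature (last byte wildcarded) matches both locations.
loose = [0x55, 0x89, 0xE5, 0x83, 0xEC, None]
# A stricter signature pins down a single location.
strict = [0x55, 0x89, 0xE5, 0x83, 0xEC, 0x28]

print(find_matches(image, loose))
print(find_matches(image, strict))
```

If a detour is installed at the wrong match, execution is redirected from somewhere that is not a function entry point, which would be consistent with crashes that happen before any callback runs.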

vintagepc commented 3 years ago

Awesome, I no longer get a segfault now.

Weird it worked fine on two installs but did not on two others.

vintagepc commented 3 years ago

Hmm... segfault on class selection now when joining. Digging....

vintagepc commented 3 years ago

WHJR-HMUG-5BVO in case you get a chance to look before I do

vintagepc commented 3 years ago

OK, got a little further. I think something is very janky with Detour (DHooks 2.2.0-detours16 and 14a, SM1.10, also seen on 1.11)

Based on some logging, the detour Posts are never called, and the last thing on the console before a segfault is always the non-Post version of a detour call. I spent some time enabling/disabling different ones, and doing so changes the crash address.

Disabling all detours does not appear to precipitate any more segfaults. No plugins loaded other than stock SM plugins and tf2c, so this should not be an issue with something else making use of exposed natives via TF2C.

vintagepc commented 3 years ago

You're off the hook, good Sir. We've confirmed it's an issue with Detour/DHooks at this point.

Thanks again for the quick turnaround on the signature fix.