alliedmodders / sourcepawn

A small, statically typed scripting language.
Other
357 stars 64 forks source link

Async JIT execution leads to crashes #840

Open Wend4r opened 1 year ago

Wend4r commented 1 year ago

Hello, I would like to see support in SourcePawn for executing JIT OP-Code in several threads at the same time.

For example, when using sv_parallel_send <numthreads> that executes SetTransmit in different threads (and in SourceMod), server does crashes image

I also need my upcoming PR to SourceMod with async

native void Call_StartFunction(Handle plugin, Function func, bool async = false);

(to make huge hierarchical miscalculations with the Navigation Meshes of the map without loading the main game thread (tick delays in milliseconds/aka server var) :P)

Wend4r commented 1 year ago
  sub_320880(
    &dword_76F380,
    "sv_parallel_send",
    "0",
    0x80000,
    "Pack and send snapshots in parallel for smoother server tick rate at the expense of spending more CPU.",
    sub_1AF5F0);

sv_parallel_send description also talks about unloading the main game thread

Wend4r commented 1 year ago

Also I remember that IMemAlloc::Alloc, IMemAlloc::Realloc and IMemAlloc::Free can be called from different threads that can hooks SourceMod DHooks. I had to rewrite everything to SourceMod C++ Extension 😠

Wend4r commented 1 year ago

Example crash details

Stack trace Function | Offset -- | -- sourcepawn.jit.x86.so!sp::PoolScope::PoolScope() | 0x2a sourcepawn.jit.x86.so!sp::CompilerBase::CompilerBase(sp::PluginRuntime*, sp::MethodInfo*) | 0x42 sourcepawn.jit.x86.so!sp::Compiler::Compiler(sp::PluginRuntime*, sp::MethodInfo*) | 0x17 sourcepawn.jit.x86.so!sp::CompilerBase::Compile(sp::PluginContext*, ke::RefPtr, int*) | 0x2d sourcepawn.jit.x86.so!sp::Environment::Invoke(sp::PluginContext*, ke::RefPtr const&, int*) | 0x1ae sourcepawn.jit.x86.so!sp::PluginContext::Invoke(unsigned int, int const*, unsigned int, int*) | 0x208 sourcepawn.jit.x86.so!sp::ScriptedInvoker::Invoke(int*) | 0x260 sourcepawn.jit.x86.so!sp::ScriptedInvoker::Execute(int*) | 0x5f sourcemod.logic.so!CForward::Execute(int*, SourceMod::IForwardFilter*) | 0x1d4
Frame context ``` Thread 7 (crashed): 0: sourcepawn.jit.x86.so!sp::PoolScope::PoolScope() + 0x2a eip: 0xf07745ca esp: 0xe2bfdbf0 ebp: 0xe2bfdc08 ebx: 0x0fbc5f20 esi: 0xe2bfdc7c edi: 0x00000000 eax: 0x00000000 ecx: 0xe2bffbd4 edx: 0x00000000 efl: 0x00010246 f07745b9 74 0d jz 0xf07745c8 f07745bb a1 4c e7 7c f0 mov eax, [0xf07ce74c] f07745c0 89 04 24 mov [esp], eax f07745c3 e8 48 22 73 07 call 0xf7ea6810 f07745c8 89 06 mov [esi], eax > f07745ca ff 40 08 inc dword [eax+0x8] f07745cd 8b 40 04 mov eax, [eax+0x4] f07745d0 85 c0 test eax, eax f07745d2 74 03 jz 0xf07745d7 f07745d4 8b 78 04 mov edi, [eax+0x4] f07745d7 89 7e 04 mov [esi+0x4], edi e2bfdbf0 01 00 00 00 77 00 00 00 84 f8 04 08 a2 b0 c0 f7 ....w........... e2bfdc00 68 dc bf e2 00 1b 40 d6 28 dc bf e2 12 64 77 f0 h.....@.(....dw. Found via instruction pointer in context ```

sourcepawn.jit.x86.zip

dvander commented 1 year ago

There is no thread safety model for SourcePawn data structures, and it'd be very difficult to define and implement one efficiently. And I'm opposed to a global interpreter lock, which defeats the point of threading. So really the only path forward is an isolation model, where separate threads do not share any resources except those that can be sent over pipes. Think something like Web Workers.

Even if we build that out, SourceMod is completely thread unsafe. Every single native would have to be audited, and the set of natives available would have to be restricted off the main thread. Otherwise it'd be a free-for-all and we'd get bug reports every day about how random thing X crashes 10% of the time.

tl;dr: a "proper" solution is not coming anytime soon. Instead, I think spot-fixes are needed for safety here. Anywhere a SourceHook callback could be called off the main thread, we should check and bypass interactions with main-thread plugins (and use mutexes if main thread data structures are at risk).

Your solution of using an extension or MM:S plugin is the right way, IMO.

Wend4r commented 1 year ago

Another case with asynchronous SDKHook_SetTransmit

Stack trace
Function
0 libc-2.31.so + 0x14dd90
1 sourcepawn.jit.x86.so!sp::ScriptedInvoker::Invoke(int*) + 0x1a7
2 sourcepawn.jit.x86.so!sp::ScriptedInvoker::Execute(int*) + 0x5f
3 sdkhooks.ext.2.csgo.so!SDKHooks::Call(CBaseEntity*, SDKHookType, CBaseEntity*) + 0x104
4 sdkhooks.ext.2.csgo.so!SDKHooks::Hook_SetTransmit(CCheckTransmitInfo*, bool) + 0x55
5 sdkhooks.ext.2.csgo.so!__SourceHook_MFHCls_SetTransmit::CMyDelegateImpl::Call(CCheckTransmitInfo*, bool) + 0x2d
6 sdkhooks.ext.2.csgo.so!__SourceHook_MFHCls_SetTransmit::Func(CCheckTransmitInfo*, bool) + 0x115
7 server.so + 0x6d66c3
8 engine.so + 0x1cc4a3
9 engine.so + 0x1d4aba
10 engine.so + 0x1d4d19
11 engine.so + 0x1d56bc
12 engine.so + 0x1d4db5
13 engine.so + 0x1d4df1
14 engine.so + 0x1d4e77
15 libvstdlib.so + 0x1a679
16 libtier0.so!CThread::ThreadProc(void*) + 0x97
17 libpthread-2.31.so!start_thread [pthread_create.c:477 + 0x3]
18 libc-2.31.so!__clone + 0x66
19 0xc68ffb40
Fyren commented 1 year ago

Reporting stacks is not helping anything. We've always known SP doesn't support running concurrently.

dvander commented 1 year ago

This is an sdkhooks bug not SP.