multitheftauto / mtasa-blue

Multi Theft Auto is a game engine that incorporates an extendable network play element into a proprietary commercial single-player game.
https://multitheftauto.com
GNU General Public License v3.0
1.38k stars 424 forks source link

Discord Rich crash #1349

Closed Dutchman101 closed 4 years ago

Dutchman101 commented 4 years ago

Describe the bug There are 2 new crashes caused by Discord Rich implementation (or recent changes, i suspect a4f447a since i reproduced it twice on this build - MTA can barely be ran and is very unstable)

To reproduce Launch MTA, and idle in the server browser (i crashed on offset 0017e67d) or join a server (i crashed on offset 0017e4c0 after a short while)

Version Client: 1.5.7-r20492

Additional context Dumptrace:

Crash 1 (offset 0017e67d) ``` eax=00000000 ebx=16a05820 ecx=776ccae0 edx=00000000 esi=16a05824 edi=0612fd00 eip=7039e67d esp=0612fc68 ebp=0612fcf4 iopl=0 nv up ei pl nz na pe nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206 core!discord::Core::SetLogHook+0x1ad: 7039e67d 8b4008 mov eax,dword ptr [eax+8] ds:002b:00000008=???????? EXCEPTION_RECORD: (.exr -1) ExceptionAddress: 7039e67d (core!discord::Core::SetLogHook+0x000001ad) ExceptionCode: c0000005 (Access violation) ExceptionFlags: 00000000 NumberParameters: 2 Parameter[0]: 00000000 Parameter[1]: 00000008 Attempt to read from address 00000008 PROCESS_NAME: gta_sa.exe READ_ADDRESS: 00000008 ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s. EXCEPTION_CODE_STR: c0000005 EXCEPTION_PARAMETER1: 00000000 EXCEPTION_PARAMETER2: 00000008 STACK_TEXT: 0612fcf4 7027b603 00000003 70402000 7027a500 core!discord::Core::SetLogHook+0x1ad 0612fdb4 7027a657 00000000 4f97db19 760e94f0 core!CDiscordManager::Reconnect+0x173 0612feec 7027a5db 01d71190 01d22690 0612ff38 core!CDiscordManager::DoPulse+0x57 0612fefc 74945765 01badf90 749456f0 01d226b0 core!CDiscordManager::DiscordThread+0x2b 0612ff38 749488e9 089c1150 4b44a30c 74948891 pthread!ptw32_threadStart+0x75 0612ff70 760e6739 01d71190 760e6720 0612ffdc pthread!thread_start+0x58 0612ff80 776cb640 01d71190 48a8033a 00000000 kernel32!BaseThreadInitThunk+0x19 0612ffdc 776cb60a ffffffff 776f1119 00000000 ntdll!__RtlUserThreadStart+0x2f 0612ffec 00000000 74948891 01d71190 00000000 ntdll!_RtlUserThreadStart+0x1b FAULTING_SOURCE_LINE: C:\TeamCity\buildAgent\work\675e5b8e8f135823\vendor\discordgsdk\cpp\core.cpp @ 70 FAILURE_BUCKET_ID: NULL_CLASS_PTR_READ_c0000005_core.dll!discord::Core::SetLogHook ```
Crash 2 (offset 0017e4c0) ``` eax=00000000 ebx=01b1e9a8 ecx=00000000 edx=00000000 esi=01b1ef64 edi=760e94f0 eip=7039e4c0 esp=08e2fdbc ebp=08e2feec iopl=0 nv up ei pl zr na pe nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246 core!discord::Core::RunCallbackscore!discord::Core::RunCallbacks: 7039e4c0 8b01 mov eax,dword ptr [ecx] ds:002b:00000000=???????? Resetting default scope EXCEPTION_RECORD: ExceptionAddress: 7039e4c0 (core!discord::Core::RunCallbacks) ExceptionCode: c0000005 (Access violation) ExceptionFlags: 00000000 NumberParameters: 2 Parameter[0]: 00000000 Parameter[1]: 00000000 Attempt to read from address 00000000 PROCESS_NAME: gta_sa.exe READ_ADDRESS: 00000000 ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s. EXCEPTION_CODE_STR: c0000005 EXCEPTION_PARAMETER1: 00000000 EXCEPTION_PARAMETER2: 00000000 STACK_TEXT: 08e2fdb8 7027a8db 62ae9732 760e94f0 01b1e9a8 core!discord::Core::RunCallbacks 08e2feec 7027a5db 085c8fd0 01ca9f98 08e2ff38 core!CDiscordManager::DoPulse+0x2db 08e2fefc 74945765 01b1e9a8 749456f0 01ca9fb8 core!CDiscordManager::DiscordThread+0x2b 08e2ff38 749488e9 086c2688 667fd371 74948891 pthread!ptw32_threadStart+0x75 08e2ff70 760e6739 085c8fd0 760e6720 08e2ffdc pthread!thread_start+0x58 08e2ff80 776cb640 085c8fd0 65908809 00000000 kernel32!BaseThreadInitThunk+0x19 08e2ffdc 776cb60a ffffffff 776f1129 00000000 ntdll!__RtlUserThreadStart+0x2f 08e2ffec 00000000 74948891 085c8fd0 00000000 ntdll!_RtlUserThreadStart+0x1b FAULTING_SOURCE_LINE: C:\TeamCity\buildAgent\work\675e5b8e8f135823\vendor\discordgsdk\cpp\core.cpp @ 52 FAULTING_SOURCE_LINE_NUMBER: 52 FAILURE_BUCKET_ID: NULL_POINTER_READ_c0000005_core.dll!discord::Core::RunCallbacks ```

Dumps: https://upload.mtasa.com/u/916651105/dumps.zip_ (can only be accessed by MTA team, which isn't a problem anyways since contributors won't have release symbols either.. assuming that @0x416c69 will be working on this issue, you will really need to reproduce it.. or judge by the stack trace i provided. I reproduced it constantly.. should be pretty easy, the latest nightly is really, really unstable..)

Note: discussion of this issue in #development started after this message: https://discordapp.com/channels/278474088903606273/366384007535001612/696624285678043136

Dutchman101 commented 4 years ago

It's no longer instantly crashing for me on either of the offsets, but according to crash stats some users still crash on the offset that peaked yesterday (below is from r20496, client.dll @ 0017e4c0):

Dumptrace ``` CONTEXT: eax=00000000 ebx=01aa0a00 ecx=00000000 edx=00000000 esi=01aa0fbc edi=74cd9010 eip=6ff5e4c0 esp=046cfdbc ebp=046cfeec iopl=0 nv up ei pl zr na pe nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246 6ff5e4c0 8b01 mov eax,dword ptr [ecx] ds:002b:00000000=???????? Resetting default scope EXCEPTION_RECORD: ExceptionAddress: 6ff5e4c0 (core!discord::Core::RunCallbacks) ExceptionCode: c0000005 (Access violation) ExceptionFlags: 00000000 NumberParameters: 2 Parameter[0]: 00000000 Parameter[1]: 00000000 Attempt to read from address 00000000 PROCESS_NAME: gta_sa.exe READ_ADDRESS: 00000000 ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s. EXCEPTION_CODE_STR: c0000005 SYMBOL_NAME: core!discord::Core::RunCallbacks+0 MODULE_NAME: EXCEPTION_PARAMETER2: 00000000 STACK_TEXT: 046cfdb8 6fe3a8e4 72c94fa8 74cd9010 01aa0a00 core!discord::Core::RunCallbacks 046cfeec 6fe3a5db 01a9fbd0 01aa4a58 046cff38 core!CDiscordManager::DoPulse+0x2e4 046cfefc 70a05765 01aa0a00 70a056f0 01aa4a78 core!CDiscordManager::DiscordThread+0x2b 046cff38 70a088e9 01ad0190 023d5cec 70a08891 pthread!ptw32_threadStart+0x75 046cff70 74cd6359 01a9fbd0 74cd6340 046cffdc pthread!thread_start+0x58 046cff80 77297b74 01a9fbd0 a591f552 00000000 kernel32!BaseThreadInitThunk+0x19 046cffdc 77297b44 ffffffff 772b8f24 00000000 ntdll!__RtlUserThreadStart+0x2f 046cffec 00000000 70a08891 01a9fbd0 00000000 ntdll!_RtlUserThreadStart+0x1b FAULTING_SOURCE_LINE: C:\TeamCity\buildAgent\work\675e5b8e8f135823\vendor\discordgsdk\cpp\core.cpp @ 52 SYMBOL_NAME: core!discord::Core::RunCallbacks+0 IMAGE_NAME: core.dll FAILURE_BUCKET_ID: NULL_POINTER_READ_c0000005_core.dll!discord::Core::RunCallbacks ```

It's not easy to explain this, but since I have knownledge on nightly build distribution rates (and the speed these crashes appear after lifting the nightly channel) i can safely say this:

But now, just 3 minutes after briefly lifting the nightly auto-update channel, someone had the exact same crash.. we won't be collecting further stats though, because the lift was reverted for this reason (it could become worse again in a matter of hours, if the crash risk again is elevated) but mostly due to another crash I posted in #development - if neccesary, we can raise nightly and collect stats after said disconnect crash is resolved by LopSided - but this "3 minutes after nightly push" thing probably means you can speculate it's not as isolated as that "single victim on r20488", and it would grow if we kept updating nightly users. In that case, the crash risk is still elevated.

So it could be like this: 1) After the initial Discord Rich integration, crash risk was 1% (maybe border case like mentioned earlier)

2) After a4f447a the crash risk was 99% (extremely unstable)

3) After 3d92acd the crash risk got reduced to unknown (but here, instant crashing has stopped)

I mean, maybe you just need to give it a deeper thought: why can this happen.. even should account for the 'odd' case, no one should really crash like this. I read your comment in #dev that it's a point something isn't initialized yet.. so then think: when can the same scenario apply?

I'm just afraid that Discord Rich integration will be the same type of bottleneck to MTA stability as CEF has been for a long time.. I am also seeing various other crashes on Discord module and core.dll (similar offsets), they are for later and I don't have time to debug and report them all. But i am afraid of stability yeah. Maybe needs a general safety/checks in code review.

@0x416c69