ValveSoftware / wine

Wine with a bit of extra spice
Other
1.23k stars 236 forks source link

minidumpapiset.h:MiniDumpWriteDump is crashing and not dumping anything #250

Open nwx-vidaks opened 2 weeks ago

nwx-vidaks commented 2 weeks ago

Observed on SteamDeck: Proton Version: 9.0-2 and 8.0-5 Wine Versions: wine-9.0 and wine-8.0

Summary Our game is built on Unreal Engine 5.4 and compiled for Win64. It uses Unreal's default crash reporting mechanism with no changes to the engine code. During crash handling Unreal makes a call to MiniDumpWriteDump which is a windows built-in API. On native Windows, this works fine as expected and we get minidump files. On SteamDeck, after triggering an intentional crash of the game client, the subsequent crash dump api call crashes immediately, preventing Unreal from launching a crash reporter client and sending crash logs up to our log collector. Attached is the very start of the proton log and then the tail end starting from where the explicit crash is requested. The game was launched with the following args in steam: PROTON_DUMP_DEBUG_COMMANDS=1 PROTON_LOG=1 PROTON_LOG_DIR=~ PROTON_CRASH_REPORT_DIR=~ %command%

I wasn't able to get symbolicated logs from winedbg, any help would be appreciated. Thank you! debug_log.txt

nwx-vidaks commented 2 weeks ago

Looks like it could be related to this wine bug: https://bugs.winehq.org/show_bug.cgi?id=56932

nwx-vidaks commented 1 week ago

Attached is a debug_log with +debghelp,+seh channels turned on. winedebug.log

ivyl commented 1 week ago

Hi, Proton maintainer here. That's an interesting find! Coincidentally I've been recently testing Unreal's crash dumps. I was using the FPS demo and UE 5.4.3 which was working fine. It's likely the extra complexity of real game that makes the difference here.

I'm looking forward to the full log. Is there a chance that you could provide us with a simple reproducer (e.g. stripped down game that just crashes)? That would be a huge help.

nwx-vidaks commented 1 week ago

Thanks for the reply. Correction on the unreal version, observing this on 5.3.2. MiniDumpWriteDump is getting called in WindowsPlatformCrashContext.cpp, and in our case being called by ReportCrashUsingCrashReportClient. It seems as though execution is never handed back to the game process after we enter wine space with the minidump call, so we're left with a hanging client, no crash reporter, and an empty minidump file on disk.

Highly unlikely we can provide a stripped down client. I will look at reproducing with the FPS demo. Just updated the full log, not sure why it didn't upload on the first go.

ivyl commented 1 week ago

I don't see much in the log other than some usual SEH noise, mostly for OutputDebugString() handling.

The only meaningful difference in 5.4.3 seem to be that the MiniDumpWriteDump() is now called with an extra MiniDumpWithUnloadedModules.

The resulting UEMinidump.dmp has ~13MB and I'm able to open it just fine (see: https://github.com/ivyl/Proton/blob/dbg-win/docs/DEBUGGING-WINDOWS.md - that's WIP documentation I've been toying with UE for).

I'll poke at 5.3.2. Let me know if you make any progress on your side.

EPouech commented 1 week ago

from the log, it rather looks like a hang in loading pdb information from main module can you send me the WINEDEBUG=+dbghelp,+dbghelp_msc log? Note: the log main contain sensible information derived from main module's code (basically internal types, internal structure/classes content, function names) Private message is fine. Access to .pdb file would be nice as well, but same remarks as above.

nwx-vidaks commented 1 week ago

Apologies for the delayed reply folks, I was just doing some sanity checks. Thanks for your continue interest. As you can imagine, I can't provide pdb files. I tried a couple of things (corresponding relevant portions of logs attached, with proprietary information censored):

  1. Running with +dbghelp,+dbghelp_msc. debug_msc.txt
  2. Running with +dbghelp,+dbghelp_msc and the pdb removed from the expected path (we don't ship pdbs to end users) debug_msc_nopdb.log

This line in particular makes me wonder if Unreal is competing for a lock with itself (unreal also spawns a separate crash monitor at start up, which could be competing here...just speculation): 019c:err:sync:RtlpWaitForCriticalSection section 00006FFFFFA6DFF8 "../src-wine/dlls/msvcrt/lock.c: LOCKTABLEENTRY.crit" wait timed out in thread 019c, blocked by 0194, retrying (60 sec)

I've also tried a build which simply skips the minidump call altogether. It successfully creates a CrashReportClient process (I can see the proc in windebg) but no GUI appears on the steamdeck.

This would all be much easier to debug if I could get winedbg to load our symbols from the pdb file. Without symbols I can't breakpoint on any of the wine API calls. Any ideas how I might be able get that going?

EPouech commented 5 days ago

Apologies for the delayed reply folks, I was just doing some sanity checks. Thanks for your continue interest. As you can imagine, I can't provide pdb files. no worries, and thanks for taking the time to investigate.

2. Running with `+dbghelp,+dbghelp_msc` and the pdb removed from the expected path (we don't ship pdbs to end users)
   [debug_msc_nopdb.log](https://github.com/user-attachments/files/16919423/debug_msc_nopdb.log)

I can't see any minidump calls into dbghelp for this log; is this intended? but there's in the log 03a8:err:msvcrt:_wassert (L"iff->condition.node->data_type->dimx == 1",L"../src-vkd3d/libs/vkd3d-shader/tpf.c",5240) not sure where this comes from (is this what you're trying to debug or something else?)

This line in particular makes me wonder if Unreal is competing for a lock with itself (unreal also spawns a separate crash monitor at start up, which could be competing here...just speculation): 019c:err:sync:RtlpWaitForCriticalSection section 00006FFFFFA6DFF8 "../src-wine/dlls/msvcrt/lock.c: LOCKTABLEENTRY.crit" wait timed out in thread 019c, blocked by 0194, retrying (60 sec) not 100% sure. a couple of lines above (in first log) there's a buffer overflow which could come into play. this change should get rid of it (if you can't recompile proton I can send you the recompiled dbghelp.dll as a replacement)

diff --git a/dlls/dbghelp/msc.c b/dlls/dbghelp/msc.c
index e4a63544f34..90edd542589 100644
--- a/dlls/dbghelp/msc.c
+++ b/dlls/dbghelp/msc.c
@@ -823,7 +823,7 @@ static BOOL codeview_resolve_forward_type(struct codeview_type_parse* ctp, const
                 ((decoratedref && decorateddecl && !strcmp(decoratedref, decorateddecl)) ||
                  (!decoratedref && !decorateddecl && lenref == lendecl && !memcmp(nameref, namedecl, lenref))))
             {
-                TRACE("mapping forward type %.*s (%s) %x into %x\n", lenref, nameref, debugstr_a(decoratedref), reftype, hl->id);
+                TRACE("mapping forward type %x into %x\n", reftype, hl->id);
                 *impl_type = hl->id;
                 return TRUE;
             }

I've also tried a build which simply skips the minidump call altogether. It successfully creates a CrashReportClient process (I can see the proc in windebg) but no GUI appears on the steamdeck. any assert() in the logs for this case?

This would all be much easier to debug if I could get winedbg to load our symbols from the pdb file. Without symbols I can't breakpoint on any of the wine API calls. Any ideas how I might be able get that going? (assuming running from linux box & Proton as logs show):

  • you need to set WINEPREFIX to the game prefix: $STEAMROOT/steamapps/compatdata/$GAMEID/pfx ($STREAMROOT and $GAMEID should reflect your settings)
  • and run wine from the same wine version as proton is using likely $STEAMROOT/steamapps/common/Proton\ 9.0\ (Beta)/files/bin/wine

so something like: WINEPREFIX=$STEAMROOT/steamapps/compatdata/$GAMEID/pfx $STEAMROOT/steamapps/common/Proton\ 9.0\ (Beta)/files/bin/wine winedbg

then from winedbg prompt:

info proc attach where pid is the pid listed in info proc output (don't forget 0x prefix for the pid which is listed in hex)

this should work (famous last words)

(I assumes that each .pdb file sits in the same directory as the .exe or .dll file)

HTH

ivyl commented 5 days ago
03a8:err:msvcrt:_wassert (L"iff->condition.node->data_type->dimx == 1",L"../src-vkd3d/libs/vkd3d-shader/tpf.c",5240)

not sure where this comes from (is this what you're trying to debug or something else?

This comes from our implementation of d3dcompiler*.dlls, specifically https://gitlab.winehq.org/wine/vkd3d/-/blob/master/libs/vkd3d-shader/tpf.c?ref_type=heads#L5596

To work around this you can try using the redistributable versions of those DLLs. If you can send us the offending shader (can be done privately ahiler at codeweavers com) we can take a look why our impl fails and work on improving it.

EPouech commented 5 days ago

replying to myself:

I can't see any minidump calls into dbghelp for this log; is this intended?

there's no trace inside MiniDumpWriteDump (only fixme:s) so we can't actually tell if this API is called or not :-(

nwx-vidaks commented 5 days ago

Attempted those specific winedbg commands. Wine seems to hang indefinitely when it tries to load symbols from the pdb. The steamdeck also becomes unresponsive. Lethal pdbs 😆 .

(1)(deck@steamdeck ~)$ WINEFSYNC=1 WINEESYNC=1 WINEPREFIX=/home/deck/.local/share/Steam/steamapps/compatdata/<id>/pfx .local/share/Steam/steamapps/common/Proton\ 9.0\ \(Beta\)/files/bin/wine winedbg
wine: using kernel write watches, use_kernel_writewatch 2.
wine: using kernel write watches, use_kernel_writewatch 2.
wine: using kernel write watches, use_kernel_writewatch 2.
wine: using kernel write watches, use_kernel_writewatch 2.
Wine-dbg>info proc
 pid      threads  executable (all id:s are in hex)
 000003a4 1        'start.exe'
 000003b4 1        \_ 'winedbg.exe'
=000003bc 1           \_ 'winedbg.exe'
 000003ac 2        \_ 'conhost.exe'
 00000038 10       'services.exe'
 000000f4 6        \_ 'rpcss.exe'
 000000c0 3        \_ 'svchost.exe'
 000000a0 6        \_ 'plugplay.exe'
 0000006c 10       \_ 'winedevice.exe'
 00000044 8        \_ 'winedevice.exe'
 00000028 1        'start.exe'
 00000120 5        \_ 'msvsmon.exe'
 00000154 5           \_ 'msvsmon.exe'
 0000013c 5           \_ 'msvsmon.exe'
 00000020 3        'steam.exe'
 00000188 109      \_ 'NWXClient.exe'
 000000e0 3        \_ 'explorer.exe'
 00000118 3           \_ 'tabtip.exe'
Wine-dbg>attach 0x00000188
03c0:fixme:dbghelp:elf_search_auxv can't find symbol in module
03c0:fixme:dbghelp:elf_search_auxv can't find symbol in module
0x006ffffff4f435 ntdll+0xf435: ret
Wine-dbg>break ReportCrashUsingCrashReportClient
03c0:fixme:dbghelp_msc:numeric_leaf Unsupported numeric leaf type 800a
...
03c0:fixme:dbghelp_msc:codeview_add_type_array array size should be a multiple of element size 264 80
...
03c0:fixme:dbghelp_msc:numeric_leaf Unsupported numeric leaf type 800a
...
03c0:fixme:dbghelp_msc:codeview_add_type_array array size should be a multiple of element size 128 268
...
##stuck unresponsive on CLI
nwx-vidaks commented 5 days ago

Progress! I took the pdb file out of the expected path, ran the game, and triggered a crash. I finally got a non-empty minidump in the crash dir:

(deck@steamdeck ~)$ ls -lt devkit-game/nwxremotedebug/NWX/Saved/Crashes/UECC-Windows-1B970653465337F4871767B94E92716D_0000/
total 12668
-rw-r--r-- 1 deck deck    37404 Sep  9 11:34 CrashContext.runtime-xml
-rw-r--r-- 1 deck deck     2914 Sep  9 11:34 GameUserSettings.ini
-rw-r--r-- 1 deck deck   209905 Sep  9 11:34 NWX.log
-rw-r--r-- 1 deck deck 12707085 Sep  9 11:34 UEMinidump.dmp
-rw-r--r-- 1 deck deck      160 Sep  9 11:34 CrashReportClient.ini

So that gives credence to EPouech's suspicion:

from the log, it rather looks like a hang in loading pdb information from main module

I'll try to find more detail about our pdb files but as far as I know we don't do anything special to produce them. Just a normal build through Unreal Editor.

Regardless, the issue I'm now having to deal with is that the call to CreateProcess is silently failing. I'll be trying out a UE 5.4 build, as well as some of the debug instructions from the doc you shared (now that I finally have a minidump).

nwx-vidaks commented 5 days ago

Just following https://github.com/ivyl/Proton/blob/dbg-win/docs/DEBUGGING-WINDOWS.md#live-debugging. Do you know how I can get the proton symbols downloaded on SteamDeck? I don't actually have proton listed in the steam library on the SteamDeck.

EPouech commented 4 days ago

from the pdb trace loading above:

03c0:fixme:dbghelp_msc:codeview_fetch_type Couldn't load type 113d15 03c0:fixme:dbghelp_msc:codeview_fetch_type Couldn't load type 113d2f 03c0:fixme:dbghelp_msc:codeview_fetch_type Couldn't load type 113d2f 03c0:fixme:dbghelp_msc:codeview_fetch_type Couldn't load type 113d3a 03c0:fixme:dbghelp_msc:codeview_fetch_type Couldn't load type 113d15 03c0:fixme:dbghelp_msc:codeview_fetch_type Couldn't load type 113d2f 03c0:fixme:dbghelp_msc:codeview_fetch_type Couldn't load type 113d3a

these could be the cause of crash later on

could you please:

(you can erase the type names from the log above, I don't need them)

03c0:fixme:dbghelp_msc:codeview_add_type_array array size should be a multiple of element size 264 80 question: this log entry can come from a variety of causes (but basically there's an consistency between array size and element size of the array). One of them being that the pdb can have several versions for a given struct/class description, and wine's reader doesn't pick up the right one. This happens when the module is linked with MS linker and /INCREMENTAL option, and compiled several times after altering one types. (now the question): is the build that generated the .pdb done incrementally over the same storage for .obj and .exe and .pdb files? to eliminate this case (if possible and if this case applies) erase the .pdb file & relink & retest with new pdb file

ivyl commented 4 days ago

Just following https://github.com/ivyl/Proton/blob/dbg-win/docs/DEBUGGING-WINDOWS.md#live-debugging. Do you know how I can get the proton symbols downloaded on SteamDeck? I don't actually have proton listed in the steam library on the SteamDeck.

You may need to switch the tab at the top to "TOOLS" when performing search. I've just checked on a non-developer mode Steam Deck that has a few games installed and I see Proton installations just fine.

If you figure how to make it show (either by enabling developer mode, downloading some free game or switching tabs) let me know so I can include that in the document.

Also if you have any other issues with the docs I'm also interested - those are WIP as stated before :-)

nwx-vidaks commented 4 days ago

Ok. So I think we're looking at two distinct issues here.

  1. Our pdb causing wine/the steamdeck to hang. (The pdb files are quite large ~3GB).
  2. That shader error is actually causing the crash reporter client process to die
    ...
    0194:trace:process:NtCreateUserProcess L"\\??\\Z:\\home\\deck\\devkit-game\\nwxremotedebug\\Engine\\Binaries\\Win64\\CrashReportClient.exe" pid 040c tid 0410 handles 0x1d8/0x1e0
    ....
    0410:fixme:d3dcompiler:D3DCompile2 Ignoring flags 0x8800.
    0410:err:msvcrt:_wassert (L"iff->condition.node->data_type->dimx == 1",L"../src-vkd3d/libs/vkd3d-shader/tpf.c",5240)
    0410:trace:process:NtTerminateProcess handle (nil), exit_code 3, process_exiting 0.
    0410:trace:process:NtTerminateProcess handle (nil), self 1, process_exiting 0.
    0410:trace:process:NtTerminateProcess handle 0xffffffffffffffff, exit_code 3, process_exiting 1.
    0410:trace:process:NtTerminateProcess handle 0xffffffffffffffff, self 1, process_exiting 1.

    I'm working on collecting that debug info as requested for the pdb issue now. Not sure I'm understanding how to do: /tools/winedump dump -j TPI .pdb > pdb.log Proton 9's binary directory doesn't contain a winedump binary.

w.r.t getting proton debug symbols on the deck -> got it working. Unfamiliarity with the UI at fault.

you can try using the redistributable versions of those DLLs

Mind sharing further instructions on this point?

Thanks for all the help so far folks. 🦸 🦸

ivyl commented 3 days ago

Not sure I'm understanding how to do: /tools/winedump dump -j TPI .pdb > pdb.log Proton 9's binary directory doesn't contain a winedump binary.

Hmm. We may need to change that to make troubleshootings like this easier.

Right now, if you don't have access to a Linux machine where you can install Wine via package manager, the easiest option is to extract winedump from https://steamdeck-packages.steamos.cloud/archlinux-mirror/multilib-3.5/os/x86_64/wine-8.10-1-x86_64.pkg.tar.zst.

you can try using the redistributable versions of those DLLs

Mind sharing further instructions on this point?

Our implementation of d3dcompiler_*.dll resides in system32 directory, they are used if there's no DLL with that name next to the .exe or the redistributables containing it did not override it.

In the log you should see a line like 251366.819:0184:0188:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\D3DCOMPILER_43.dll" at 00006FFFFDC20000: builtin with process id of CrashReportClient.exe.

This should tell you which version (the number) is used and which variant (builtin = Wine's implementation).

A lot of games ship those .dlls along the relevant .exes. IIRC there's an option in UE when creating a build to include redists without the installer?

I'll look into why our shader compiler doesn't like crash reporter, but that will likely take a lot of time to resolve.

Thanks for all the help so far folks. 🦸 🦸

Thanks for reaching out and your patience!

EPouech commented 3 days ago

@nwx-vidaks thanks for taking the time to investigate I'll have a look at the assert (not its cause, but maybe it should be visible to the crash handler somehow)

nwx-vidaks commented 2 days ago

More log info from the shader compile failure:

03a8:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\libvkd3d-shader-1.dll" at 00006FFFF5320000: native
03a8:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\libvkd3d-1.dll" at 00006FFFF53F0000: native
03a8:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\wined3d.dll" at 00006FFFF5470000: builtin
03a8:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\d3dcompiler_47.dll" at 00006FFFF5620000: builtin
03a8:trace:d3dcompiler:D3DCompile data 0000097F373D8880, data_size 878, filename (null), defines 0000000000000000, include 0000000000B57040, entrypoint "Main", target "vs_4_0", sflags 0x8800, eflags 0, shader 0000000000B572F0, error_messages 0000000000B57020.
03a8:trace:d3dcompiler:D3DCompile2 data 0000097F373D8880, data_size 878, filename (null), macros 0000000000000000, include 0000000000B57040, entry_point "Main", profile "vs_4_0", flags 0x8800, effect_flags 0, secondary_flags 0, secondary_data 0000000000000000, secondary_data_size 0, shader_blob 0000000000B572F0, messages_blob 0000000000B57020.
03a8:fixme:d3dcompiler:D3DCompile2 Ignoring flags 0x8800.
03a8:err:msvcrt:_wassert (L"iff->condition.node->data_type->dimx == 1",L"../src-vkd3d/libs/vkd3d-shader/tpf.c",5240)
03a8:trace:process:NtTerminateProcess handle (nil), exit_code 3, process_exiting 0.
03a8:trace:process:NtTerminateProcess handle (nil), self 1, process_exiting 0.
03a8:trace:process:NtTerminateProcess handle 0xffffffffffffffff, exit_code 3, process_exiting 1.
03a8:trace:process:NtTerminateProcess handle 0xffffffffffffffff, self 1, process_exiting 1.
nwx-vidaks commented 2 days ago

Hey folks. Just as a test I copied that specific dll from my local machine onto the deck. It works! I get a gui and everything. So there's definitely something up with wine equivalent.