daniel-schuermann / mesa

Mesa 3D graphics library (mirror; no pull requests here please)
http://mesa3d.org

Detroit Become Human issues with ACO #171

Closed Oschowa closed 4 years ago

Oschowa commented 4 years ago

Detroit Become Human (both the demo and the full game) crashes on start-up with ACO enabled. It runs fine with RADV/LLVM, AMDVLK and AMDVLK-pro (although there is some corruption with LLVM on RADV and AMDVLK). On a debug build, it hits an assertion in ACO code; I'll attach the output and a SPIR-V dump. Unfortunately, trying to record a RenderDoc capture leads to a 24h Denuvo ban, so I couldn't record one. The demo is free on the Epic Games Store and should work with Lutris.

System information:

spirv-dump.txt

debug.txt

Screenshot from 2019-12-30 11-59-30

tannisroot commented 4 years ago

To get the game running easily, I suggest using Lutris: https://lutris.net/games/detroit-become-human/ (the Wine build Lutris uses for this game already contains patches that make it launch and keep it from running at 1 frame per minute). Note that it's one of those special games with DRM that will ban you for 24h if you dare to change your system configuration (like the Mesa version or Wine version).

Oschowa commented 4 years ago

I recorded a gfxreconstruct capture, which reproduces the crash/assertion with ACO when replayed: https://drive.google.com/open?id=1ojcfd0DcdJt4QF61aXR4vkpn_a3Uj1rl

valters-tomsons commented 4 years ago

wine: Unhandled page fault on read access to 000000000000000C at address 00007FF3946B272A (thread 00db)

Same with RX 590.

Oschowa commented 4 years ago

@pendingchaos I tested https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257 on top of Mesa master. With it the crash is gone, but the game keeps allocating memory until it is eventually killed. This does not happen with RADV_PERFTEST=llvm.

Here is a longer gfxreconstruct capture, which reproduces the memory allocation issue for me: https://drive.google.com/open?id=1vAaQ7ISAo07CW9YZftQtUqxPug6PIFvk

Oschowa commented 4 years ago

A debug build of RADV with the above MR applied hits an assertion before running out of memory:

mesa/src/amd/compiler/aco_spill.cpp:611: aco::RegisterDemand aco::{anonymous}::init_live_in_vars(aco::{anonymous}::spill_ctx&, aco::Block*, unsigned int): Assertion `!partial_spills.empty()' failed.

daniel-schuermann commented 4 years ago

@Oschowa is this issue still present?

Oschowa commented 4 years ago

The original issue has been fixed. However, I'm now getting GPU hangs on master when playing the demo:

[ 1720.199397] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[ 1725.319216] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!

This didn't use to happen when I was testing the WIP MRs for the crash earlier, but now I haven't found a commit that works. I tested a build of Mesa from right after !3257 was merged, fwiw. I also tried applying !3388 to master, but it made no difference.

pendingchaos commented 4 years ago

Changing this issue to be about the hang/corruption, since that seems to be what happens for me now (at the very beginning of the demo). Polaris seems to hang while Navi shows corruption. I don't know yet whether the two are related.

pendingchaos commented 4 years ago

The hang should be fixed by https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004

The corruption still exists, though, and the hang turned out to be specific to pre-GFX10, so the two are probably not related.

pendingchaos commented 4 years ago

I think all ACO-specific issues with this game are fixed: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3063