AsahiLinux / linux

Linux kernel source tree

[GPU LOCKUP] Deus Ex: Human Revolution DX11 tessellation #338

Open chadmed opened 1 month ago

chadmed commented 1 month ago

Deus Ex: Human Revolution (Proton Experimental, DX11) causes kernel driver crashes when tessellation is enabled. Visually, the game world loads but over the course of a few seconds rendering begins to degrade. Initially, this is seen as magenta tiles, then incorrectly rendered geometry, then a blank screen and unresponsive system. The game runs basically perfectly with tessellation disabled, modulo Wine-related issues this title has had for over a decade.

asahi-diagnose-20241021-204402.txt

jannau commented 1 month ago

https://gitlab.freedesktop.org/asahi/mesa/-/merge_requests/295 has fixes for indirect tess, so it is worth testing.

Would make sense to reference #72

alyssarosenzweig commented 1 month ago

HeapAllocator[File 2974 VM 1 GPU FW Private]::new: Failed to insert node of size 0x400000000 / align 0x8000: ENOSPC

@asahilina is this a kernel bug or a plain OOM?

asahilina commented 1 month ago

OOM, pretty sure that's the good old max layers max size render attempt and userspace didn't allocate enough kernel VM AS for it to work. If you bump that it might even work!
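
For scale, here is a minimal sketch that converts the two hex values from the HeapAllocator message into readable units. The helper name `describe_alloc` is made up for illustration and is not part of the driver or Mesa.

```python
def describe_alloc(size: int, align: int) -> str:
    """Render an allocation request in GiB / KiB (hypothetical helper)."""
    gib = size / (1 << 30)   # bytes -> GiB
    kib = align / (1 << 10)  # bytes -> KiB
    return f"size={gib:g} GiB, align={kib:g} KiB"


# Values copied from the HeapAllocator error quoted above.
print(describe_alloc(0x400000000, 0x8000))
```

So the failed request was for 16 GiB of address space at 32 KiB alignment, which fits the "max layers max size" explanation.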

chadmed commented 1 week ago

Still borked with mesa tag 20241111 (host and FEX rootfs) but... slightly less so? The GPU can recover from the fault now and the game instantly crashes. No more HeapAllocator errors, just a bunch of what look like unmapped mem/OOM issues.

[ 6804.318998] asahi 406400000.gpu:  (\________/) 
[ 6804.319008] asahi 406400000.gpu:   |        |  
[ 6804.319010] asahi 406400000.gpu: '.| \  , / |.'
[ 6804.319012] asahi 406400000.gpu: --| / (( \ |--
[ 6804.319014] asahi 406400000.gpu: .'|  _-_-  |'.
[ 6804.319015] asahi 406400000.gpu:   |________|  
[ 6804.319017] asahi 406400000.gpu: ** GPU timeout nya~!!!!! **
[ 6804.319018] asahi 406400000.gpu:   Event slot: 25
[ 6804.319022] asahi 406400000.gpu:   Timeout count: 0
[ 6804.319023] asahi 406400000.gpu:   Unk: 0
[ 6804.319026] asahi 406400000.gpu:   Fault info: FaultInfo {
                   address: 0x0,
                   sideband: 0x38,
                   vm_slot: 0x36,
                   unit_code: 0xa,
                   unit: IPF(
                       0x0,
                   ),
                   level: 0x2,
                   unk_5: 0x0,
                   read: true,
                   reason: Unmapped,
               }
[ 6804.319038] asahi 406400000.gpu:   Pending events:
[ 6804.319039] asahi 406400000.gpu:     [0:21] flags=13 value=0x1557b200
[ 6804.319042] asahi 406400000.gpu:     [1:25] flags=7 value=0x1955a800
[ 6804.319045] asahi 406400000.gpu:   Halt count: 1
[ 6804.319047] asahi 406400000.gpu:   Halted: 1
[ 6804.319049] asahi 406400000.gpu:   Attempting recovery...
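
To compare faults between runs, the scalar fields of the `FaultInfo` dump can be pulled out with a few lines of Python. This is a rough sketch that assumes the dump keeps the `key: value,` layout shown above. The function name `parse_fault_info` and the `SAMPLE` string are illustrative only, and nested fields such as `unit: IPF(...)` are deliberately skipped.

```python
import re

# Fault dump copied from the log above (dmesg prefix removed).
SAMPLE = """\
Fault info: FaultInfo {
    address: 0x0,
    sideband: 0x38,
    vm_slot: 0x36,
    unit_code: 0xa,
    unit: IPF(
        0x0,
    ),
    level: 0x2,
    unk_5: 0x0,
    read: true,
    reason: Unmapped,
}
"""

# Matches lines of the form "key: value," holding a single scalar value.
FIELD_RE = re.compile(r"^\s*(\w+):\s*([^,\s{(]+),\s*$", re.MULTILINE)


def parse_fault_info(text: str) -> dict:
    """Return the scalar FaultInfo fields as a dict (hypothetical helper)."""
    fields = {}
    for key, value in FIELD_RE.findall(text):
        if value.startswith("0x"):
            fields[key] = int(value, 16)      # hex number
        elif value in ("true", "false"):
            fields[key] = value == "true"     # boolean
        else:
            fields[key] = value               # enum-like name, e.g. Unmapped
    return fields


info = parse_fault_info(SAMPLE)
print(info["reason"], hex(info["address"]), info["read"])
```
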

asahilina commented 1 week ago

@alyssarosenzweig Don't know if that's the problem at this point, but do we want to investigate if we can have a special path for vertex-only passes without a giant tile buffer? If Mesa can know this is vertex-only it's possible we can do something like just give the GPU unmapped memory for the TVB head pointers and TPC (the TPC size can be set to zero, it's only used by the firmware to know how much to clear) and then pass dummy parameters to the fragment stage so it doesn't try to read them. I can investigate if Metal can do this or try to experimentally figure it out myself, if it's important.

alyssarosenzweig commented 19 hours ago

I mean, maybe? Do we have a reason to think this is vertex-only? The spicy case with DX12 (and only DX12, dxvk doesn't do this AFAIK) is with no attachments, but you still have to run the fragment shaders and rasterize and everything, since the FS will have global memory writes.

asahilina commented 5 hours ago

Sorry, I was confused ^^