deuso / latx-build

1.5.2-rc1 AOSC wine-9.9: running a game with wined3d causes the amdgpu driver to reset #14

Open phorcys opened 3 months ago

phorcys commented 3 months ago

System: AOSC, latx 1.5.2~rc1+emukit20240529, host mesa 24.0.7, x86 runtime mesa 24.0.7

Game: NieR Replicant ver.1.22474487139

$ wine NieR\ Replicant\ ver.1.22474487139.exe

The game starts normally, but once it is running the amdgpu driver resets repeatedly:

[ 1351.527712] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[ 1351.969906] amdgpu 0000:05:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32785)
[ 1351.978822] amdgpu 0000:05:00.0: amdgpu:  in process NieR Replicant  pid 5312 thread NieR Repli:cs0 pid 5330
[ 1351.988592] amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
[ 1351.998880] amdgpu 0000:05:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00501431
[ 1352.006489] amdgpu 0000:05:00.0: amdgpu:      Faulty UTCL2 client ID: SQC (data) (0xa)
[ 1352.014097] amdgpu 0000:05:00.0: amdgpu:      MORE_FAULTS: 0x1
[ 1352.019632] amdgpu 0000:05:00.0: amdgpu:      WALKER_ERROR: 0x0
[ 1352.025251] amdgpu 0000:05:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[ 1352.031303] amdgpu 0000:05:00.0: amdgpu:      MAPPING_ERROR: 0x0
[ 1352.037010] amdgpu 0000:05:00.0: amdgpu:      RW: 0x0
[ 1352.041771] amdgpu 0000:05:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32785)
[ 1352.050676] amdgpu 0000:05:00.0: amdgpu:  in process NieR Replicant  pid 5312 thread NieR Repli:cs0 pid 5330
[ 1352.060444] amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
[ 1352.070731] amdgpu 0000:05:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 1352.078339] amdgpu 0000:05:00.0: amdgpu:      Faulty UTCL2 client ID: CB/DB (0x0)
[ 1352.085515] amdgpu 0000:05:00.0: amdgpu:      MORE_FAULTS: 0x0
[ 1352.091049] amdgpu 0000:05:00.0: amdgpu:      WALKER_ERROR: 0x0
[ 1352.096669] amdgpu 0000:05:00.0: amdgpu:      PERMISSION_FAULTS: 0x0
[ 1352.102722] amdgpu 0000:05:00.0: amdgpu:      MAPPING_ERROR: 0x0
[ 1352.108429] amdgpu 0000:05:00.0: amdgpu:      RW: 0x0
[ 1352.113189] amdgpu 0000:05:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32785)
[ 1352.122093] amdgpu 0000:05:00.0: amdgpu:  in process NieR Replicant  pid 5312 thread NieR Repli:cs0 pid 5330
[ 1352.131860] amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
[ 1352.142147] amdgpu 0000:05:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 1352.149755] amdgpu 0000:05:00.0: amdgpu:      Faulty UTCL2 client ID: CB/DB (0x0)
[ 1352.156930] amdgpu 0000:05:00.0: amdgpu:      MORE_FAULTS: 0x0
[ 1352.162464] amdgpu 0000:05:00.0: amdgpu:      WALKER_ERROR: 0x0
[ 1352.168084] amdgpu 0000:05:00.0: amdgpu:      PERMISSION_FAULTS: 0x0
[ 1352.174136] amdgpu 0000:05:00.0: amdgpu:      MAPPING_ERROR: 0x0
[ 1352.179842] amdgpu 0000:05:00.0: amdgpu:      RW: 0x0
[ 1352.184602] amdgpu 0000:05:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32785)
[ 1352.193506] amdgpu 0000:05:00.0: amdgpu:  in process NieR Replicant  pid 5312 thread NieR Repli:cs0 pid 5330
[ 1352.203274] amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
[ 1352.213560] amdgpu 0000:05:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 1352.221167] amdgpu 0000:05:00.0: amdgpu:      Faulty UTCL2 client ID: CB/DB (0x0)
[ 1352.228343] amdgpu 0000:05:00.0: amdgpu:      MORE_FAULTS: 0x0
[ 1352.233876] amdgpu 0000:05:00.0: amdgpu:      WALKER_ERROR: 0x0
[ 1352.239496] amdgpu 0000:05:00.0: amdgpu:      PERMISSION_FAULTS: 0x0
[ 1352.245549] amdgpu 0000:05:00.0: amdgpu:      MAPPING_ERROR: 0x0
[ 1352.251256] amdgpu 0000:05:00.0: amdgpu:      RW: 0x0
[ 1362.279306] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
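
A log like the one above is easier to compare between systems once it is reduced to counts. The following is a minimal sketch, not part of the original report: the function name `summarize` and the regular expressions are assumptions modeled only on the dmesg lines quoted here, so other amdgpu versions may format these messages differently.

```python
import re
from collections import Counter

# Patterns modeled on the dmesg lines quoted in this report (assumed format).
TIMEOUT_RE = re.compile(r"ring (\S+) timeout")
FAULT_RE = re.compile(r"\[gfxhub\] page fault \(src_id:\d+ ring:\d+ vmid:\d+ pasid:\d+\)")
STATUS_RE = re.compile(r"GCVM_L2_PROTECTION_FAULT_STATUS:(0x[0-9a-fA-F]+)")


def summarize(dmesg_text):
    """Count ring timeouts, gfxhub page faults, and fault status values."""
    summary = {"timeouts": Counter(), "page_faults": 0, "statuses": Counter()}
    for line in dmesg_text.splitlines():
        timeout = TIMEOUT_RE.search(line)
        status = STATUS_RE.search(line)
        if timeout:
            # Keyed by ring name, e.g. "gfx_0.0.0".
            summary["timeouts"][timeout.group(1)] += 1
        elif FAULT_RE.search(line):
            summary["page_faults"] += 1
        elif status:
            # Keyed by the raw register value as printed in the log.
            summary["statuses"][status.group(1)] += 1
    return summary
```

It could be run on saved output, for example `summarize(open("dmesg.txt").read())`, where `dmesg.txt` is a hypothetical capture of `dmesg`.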

phorcys commented 3 months ago

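After further testing, this bug seems to occur only with NieR Replicant. On AOSC, both the xhci driver and amdgpu reset. On Arch (16K kernel), amdgpu does not reset; only xhci resets.

I checked which device the xhci reset corresponds to and found that it is a keyboard and mouse attached to a KVM switch (USB 3.0 hub). On Arch, disconnecting the KVM switch's USB hub makes the fault disappear.

On AOSC the fault persists.

My guess is that on a 16K-page kernel, when latx-1.5.2-rc2 runs certain games, the ioctls that mesa gallium issues to drm/amdgpu trigger a kernel error.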
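
Since the guess above depends on the kernel page size, it may help to record that value on each system being compared. This is a minimal sketch and not part of the original report; the helper name `page_size_kib` is hypothetical, and it relies only on the standard `os.sysconf` call.

```python
import os


def page_size_kib():
    """Return the running kernel's page size in KiB (16 on a 16K-page kernel)."""
    return os.sysconf("SC_PAGE_SIZE") // 1024


print(f"kernel page size: {page_size_kib()} KiB")
```

Running `getconf PAGESIZE` in a shell reports the same value in bytes.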

moon-watch commented 1 month ago

DXVK works now.