Closed by lionsnob 2 years ago
Hi @lionsnob, thanks for reporting. Can you please share your card's video BIOS?
Sure - is there a guide on how to do so?
You can likely dump it using one of the methods described here, or find the one equivalent to your card at https://www.techpowerup.com/vgabios/
Actually, just attaching the contents of /sys/class/drm/card0/device/pp_table (assuming your 6400 is card0) would do.
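In case it helps, here is a minimal sketch of copying that sysfs file to a regular file that can be attached to the issue. The function name `dump_pp_table`, the output file name `pp_table.bin`, and the `sysfs_root` parameter are just illustrative choices, not part of upp; the card name is assumed to be card0 as above, and reading the file may require root on some systems.

```python
from pathlib import Path


def dump_pp_table(card="card0", out_file="pp_table.bin", sysfs_root="/sys/class/drm"):
    """Copy the PowerPlay table exposed by amdgpu to a regular file.

    `card`, `out_file` and `sysfs_root` are illustrative defaults; adjust
    `card` if the RX 6400 is not card0 on your system.
    """
    src = Path(sysfs_root) / card / "device" / "pp_table"
    data = src.read_bytes()            # raw binary table as exposed by the driver
    Path(out_file).write_bytes(data)   # plain copy, no modification
    return len(data)


if __name__ == "__main__":
    size = dump_pp_table()
    print(f"Saved {size} bytes to pp_table.bin")
```

The resulting pp_table.bin can then be attached to the issue as-is.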
Thanks to a kind donor, I got an RX 6400 to test upp against. The latest release, 0.1.7, fixes the table parsing for Navi 24 (aka Beige Goby). However, committing the pp_table to the kernel (5.19.4-arch1-1) crashes the driver hard; even an unmodified pp_table crashes it. GPU reset also fails consistently:
[ 8097.633086] amdgpu 0000:0c:00.0: amdgpu: smu driver if version = 0x0000000d, smu fw if version = 0x0000000f, smu fw program = 0, version = 0x00491b00 (73.27.0)
[ 8097.633092] amdgpu 0000:0c:00.0: amdgpu: SMU driver if version not matched
[ 8097.633128] amdgpu 0000:0c:00.0: amdgpu: use vbios provided pptable
[ 8102.480286] amdgpu 0000:0c:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000006 SMN_C2PMSG_82:0x00000000
[ 8102.480291] amdgpu 0000:0c:00.0: amdgpu: Failed to enable requested dpm features!
[ 8102.480292] amdgpu 0000:0c:00.0: amdgpu: Failed to setup smc hw!
[ 8102.480293] amdgpu 0000:0c:00.0: amdgpu: smu reset failed, ret = -62
[ 8102.494799] audit: type=1106 audit(1661617692.249:184): pid=22805 uid=1000 auid=1000 ses=4 msg='op=PAM:session_close grantors=pam_systemd_home,pam_limits,pam_unix,pam_permit acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 8102.494839] audit: type=1104 audit(1661617692.249:185): pid=22805 uid=1000 auid=1000 ses=4 msg='op=PAM:setcred grantors=pam_faillock,pam_permit,pam_faillock acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 8102.717511] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 8107.847760] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=336029, emitted seq=336030
[ 8107.848133] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1202 thread gnome-shel:cs0 pid 1214
[ 8107.848483] amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
[ 8107.848625] ------------[ cut here ]------------
[ 8107.848626] amdgpu 0000:0c:00.0: SMU uninitialized but power ungate requested for 6!
[ 8107.848643] WARNING: CPU: 11 PID: 20553 at drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:227 smu_dpm_set_power_gate+0x193/0x1b0 [amdgpu]
[ 8107.849001] Modules linked in: hid_logitech_hidpp joydev mousedev hid_logitech_dj usbhid btusb btrtl btbcm btintel btmtk intel_rapl_msr intel_rapl_common bluetooth snd_hda_codec_realtek vfat ecdh_generic fat rfkill snd_hda_codec_generic edac_mce_amd ledtrig_audio snd_hda_codec_hdmi amdgpu kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm gpu_sched irqbypass snd_hda_core crct10dif_pclmul drm_ttm_helper snd_hwdep crc32_pclmul ttm ghash_clmulni_intel snd_pcm aesni_intel snd_timer wmi_bmof crypto_simd igb drm_display_helper cryptd snd rapl cec tpm_crb soundcore pcspkr sp5100_tco ccp dca k10temp i2c_piix4 tpm_tis tpm_tis_core wmi tpm gpio_amdpt gpio_generic mac_hid rng_core acpi_cpufreq pkcs8_key_parser dm_multipath dm_mod ledtrig_timer sg crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 nvme crc32c_intel xhci_pci nvme_core xhci_pci_renesas raid1 md_mod
[ 8107.849043] Unloaded tainted modules: amd64_edac():1 amd64_edac():1 amd64_edac():1 amd64_edac():1 amd64_edac():1 pcc_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 pcc_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 amd64_edac():1 amd64_edac():1 pcc_cpufreq():1 pcc_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 amd64_edac():1 fjes():1 pcc_cpufreq():1 fjes():1 pcc_cpufreq():1 fjes():1 pcc_cpufreq():1 fjes():1 pcc_cpufreq():1
[ 8107.849065] CPU: 11 PID: 20553 Comm: kworker/u64:3 Not tainted 5.19.4-arch1-1 #1 a79159a06114f186ee90746caae91e23b0bedae9
[ 8107.849068] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Gaming-ITX/ac, BIOS P4.80 03/01/2022
[ 8107.849069] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[ 8107.849072] RIP: 0010:smu_dpm_set_power_gate+0x193/0x1b0 [amdgpu]
[ 8107.849248] Code: 85 ed 75 03 48 8b 2f 89 74 24 04 e8 d7 51 48 e7 44 8b 44 24 04 48 89 d9 48 89 ea 48 89 c6 48 c7 c7 18 75 53 c1 e8 fc c6 84 e7 <0f> 0b b8 a1 ff ff ff e9 dd fe ff ff e9 60 45 23 00 e9 5b 45 23 00
[ 8107.849249] RSP: 0018:ffffa64105c1bbf8 EFLAGS: 00010286
[ 8107.849251] RAX: 0000000000000000 RBX: ffffffffc157e9bc RCX: 0000000000000027
[ 8107.849252] RDX: ffff9ae9dece1668 RSI: 0000000000000001 RDI: ffff9ae9dece1660
[ 8107.849252] RBP: ffff9ae2c168a360 R08: 0000000000000000 R09: ffffa64105c1ba80
[ 8107.849253] R10: 0000000000000003 R11: ffff9ae9ff326c28 R12: 0000000000000000
[ 8107.849254] R13: ffff9ae2dd967a48 R14: ffff9ae2dd968bf8 R15: 0000000000000001
[ 8107.849255] FS: 0000000000000000(0000) GS:ffff9ae9decc0000(0000) knlGS:0000000000000000
[ 8107.849256] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8107.849256] CR2: 00007ff9f4005388 CR3: 000000010dd2a000 CR4: 0000000000350ee0
[ 8107.849258] Call Trace:
[ 8107.849259] <TASK>
[ 8107.849261] amdgpu_dpm_set_powergating_by_smu+0x88/0xf0 [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.849449] amdgpu_gfx_off_ctrl+0xcc/0x120 [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.849617] gfx_v10_0_set_powergating_state+0x57/0x210 [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.849781] amdgpu_device_set_pg_state+0x96/0xf0 [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.849937] amdgpu_device_ip_suspend_phase1+0x1a/0xc0 [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.850092] ? drm_sched_increase_karma_ext+0x8c/0xd0 [gpu_sched 5f4d7b5d46a10c192ea66effc4f4f5a90e0d418e]
[ 8107.850095] amdgpu_device_ip_suspend+0x1f/0x70 [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.850250] amdgpu_device_pre_asic_reset+0xc2/0x260 [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.850404] amdgpu_device_gpu_recover_imp.cold+0x445/0x8ea [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.850607] amdgpu_job_timedout+0x18f/0x1c0 [amdgpu db0e1456434381ddf5da2ad47ae2a0e1881887c5]
[ 8107.850789] ? try_to_wake_up+0x23e/0x550
[ 8107.850792] drm_sched_job_timedout+0x7a/0x110 [gpu_sched 5f4d7b5d46a10c192ea66effc4f4f5a90e0d418e]
[ 8107.850796] process_one_work+0x1c7/0x380
[ 8107.850799] worker_thread+0x51/0x390
[ 8107.850800] ? rescuer_thread+0x3b0/0x3b0
[ 8107.850801] kthread+0xde/0x110
[ 8107.850803] ? kthread_complete_and_exit+0x20/0x20
[ 8107.850804] ret_from_fork+0x22/0x30
[ 8107.850807] </TASK>
[ 8107.850808] ---[ end trace 0000000000000000 ]---
[ 8119.981710] amdgpu 0000:0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[ 8119.981885] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
[ 8120.245670] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 8125.103886] amdgpu 0000:0c:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000006 SMN_C2PMSG_82:0x00000000
[ 8125.103888] amdgpu 0000:0c:00.0: amdgpu: Failed to disable smu features.
[ 8125.103891] amdgpu 0000:0c:00.0: amdgpu: Fail to disable dpm features!
[ 8125.103892] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
[ 8125.116303] [drm] free PSP TMR buffer
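The first line of the log above reports two SMU interface versions, and the next line says they do not match. As a rough sketch for comparing them on other setups, the snippet below pulls both values out of a dmesg line. The regular expression only assumes the message wording seen in this log, and the helper name `smu_if_versions` is made up for illustration; whether the mismatch is related to the crash is not established here.

```python
import re

# Matches the amdgpu message format seen in the log above (an assumption:
# other kernel versions may word this line differently).
SMU_IF_RE = re.compile(
    r"smu driver if version = (0x[0-9a-fA-F]+), "
    r"smu fw if version = (0x[0-9a-fA-F]+)"
)


def smu_if_versions(log_text):
    """Return (driver_if, fw_if) as integers, or None if the line is absent."""
    match = SMU_IF_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


line = (
    "[ 8097.633086] amdgpu 0000:0c:00.0: amdgpu: smu driver if version = "
    "0x0000000d, smu fw if version = 0x0000000f, smu fw program = 0, "
    "version = 0x00491b00 (73.27.0)"
)
driver_if, fw_if = smu_if_versions(line)
print(driver_if, fw_if, driver_if == fw_if)  # prints: 13 15 False
```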
I receive the error message: Can not decode PowerPlay table version 19.0