marazmista / radeon-profile

Application to read current clocks of ATi Radeon cards (xf86-video-ati, xf86-video-amdgpu)
GNU General Public License v2.0

kernel NULL pointer dereference with 5700 XT, mesa-git, and linux-amd-staging-drm-next-git #147

Open dllu opened 5 years ago

dllu commented 5 years ago

What I did

What happens

[ 1897.505313] BUG: kernel NULL pointer dereference, address: 00000000000000a8
[ 1897.505316] #PF: supervisor read access in kernel mode
[ 1897.505318] #PF: error_code(0x0000) - not-present page
[ 1897.505319] PGD 0 P4D 0 
[ 1897.505322] Oops: 0000 [#4] PREEMPT SMP NOPTI
[ 1897.505325] CPU: 23 PID: 11708 Comm: radeon-profile Tainted: G      D    O      5.2.0-rc1-amd-staging-drm-next-git-b8cd95e15410+ #1
[ 1897.505326] Hardware name: System manufacturer System Product Name/PRIME X570-P, BIOS 0807 07/08/2019
[ 1897.505390] RIP: 0010:amdgpu_get_dpm_state+0x43/0xa0 [amdgpu]
[ 1897.505392] Code: 00 84 c0 74 1e 48 8b 83 20 65 00 00 48 8b 40 60 48 85 c0 74 0e 48 8d bb 08 65 00 00 e8 46 7c ca ea eb 1f 48 8b 83 f8 64 00 00 <48> 8b 80 a8 00 00 00 48 85 c0 74 46 48 8b bb f0 64 00 00 e8 25 7c
[ 1897.505393] RSP: 0018:ffffaa5560b07e28 EFLAGS: 00010246
[ 1897.505395] RAX: 0000000000000000 RBX: ffff936a48be0000 RCX: 0000000000000000
[ 1897.505396] RDX: ffff9368e104f000 RSI: ffffffffc2115360 RDI: ffff936a48be0000
[ 1897.505397] RBP: ffff9368e104f000 R08: ffff9368e104f001 R09: ffff936a56e470b0
[ 1897.505398] R10: 0000000000000000 R11: 0000000000000000 R12: ffff93699c5a7a00
[ 1897.505399] R13: ffff93699cc3a780 R14: ffff93699cc3a7a8 R15: ffff93699cc3a7c0
[ 1897.505401] FS:  00007efbfab74800(0000) GS:ffff936a5edc0000(0000) knlGS:0000000000000000
[ 1897.505402] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1897.505403] CR2: 00000000000000a8 CR3: 0000000f4fb2a000 CR4: 0000000000340ee0
[ 1897.505405] Call Trace:
[ 1897.505412]  dev_attr_show+0x19/0x40
[ 1897.505416]  sysfs_kf_seq_show+0x9b/0xf0
[ 1897.505419]  seq_read+0xcd/0x400
[ 1897.505422]  vfs_read+0x9d/0x150
[ 1897.505425]  ksys_read+0x5f/0xe0
[ 1897.505428]  do_syscall_64+0x4e/0x120
[ 1897.505432]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1897.505434] RIP: 0033:0x7efbfe04ca6c
[ 1897.505435] Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 59 fc ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 c7 48 89 44 24 08 e8 8f fc ff ff 48
[ 1897.505437] RSP: 002b:00007ffbffffab80 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 1897.505438] RAX: ffffffffffffffda RBX: 0000565557d07660 RCX: 00007efbfe04ca6c
[ 1897.505439] RDX: 0000000000004000 RSI: 0000565557d12018 RDI: 000000000000000f
[ 1897.505440] RBP: 0000000000004000 R08: 0000000000000000 R09: 0000000000000018
[ 1897.505441] R10: 0000000000004000 R11: 0000000000000246 R12: 0000000000000000
[ 1897.505442] R13: 0000000000004000 R14: 000000000000000f R15: 0000565557d12018
[ 1897.505444] Modules linked in: edac_mce_amd xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter bpfilter xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay cfg80211 8021q mrp nct6775(O) input_leds hwmon_vid mousedev sch_fq_codel nls_iso8859_1 nls_cp437 vfat fat amdgpu kvm irqbypass snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi amd_iommu_v2 gpu_sched i2c_algo_bit snd_hda_intel ttm snd_hda_codec drm_kms_helper eeepc_wmi asus_wmi snd_hda_core sparse_keymap rfkill crct10dif_pclmul crc32_pclmul snd_hwdep led_class ghash_clmulni_intel video evdev mac_hid wmi_bmof drm snd_pcm aesni_intel ccp aes_x86_64 crypto_simd cryptd agpgart r8169 snd_timer glue_helper syscopyarea rng_core sp5100_tco sysfillrect snd sysimgblt realtek fb_sys_fops i2c_piix4 zenpower(O) libphy pcspkr soundcore wmi button pcc_cpufreq acpi_cpufreq usbip_host usbip_core ip_tables x_tables ext4 crc32c_generic crc16
[ 1897.505480]  mbcache jbd2 hid_generic usbhid hid ahci xhci_pci crc32c_intel libahci xhci_hcd libata usbcore nvme scsi_mod usb_common nvme_core
[ 1897.505489] CR2: 00000000000000a8
[ 1897.505491] ---[ end trace fe0b58eeea8f0375 ]---
[ 1897.505550] RIP: 0010:amdgpu_get_dpm_state+0x43/0xa0 [amdgpu]
[ 1897.505552] Code: 00 84 c0 74 1e 48 8b 83 20 65 00 00 48 8b 40 60 48 85 c0 74 0e 48 8d bb 08 65 00 00 e8 46 7c ca ea eb 1f 48 8b 83 f8 64 00 00 <48> 8b 80 a8 00 00 00 48 85 c0 74 46 48 8b bb f0 64 00 00 e8 25 7c
[ 1897.505553] RSP: 0018:ffffaa5560b47e28 EFLAGS: 00010246
[ 1897.505554] RAX: 0000000000000000 RBX: ffff936a48be0000 RCX: 0000000000000000
[ 1897.505555] RDX: ffff936a4467f000 RSI: ffffffffc2115360 RDI: ffff936a48be0000
[ 1897.505556] RBP: ffff936a4467f000 R08: ffff936a4467f001 R09: ffff936a56e470b0
[ 1897.505557] R10: 0000000000000000 R11: 0000000000000000 R12: ffff936a0ae88700
[ 1897.505558] R13: ffff936a0af53180 R14: ffff936a0af531a8 R15: ffff936a0af531c0
[ 1897.505560] FS:  00007efbfab74800(0000) GS:ffff936a5edc0000(0000) knlGS:0000000000000000
[ 1897.505561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1897.505562] CR2: 00000000000000a8 CR3: 0000000f4fb2a000 CR4: 0000000000340ee0
iBoMbY commented 4 years ago

Same.

System: Ubuntu 18.04.3
Driver: AMDGPU 19.30-855429 from the AMD page
radeon-profile: built from master

[ 1747.571494] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
[ 1747.571497] #PF error: [normal kernel read fault]
[ 1747.571499] PGD 0 P4D 0 
[ 1747.571502] Oops: 0000 [#4] SMP NOPTI
[ 1747.571505] CPU: 4 PID: 10587 Comm: radeon-profile Tainted: G      D    OE     5.0.0-29-generic #31~18.04.1-Ubuntu
[ 1747.571507] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F5a 09/09/2019
[ 1747.571567] RIP: 0010:amdgpu_get_dpm_state+0x4b/0xb0 [amdgpu]
[ 1747.571569] Code: 00 84 c0 74 1e 48 8b 83 90 57 00 00 48 8b 40 60 48 85 c0 74 0e 48 8d bb 78 57 00 00 e8 be 6e 41 fc eb 1f 48 8b 83 68 57 00 00 <48> 8b 80 a8 00 00 00 48 85 c0 74 48 48 8b bb 60 57 00 00 e8 9d 6e
[ 1747.571571] RSP: 0018:ffffb0ba836dbd80 EFLAGS: 00010246
[ 1747.571573] RAX: 0000000000000000 RBX: ffff9687b6080000 RCX: ffff9687b41fcd48
[ 1747.571575] RDX: 0000000000000017 RSI: ffffffffc0b01380 RDI: ffff9687b6080000
[ 1747.571576] RBP: ffffb0ba836dbd90 R08: ffff9687b9c980c0 R09: ffff96874b3800c0
[ 1747.571577] R10: ffff9686fe32d138 R11: 0000000000000000 R12: ffff96876aff3000
[ 1747.571578] R13: ffffb0ba836dbee8 R14: ffff9686fe32d100 R15: ffff9686fea50b00
[ 1747.571580] FS:  00007efe4aa50cc0(0000) GS:ffff9687be700000(0000) knlGS:0000000000000000
[ 1747.571582] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1747.571583] CR2: 00000000000000a8 CR3: 000000071680c000 CR4: 0000000000340ee0
[ 1747.571584] Call Trace:
[ 1747.571590]  dev_attr_show+0x21/0x50
[ 1747.571593]  sysfs_kf_seq_show+0x9f/0x130
[ 1747.571596]  kernfs_seq_show+0x27/0x30
[ 1747.571598]  seq_read+0xda/0x3f0
[ 1747.571601]  kernfs_fop_read+0x137/0x180
[ 1747.571604]  __vfs_read+0x1b/0x40
[ 1747.571606]  vfs_read+0x8e/0x130
[ 1747.571608]  ksys_read+0x5c/0xe0
[ 1747.571611]  __x64_sys_read+0x1a/0x20
[ 1747.571614]  do_syscall_64+0x5a/0x120
[ 1747.571617]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1747.571619] RIP: 0033:0x7efe48305384
[ 1747.571621] Code: 84 00 00 00 00 00 41 54 55 49 89 d4 53 48 89 f5 89 fb 48 83 ec 10 e8 8b fc ff ff 4c 89 e2 41 89 c0 48 89 ee 89 df 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 38 44 89 c7 48 89 44 24 08 e8 c7 fc ff ff 48
[ 1747.571622] RSP: 002b:00007ffe28a84990 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 1747.571624] RAX: ffffffffffffffda RBX: 0000000000000014 RCX: 00007efe48305384
[ 1747.571625] RDX: 0000000000004000 RSI: 000055aec3080098 RDI: 0000000000000014
[ 1747.571626] RBP: 000055aec3080098 R08: 0000000000000000 R09: 0000000000000000
[ 1747.571627] R10: 000055aec2796010 R11: 0000000000000246 R12: 0000000000004000
[ 1747.571628] R13: 0000000000004000 R14: 0000000000000014 R15: 000055aec3080098
[ 1747.571630] Modules linked in: edac_mce_amd kvm_amd kvm irqbypass amdgpu(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_hda_codec crct10dif_pclmul snd_hda_core crc32_pclmul snd_hwdep ghash_clmulni_intel snd_pcm snd_seq_midi snd_seq_midi_event aesni_intel snd_rawmidi amdkcl(OE) aes_x86_64 drm_kms_helper crypto_simd cryptd snd_seq joydev glue_helper input_leds wmi_bmof snd_seq_device drm snd_timer snd fb_sys_fops syscopyarea sysfillrect sysimgblt ccp soundcore mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_corsair hid_generic usbhid hid igb i2c_piix4 ahci nvme i2c_algo_bit libahci dca nvme_core wmi
[ 1747.571659] CR2: 00000000000000a8
[ 1747.571661] ---[ end trace 969fda2bf0f5e798 ]---
[ 1747.571718] RIP: 0010:amdgpu_get_dpm_state+0x4b/0xb0 [amdgpu]
[ 1747.571720] Code: 00 84 c0 74 1e 48 8b 83 90 57 00 00 48 8b 40 60 48 85 c0 74 0e 48 8d bb 78 57 00 00 e8 be 6e 41 fc eb 1f 48 8b 83 68 57 00 00 <48> 8b 80 a8 00 00 00 48 85 c0 74 48 48 8b bb 60 57 00 00 e8 9d 6e
[ 1747.571721] RSP: 0018:ffffb0ba8c4abd80 EFLAGS: 00010246
[ 1747.571723] RAX: 0000000000000000 RBX: ffff9687b6080000 RCX: ffff9687b41fcd48
[ 1747.571724] RDX: 0000000000000017 RSI: ffffffffc0b01380 RDI: ffff9687b6080000
[ 1747.571725] RBP: ffffb0ba8c4abd90 R08: ffff9687b9c980c0 R09: ffff9687ba12ccc0
[ 1747.571726] R10: ffff96876cb37138 R11: 0000000000000000 R12: ffff9686e4071000
[ 1747.571727] R13: ffffb0ba8c4abee8 R14: ffff96876cb37100 R15: ffff968722741000
[ 1747.571729] FS:  00007efe4aa50cc0(0000) GS:ffff9687be700000(0000) knlGS:0000000000000000
[ 1747.571730] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1747.571731] CR2: 00000000000000a8 CR3: 000000071680c000 CR4: 0000000000340ee0
Oxalin commented 4 years ago

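The bug doesn't come from radeon-profile, but from one of the amdgpu driver's components: both traces fault inside amdgpu_get_dpm_state in the [amdgpu] module while a sysfs attribute is being read. Is this still happening with the latest libdrm and kernel? If so, please open a bug upstream.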
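
When filing upstream, it may help to state the faulting symbol and module explicitly. Below is a minimal sketch of pulling them out of a dmesg dump like the ones above. The helper name `faulting_symbol` and the regular expression are illustrative assumptions, not part of radeon-profile or any kernel tooling, and the pattern only covers the kernel-mode `RIP: 0010:symbol+offset/size [module]` layout seen in these two logs.

```python
import re

# Kernel-mode RIP line as it appears in the logs above, e.g.
# "RIP: 0010:amdgpu_get_dpm_state+0x43/0xa0 [amdgpu]".
# The user-mode line ("RIP: 0033:0x7efb...") has no symbol+offset part
# and is intentionally not matched.
RIP_RE = re.compile(
    r"RIP: 0010:(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def faulting_symbol(dmesg_text):
    """Return (symbol, module) from the first kernel-mode RIP line, or None.

    module is None when the symbol belongs to the core kernel image.
    """
    for line in dmesg_text.splitlines():
        match = RIP_RE.search(line)
        if match:
            return match.group("symbol"), match.group("module")
    return None


if __name__ == "__main__":
    sample = "[ 1897.505390] RIP: 0010:amdgpu_get_dpm_state+0x43/0xa0 [amdgpu]"
    print(faulting_symbol(sample))
```

Run against either log in this thread, it should report `amdgpu_get_dpm_state` in `amdgpu`, which is consistent with the fault being in the driver rather than in radeon-profile.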