electrified / asus-wmi-sensors

Linux HWMON (lmsensors) sensors driver for various ASUS Ryzen and Threadripper motherboards
GNU General Public License v2.0

ROG STRIX X470-F + ArchLinux = system hangs #45

Closed laenco closed 4 years ago

laenco commented 4 years ago

Hello.

I have successfully compiled the module for a series of CK-patched kernels, from 5.3.something to the current one:

modinfo asus-wmi-sensors
filename:       /lib/modules/5.4.2-1-ck/extramodules/kernel/drivers/hwmon/asus-wmi-sensors.ko.xz
version:        3
license:        GPL
description:    Asus WMI Sensors Driver
author:         Ed Brindley <kernel@maidavale.org>
srcversion:     19FC587EDDC1940E9AD8A4B
depends:        wmi
retpoline:      Y
name:           asus_wmi_sensors
vermagic:       5.4.2-1-ck SMP preempt mod_unload

But I'm afraid to load it: the system hangs completely after some hours, anywhere from a couple to 5-6, depending on how much the module is used. I had to reset with the button. I didn't find any useful info in the logs. Does anyone have ideas about what could be the source of the problem?

System:    Host: <filter> Kernel: 5.4.2-1-ck x86_64 bits: 64 compiler: gcc v: 9.2.0 Console: tty 4 Distro: Arch Linux
Machine:   Type: Desktop Mobo: ASUSTeK model: ROG STRIX X470-F GAMING v: Rev X.0x serial: <filter> UEFI: American Megatrends
v: 5406 date: 11/13/2019
CPU:       Topology: 8-Core model: AMD Ryzen 7 2700X bits: 64 type: MT MCP arch: Zen+ rev: 2 L2 cache: 4096 KiB

Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
vendor: Micro-Star MSI driver: amdgpu v: kernel bus ID: 09:00.0
Device-2: NVIDIA GP104 [GeForce GTX 1080] driver: vfio-pci v: 0.2 bus ID: 0a:00.0
Display: server: X.Org 1.20.6 driver: amdgpu resolution: 1920x1080~60Hz
OpenGL: renderer: Radeon RX 570 Series (POLARIS10 DRM 3.35.0 5.4.2-1-ck LLVM 9.0.0) v: 4.5 Mesa 19.2.7
direct render: Yes
Audio:     Device-1: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] vendor: Micro-Star MSI driver: snd_hda_intel
v: kernel bus ID: 09:00.1
Device-2: NVIDIA GP104 High Definition Audio driver: vfio-pci v: 0.2 bus ID: 0a:00.1
Device-3: Advanced Micro Devices [AMD] Family 17h HD Audio vendor: ASUSTeK driver: snd_hda_intel v: kernel
bus ID: 0c:00.3
Device-4: Microdia Camera type: USB driver: snd-usb-audio,uvcvideo bus ID: 5-1.4:6
Device-5: JMTek LLC. USB PnP Audio Device type: USB driver: hid-generic,snd-usb-audio,usbhid bus ID: 5-2:3
Sound Server: ALSA v: k5.4.2-1-ck

Network:   Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: 5.6.0-k port: e000 bus ID: 07:00.0
IF: enp7s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
IF-ID-1: br0 state: up speed: N/A duplex: N/A mac: <filter>

Drives:    Local Storage: total: 5.55 TiB used: 660.70 GiB (11.6%)
ID-1: /dev/nvme0n1 vendor: Ricoh model: R5MP960G8 size: 894.25 GiB temp: 36 C
ID-2: /dev/sda vendor: A-Data model: SU900 size: 953.87 GiB temp: 26 C
ID-3: /dev/sdb vendor: Kingston model: SV300S37A120G size: 111.79 GiB temp: 24 C
ID-4: /dev/sdc vendor: Western Digital model: WD10EZEX-08WN4A0 size: 931.51 GiB temp: 30 C
ID-5: /dev/sdd vendor: Western Digital model: WD10EZEX-08WN4A0 size: 931.51 GiB temp: 32 C
ID-6: /dev/sde vendor: HGST (Hitachi) model: HTE721010A9E630 size: 931.51 GiB temp: 25 C
ID-7: /dev/sdf vendor: HGST (Hitachi) model: HTS721010A9E630 size: 931.51 GiB temp: 27 C

RAID:      Device-1: md0 type: mdraid status: active Components: online: sde1~c1 sdf1~c0 sdd1~c3 sdc1~c2

electrified commented 4 years ago

Hi there.

What monitoring software do you have, and what polling interval are you using? I will see if I can recreate the issue.

My setup is fairly similar to yours, minus the 1080 and the CK kernel.

System:    Host: zoomer Kernel: 5.4.2-arch1-1 x86_64 bits: 64 compiler: gcc v: 9.2.0 Desktop: KDE Plasma 5.17.4 
           Distro: Arch Linux 
Machine:   Type: Desktop Mobo: ASUSTeK model: ROG CROSSHAIR VII HERO (WI-FI) v: Rev 1.xx serial: <filter> 
           UEFI: American Megatrends v: 2606 date: 08/08/2019 
CPU:       Topology: 8-Core model: AMD Ryzen 7 2700X bits: 64 type: MT MCP arch: Zen+ rev: 2 L2 cache: 4096 KiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 118435 
           Speed: 3443 MHz min/max: 2200/3700 MHz Core speeds (MHz): 1: 3484 2: 4021 3: 4150 4: 2768 5: 2137 6: 4151 7: 4158 
           8: 4139 9: 4150 10: 2778 11: 2284 12: 4149 13: 4152 14: 4163 15: 4142 16: 4152 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] vendor: XFX Pine 
           driver: amdgpu v: kernel bus ID: 0a:00.0 
           Display: x11 server: X.Org 1.20.6 driver: vesa unloaded: modesetting resolution: 3840x2160~60Hz 
           OpenGL: renderer: AMD Radeon RX 480 Graphics (POLARIS10 DRM 3.35.0 5.4.2-arch1-1 LLVM 9.0.0) v: 4.5 Mesa 19.2.7 
           direct render: Yes 
Audio:     Device-1: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] vendor: XFX Pine driver: snd_hda_intel 
           v: kernel bus ID: 0a:00.1 
           Device-2: Advanced Micro Devices [AMD] Family 17h HD Audio vendor: ASUSTeK driver: snd_hda_intel v: kernel 
           bus ID: 0c:00.3 
           Device-3: Focusrite-Novation Scarlett 18i8 2nd Gen type: USB driver: snd-usb-audio bus ID: 5-4:3 
           Sound Server: ALSA v: k5.4.2-arch1-1 
Network:   Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: 5.6.0-k port: c000 bus ID: 06:00.0 
           IF: eth0 state: up speed: 100 Mbps duplex: full mac: <filter> 
           Device-2: Realtek RTL8822BE 802.11a/b/g/n/ac WiFi adapter vendor: ASUSTeK driver: rtw_pci v: N/A port: b000 
           bus ID: 07:00.0 
           IF: wlan0 state: up mac: <filter> 
           Device-3: Aquantia AQC107 NBase-T/IEEE 802.3bz Ethernet [AQtion] vendor: ASUSTeK driver: atlantic 
           v: 5.4.2-arch1-1-kern port: b000 bus ID: 08:00.0 
           IF: eth1 state: down mac: <filter> 
           IF-ID-1: br-2cc486d9af16 state: down mac: <filter> 
           IF-ID-2: br-427608f70987 state: down mac: <filter> 
           IF-ID-3: br-eccb7555eb06 state: down mac: <filter> 
           IF-ID-4: docker0 state: down mac: <filter> 
           IF-ID-5: virbr0 state: down mac: <filter> 
           IF-ID-6: virbr0-nic state: down mac: <filter> 
Drives:    Local Storage: total: 4.11 TiB used: 1.17 TiB (28.5%) 
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 950 PRO 512GB size: 476.94 GiB 
           ID-2: /dev/sda vendor: Seagate model: ST3000DM007-1WY10G size: 2.73 TiB 
           ID-3: /dev/sdb vendor: Crucial model: CT1000MX500SSD1 size: 931.51 GiB 
           ID-4: /dev/sdc type: USB vendor: Generic model: Flash Disk size: 962.0 MiB 
Partition: ID-1: / size: 54.01 GiB used: 41.32 GiB (76.5%) fs: ext4 dev: /dev/dm-0 
           ID-2: /boot size: 983.7 MiB used: 145.0 MiB (14.7%) fs: ext4 dev: /dev/nvme0n1p5 
           ID-3: /home size: 225.96 GiB used: 137.54 GiB (60.9%) fs: ext4 dev: /dev/dm-1 
Sensors:   System Temperatures: cpu: 39.1 C mobo: 27.0 C gpu: amdgpu temp: 28 C 
           Fan Speeds (RPM): cpu: 520 case-1: 0 case-2: 821 case-3: 795 
           Voltages: 12v: 11.83 5v: N/A 3.3v: N/A vbat: 3.21 
Info:      Processes: 393 Uptime: 4h 44m Memory: 31.33 GiB used: 5.48 GiB (17.5%) Init: systemd Compilers: gcc: 9.2.0 
           clang: 9.0.0 Shell: zsh v: 5.7.1 inxi: 3.0.37

laenco commented 4 years ago

What monitoring software do you have, and what polling interval are you using?

Just lm_sensors with a 1-second watch is enough for me: watch -n1 sensors

Last time I called sensors once or twice, without watch, and got my longest stable run so far, 5-6 hours. I'll try again this weekend and post results/logs here.
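
Since the hang leaves nothing useful in the logs, one way to narrow it down is to replace `watch -n1 sensors` with a poller that records every read on disk before starting the next one. The following is only a minimal sketch under stated assumptions: it reads the standard hwmon sysfs `*_input` attributes directly rather than going through `sensors`, and the function names and log file name are hypothetical, not part of this driver or of lm_sensors.

```python
import glob
import os
import time


def read_hwmon_inputs(root="/sys/class/hwmon"):
    """Return {path: raw value string} for every readable *_input attribute."""
    readings = {}
    for path in sorted(glob.glob(os.path.join(root, "hwmon*", "*_input"))):
        try:
            with open(path) as f:
                readings[path] = f.read().strip()
        except OSError:
            # Some attributes fail to read (e.g. sensor not present); skip them.
            continue
    return readings


def poll(log_path, interval=1.0, iterations=None, root="/sys/class/hwmon"):
    """Poll all sensors repeatedly, appending one fsync'ed log line per pass.

    iterations=None means poll until interrupted. Returns the number of passes.
    """
    count = 0
    with open(log_path, "a") as log:
        while iterations is None or count < iterations:
            start = time.monotonic()
            readings = read_hwmon_inputs(root)
            elapsed = time.monotonic() - start
            stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
            log.write(f"{stamp} sensors={len(readings)} read_s={elapsed:.3f}\n")
            # Force the line to disk so it is still there after a hard reset.
            log.flush()
            os.fsync(log.fileno())
            count += 1
            if iterations is None or count < iterations:
                time.sleep(interval)
    return count
```

Calling `poll("hwmon-poll.log", interval=1.0)` approximates the 1-second polling described above. After a hang and reset, the last line in the log gives the time of the final completed read, and a rising `read_s` value before that point would suggest the sensor reads themselves were slowing down.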

laenco commented 4 years ago

I got a somewhat unexpected CPU upgrade this weekend, and it looks like the Ryzen 3000 series doesn't have this problem: 2 days of uptime with the module loaded. Maybe the problem was specific to that CPU. I'm closing the issue because I'm unable to reproduce it. Thanks)