Benzhaomin / corsairpsu

hwmon Linux Kernel driver for the Corsair RMi and HXi series of PSUs
GNU General Public License v2.0

All readings 0 after wake from suspend #2

Closed jgslade closed 3 years ago

jgslade commented 3 years ago

I have an HX750i PSU and am using your driver to monitor its power usage. When I wake my computer from suspend, every reading is 0 until I reload the driver.

corsairpsu-hid-3-3
Adapter: HID adapter
voltage supply:   0.00 V  
voltage 12v:      0.00 V  
voltage 5v:       0.00 V  
voltage 3.3v:     0.00 V  
fan rpm:           0 RPM
temp1:            +0.0°C  
temp2:            +0.0°C  
power total:      0.00 W  
power 12v:        0.00 W  
power 5v:         0.00 W  
power 3.3v:       0.00 W  
current 12v:      0.00 A  
current 5v:       0.00 A  
current 3.3v:     0.00 A 

Then, after reloading the driver, I get normal readings again:

corsairpsu-hid-3-3
Adapter: HID adapter
voltage supply: 115.00 V  
voltage 12v:     12.06 V  
voltage 5v:       5.06 V  
voltage 3.3v:     3.34 V  
fan rpm:           0 RPM
temp1:           +31.2°C  
temp2:           +25.0°C  
power total:     36.00 W  
power 12v:       20.00 W  
power 5v:        10.50 W  
power 3.3v:       7.00 W  
current 12v:      1.75 A  
current 5v:       2.19 A  
current 3.3v:     2.19 A 
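
To tell the two states apart automatically, the `sensors` text can be parsed and checked for the all-zero pattern. This is only a minimal sketch: the helper names `parse_readings` and `looks_stale` are hypothetical, and it assumes the output layout shown in the two listings above. A single zero value proves nothing, since the fan reads 0 RPM even in the healthy listing, so the check only fires when every reading is zero.

```python
import re

# Matches "label: <number>" on a reading line, e.g. "voltage 12v:     12.06 V".
# Lines such as "Adapter: HID adapter" have no number after the colon and are skipped.
READING = re.compile(r"^(?P<label>[^:]+):\s*\+?(?P<value>-?\d+(?:\.\d+)?)")


def parse_readings(sensors_text):
    """Return {label: value} for every numeric reading in the sensors output."""
    readings = {}
    for line in sensors_text.splitlines():
        match = READING.match(line.strip())
        if match:
            readings[match.group("label").strip()] = float(match.group("value"))
    return readings


def looks_stale(sensors_text):
    """True when at least one reading was parsed and all of them are zero."""
    readings = parse_readings(sensors_text)
    return bool(readings) and all(value == 0.0 for value in readings.values())
```

A wrapper could feed this the output of `sensors` after resume and reload the module when it returns True, which mirrors the manual workaround described above.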

I am running on Manjaro with the AUR package: https://duckduckgo.com/?t=ffab&q=corsairpsu+0+readings+after+wake+from+suspend&ia=web

Benzhaomin commented 3 years ago

Hi, it might be an easy fix but I can't find the time to look into it right now. I'll get to it at some point though.

If anybody feels like having a crack at it that'd be great too.

JackDoan commented 3 years ago

@jgslade, would you mind testing my branch in #4 and seeing if it fixes your issue?

jgslade commented 3 years ago

I'll give it a try tomorrow when I can get to the computer.

jgslade commented 3 years ago

@JackDoan I rebuilt using your repo and I am still showing zeros after wake from sleep.

corsairpsu-hid-3-3
Adapter: HID adapter
voltage supply:   0.00 V  
voltage 12v:      0.00 V  
voltage 5v:       0.00 V  
voltage 3.3v:     0.00 V  
fan rpm:           0 RPM
temp1:            +0.0°C  
temp2:            +0.0°C  
power total:      0.00 W  
power 12v:        0.00 W  
power 5v:         0.00 W  
power 3.3v:       0.00 W  
current 12v:      0.00 A  
current 5v:       0.00 A  
current 3.3v:     0.00 A 

Checking dmesg, these are the only messages I get for the corsairpsu driver, and they look like they are just from the initial boot.

[    6.794665] corsairpsu 0003:1B1C:1C05.0003: hidraw2: USB HID v1.11 Device [                                                ] on usb-0000:01:00.0-13/input0
[    6.800786] corsairpsu driver ready for HX750i, CORSAIR, HX750i

JackDoan commented 3 years ago

Most untriumphant.

Could I trouble you to run this python script while sensors is reporting all zeros? The output will help me figure out what to trigger a re-handshake on.

#!/usr/bin/env python3
# greetz to Benzhaomin's pyrmi, which I abused to create this
import usb.core
import usb.util

def is_corsair_rmi_hxi_psu(device):
    return device.idVendor == 0x1b1c and device.idProduct in [
        0x1c0a,  # RM650i
        0x1c0b,  # RM750i
        0x1c0c,  # RM850i
        0x1c0d,  # RM1000i
        0x1c04,  # HX650i
        0x1c05,  # HX750i
        0x1c06,  # HX850i
        0x1c07,  # HX1000i
        0x1c08,  # HX1200i
    ]

dev = usb.core.find(custom_match=is_corsair_rmi_hxi_psu)
if dev is None:
    raise ValueError('No Corsair RMi/HXi Series PSU found')

# grab the device from the kernel's claws
ifaceid = 0
if dev.is_kernel_driver_active(ifaceid):
    dev.detach_kernel_driver(ifaceid)
    usb.util.claim_interface(dev, ifaceid)

try:
    cfg = dev.get_active_configuration()
    (reader, writer) = cfg[(0,0)].endpoints()

    def write(data):
        padding = [0x0]*(64 - len(data))
        writer.write(data + padding, timeout=100)

    def read():
        data = reader.read(64, timeout=100)
        print(f'hdr {data[0]:02x} {data[1]:02x}')
        return bytearray(data)[2:]

    print('skipping hello')
    cmds = [[0x03, 0x8d], [0x03, 0x88], [0x03, 0x8c]]
    for cmd in cmds:
        print(f'raw cmd {cmd[0]:02x} {cmd[1]:02x}')
        # send user-provided length+opcode
        write(list(cmd))

        # get data back and print it in various encodings
        data = read()
        print('raw', data)
        try:
            print('dec', int.from_bytes(data, byteorder='little'))
        except Exception as e:
            print('dec failed', e)

        try:
            tmp = int.from_bytes(data, byteorder='little')
            exp = tmp >> 11
            fra = tmp & 0x7ff
            if fra > 1023:
                fra = fra - 2048
            if exp > 15:
                exp = exp - 32
            if exp > 15:
                raise ValueError('big number')
            print('lin', fra * 2**exp)
        except Exception as e:
            print('lin failed', e)
finally:
    usb.util.release_interface(dev, ifaceid)
    dev.attach_kernel_driver(ifaceid)

I'd do it myself, but I can't seem to get my supply to exhibit the same behavior after sleeping. Perhaps my motherboard doesn't turn the USB ports off? I was also not able to reproduce by re-plugging the USB port.

I'm not trying to call you crazy, I'm just trying to narrow down what causes this apparent reset of the PSU's communication interface. I know it also happens on a cold boot. When you say 'sleep', are you referring to suspend-to-ram (what Windows calls 'standby') or suspend-to-disk (what Windows calls 'hibernate')?
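
For readers following along, the "lin" branch of the script splits a word into a 5-bit signed exponent and an 11-bit signed mantissa. That layout matches what PMBus calls LINEAR11, although the thread never names the format, so treat that identification as an assumption. The sketch below isolates the decode step. Unlike the script, which feeds the whole payload to `int.from_bytes`, the hypothetical helper `decode_linear11` looks only at the first two payload bytes, on the assumption that the value lives there.

```python
def decode_linear11(raw: bytes) -> float:
    """Decode a little-endian 16-bit word as mantissa * 2**exponent.

    Bits 15..11 hold a signed 5-bit exponent, bits 10..0 a signed 11-bit
    mantissa (two's complement in both fields).
    """
    if len(raw) < 2:
        raise ValueError("need at least two bytes")
    word = int.from_bytes(raw[:2], byteorder="little")
    exponent = word >> 11
    mantissa = word & 0x7FF
    if mantissa > 1023:   # sign-extend the 11-bit mantissa
        mantissa -= 2048
    if exponent > 15:     # sign-extend the 5-bit exponent
        exponent -= 32
    return mantissa * 2.0 ** exponent


# Exponent -4 (0b11100) with mantissa 193 gives 193 / 16.
print(decode_linear11(bytes([0xC1, 0xE0])))  # 12.0625
```

An all-zero payload decodes to 0.0, so a zero "lin" value on its own does not distinguish a real zero reading from a device that is not answering properly.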

jgslade commented 3 years ago

Suspend to RAM. I haven't tried suspend to disk; it doesn't really save any time over just shutting down, since KDE saves the state of my system on shutdown anyway.

This is the output from that script:

skipping hello
raw cmd 03 8d
hdr 03 fe
raw bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
dec 0
lin 0
raw cmd 03 88
hdr 03 fe
raw bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
dec 0
lin 0
raw cmd 03 8c
hdr 03 fe
raw bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
dec 0
lin 0

JackDoan commented 3 years ago

Caught the gremlin. I was comparing the rx'd buffer and the opcode without masking the buffer. I think the fix in PR #4 should work now. Would you mind giving it a shot?
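
Why masking matters can be sketched in a few lines. The assumption here, which the thread does not confirm, is that the driver's receive buffer holds signed bytes, so an echoed opcode such as 0x8d reads back as a negative number and never equals the unsigned opcode. The helpers `as_signed_byte` and `reply_matches` are hypothetical names for illustration only.

```python
import ctypes


def as_signed_byte(value: int) -> int:
    """Reinterpret the low 8 bits of value as a signed 8-bit integer."""
    return ctypes.c_int8(value).value


def reply_matches(opcode: int, reply_byte: int) -> bool:
    """Compare the echoed byte and the opcode after masking both to 8 bits."""
    return (reply_byte & 0xFF) == (opcode & 0xFF)


opcode = 0x8D
echoed = as_signed_byte(0x8D)          # -115 when stored in a signed byte

print(echoed == opcode)                # False: the unmasked compare never matches
print(reply_matches(opcode, echoed))   # True: the masked compare sees the echo
print(reply_matches(opcode, 0xFE))     # False: the 0xfe seen in the script output
```

With an unmasked compare, good replies and bad replies look the same, so the driver cannot use a mismatch as the signal to redo the handshake. After masking, only a reply such as the `fe` header byte in the script output counts as a mismatch.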

jgslade commented 3 years ago

Still showing all zeros. Just to make sure I am getting the right build, I have been using this repo to clone https://github.com/JackDoan/corsairpsu.git

JackDoan commented 3 years ago

That's the right repo. Can you confirm which branch you're building from? The fix is on redo-handshake-on-err.

jgslade commented 3 years ago

I just cloned that repository and built it. I am a git novice, so I didn't know how to switch branches, or that I would need to. Let me see what I can figure out.

Okay, I ran git branch and was on master, so I ran git checkout redo-handshake-on-err, then another git pull, and was up to date. I then uninstalled corsairpsu and rebuilt it, and I am still getting zeros after sleep.

JackDoan commented 3 years ago

Could you try this procedure?

  1. git clone https://github.com/JackDoan/corsairpsu.git
  2. git fetch <-- this copies remote branches to your machine. Without it, you'll make a new branch that happens to have the same name as the remote one, but it'll be different. :(
  3. git checkout redo-handshake-on-err

Then to make sure it worked,

  1. Line 165 of your corsairpsu.c should be if( (data->buf[1] & 0xff) != (opcode & 0xff)) {
  2. Just to make sure DKMS didn't get all screwy on you, try rebuilding & testing with make && sudo rmmod corsairpsu; sudo insmod corsairpsu.ko

jgslade commented 3 years ago

That appears to have done the trick. I then tried to build the DKMS version and it was all zeros after sleep.

JackDoan commented 3 years ago

So it worked when you used make && sudo rmmod corsairpsu; sudo insmod corsairpsu.ko, but didn't when you used dkms?

make && sudo make dkms-uninstall && sudo make dkms-install && sudo rmmod corsairpsu && sudo modprobe corsairpsu should probably do the trick I think.

jgslade commented 3 years ago

That is correct.

make && sudo make dkms-uninstall && sudo make dkms-install && sudo rmmod corsairpsu && sudo modprobe corsairpsu did not work however. It errors with sudo make dkms-uninstall and if I then run sudo make dkms-install && sudo rmmod corsairpsu && sudo modprobe corsairpsu I get the zeros again.

 make && sudo make dkms-uninstall && sudo make dkms-install && sudo rmmod corsairpsu && sudo modprobe corsairpsu
make[1]: Entering directory '/usr/lib/modules/5.9.11-3-MANJARO/build'
make[1]: Leaving directory '/usr/lib/modules/5.9.11-3-MANJARO/build'
dkms remove corsairpsu/0.0.1 --all
Error! The module/version combo: corsairpsu-0.0.1
is not located in the DKMS tree.
make: *** [Makefile:47: dkms-uninstall] Error 3
sudo make dkms-install && sudo rmmod corsairpsu && sudo modprobe corsairpsu
mkdir /usr/src/corsairpsu-0.0.1
cp /home/sam/tmp/corsairpsu/tmp/corsairpsu/dkms.conf /usr/src/corsairpsu-0.0.1
cp /home/sam/tmp/corsairpsu/tmp/corsairpsu/Makefile /usr/src/corsairpsu-0.0.1
cp /home/sam/tmp/corsairpsu/tmp/corsairpsu/corsairpsu.c /usr/src/corsairpsu-0.0.1
sed -e "s/@CFLGS@//" \
    -e "s/@VERSION@/0.0.1/" \
    -i /usr/src/corsairpsu-0.0.1/dkms.conf
dkms add corsairpsu/0.0.1

Creating symlink /var/lib/dkms/corsairpsu/0.0.1/source ->
                 /usr/src/corsairpsu-0.0.1

DKMS: add completed.
dkms build corsairpsu/0.0.1

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...
make -j16 KERNELRELEASE=5.9.11-3-MANJARO TARGET=5.9.11-3-MANJARO CFLAGS_MODULE+=...
cleaning build area...
Kernel cleanup unnecessary for this kernel.  Skipping...

DKMS: build completed.
dkms install corsairpsu/0.0.1

corsairpsu.ko.xz:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /usr/lib/modules/5.9.11-3-MANJARO/kernel/drivers/hwmon/corsairpsu/

depmod.....

DKMS: install completed.
rmmod: ERROR: Module corsairpsu is not currently loaded
sudo modprobe corsairpsu

JackDoan commented 3 years ago

Do you happen to have the corsairpsu-dkms-git package installed from the AUR? I'd expect to see evidence of it in your output above (and I don't), but that's the next most likely thing that could be messing things up.

Does your running kernel (as reported by uname -r) match your installed kernel (pacman -Q linux)?

Finally, it's possible that you have another corsairpsu.ko module somewhere in your system that modprobe is picking up first. find /lib64/modules -name corsairpsu.ko* should probably list everything that modprobe could find.
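
One caveat with an unquoted pattern such as corsairpsu.ko*: if the current directory contains a matching file (a freshly built corsairpsu.ko, for example), the shell may expand the pattern before find sees it, and compressed copies such as corsairpsu.ko.xz would then be missed. A minimal sketch that sidesteps shell globbing is below. The function name `find_module_files` is hypothetical, and the `/usr/lib/modules` root is taken from the DKMS install path printed earlier in the thread.

```python
from pathlib import Path
import platform


def find_module_files(root, name="corsairpsu"):
    """Return every file under root named <name>.ko, <name>.ko.xz, and so on."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(path for path in root.rglob(f"{name}.ko*") if path.is_file())


if __name__ == "__main__":
    # Only look at the tree for the kernel that is currently running.
    running = Path("/usr/lib/modules") / platform.release()
    for path in find_module_files(running):
        print(path)
```

More than one hit for the running kernel would suggest that modprobe might be loading a different copy than the one just built.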

jgslade commented 3 years ago

I uninstalled the one from AUR before switching branches.

uname -r 
5.9.11-3-MANJARO
pacman -Q linux
linux58 5.8.18-1
pacman -Q linux59
linux59 5.9.11-3

The find command doesn't return anything.

JackDoan commented 3 years ago

Idk mate, I use arch, not manjaro. If it works when you manually insmod & run the correct version, that probably means it's some kind of wonky configuration issue unfortunately.

You could try adding a printk in the 'good' version, so you could potentially tell it apart from the 'bad' version.

Benzhaomin commented 3 years ago

Fixed in master now, thanks to both of you guys.