Benzhaomin / corsairpsu

hwmon Linux Kernel driver for the Corsair RMi and HXi series of PSUs
GNU General Public License v2.0
29 stars 6 forks source link

Bogus readings on concurrent access to hwmon sysfs files (Corsair HX1000i) #1

Closed intelfx closed 4 years ago

intelfx commented 4 years ago

First, thank you for this driver! There's nothing more satisfying than buying a piece of known Windows-only hardware and discovering that somebody has already REd and wrote a driver for that :)

The problem I've discovered is that when you try to read the hwmon sysfs files concurrently from more than one place, they will return completely bogus readings:

# sensors
corsairpsu-hid-3-1
Adapter: HID adapter
voltage supply:   3.33 V
voltage 12v:     48.50 V
voltage 5v:     134.00 V
voltage 3.3v:   106.00 V
fan rpm:           0 RPM
temp1:           +27.0°C
temp2:            +2.0°C
power total:      2.50 W
power 12v:        8.75 W
power 5v:         5.44 W
power 3.3v:     875.00 mW
current 12v:      0.00 A
current 5v:     1000.00 mA
current 3.3v:     2.00 A
<...>

Reproducer is trivial: run watch -n1 sensors and watch -n0.1 sensors in two terminals and look at the first one.

Just in case this is somehow hardware dependent:

I'll be happy to provide any other information you need.

Benzhaomin commented 4 years ago

Thanks for your detailed report. There's a race condition when talking to the PSU through USB. We don't want to block waiting for a mutex so the best solution is to try to lock and return N/A if a USB command is already in flight.

intelfx commented 4 years ago

I've been meaning to write this followup, but something went wrong...

Thanks for the fix, it works. There's a thing though:

We don't want to block waiting for a mutex so the best solution is to try to lock and return N/A if a USB command is already in flight.

There is an unfortunate consequence of this approach. A commonly used s-tui does not like (==crashes) when sensors disappear from under its nose or suddenly return I/O errors, and it does not seem to be possible to prevent it from accessing a sensor even if you hide it in the UI.

I understand this is probably something to be properly fixed in s-tui, but, seeing as any other hwmon drivers apparently don't have such issues, how hard would it be to cache the last read values or something like that to avoid returning EINVALs? Or, rather, why is it that we don't want to block on a mutex inside the sysfs read?

Benzhaomin commented 4 years ago

how hard would it be to cache the last read values or something like that to avoid returning EINVALs?

I'll have a look, should be doable.

Or, rather, why is it that we don't want to block on a mutex inside the sysfs read?

With a short enough refresh rate or lots of concurrent reads we might end-up with calls pilling up on that lock. USB latency is pretty high in that context. I think I first tried that approach and results were all kinds of wonky.