Open mkeeter opened 11 months ago
I assume this is not obviously reproducible. :\
Poking at the pmbus code now just to see if I find anything obvious.
If you recall, @mkeeter, which actual quantity was reading as NaN? The BMR491 has various voltage and current outputs, but does not have a power output per se (as in something measured in watts), so I want to make sure I'm looking at the right thing.
I realize it's been a minute.
Alright, caught up with Matt on this in chat.
I think this is a bug but not a Gimlet bug. Here's what it looks like is happening.
- `app/gimlet/base.toml` defines one `power` sensor channel on the BMR491.
- `pmbus` defines zero `power` sensor channels.
- `sensors` task allocates space for a `power` sensor whose readings are never delivered.
- `sensors` task uses `f32::NAN` as an initialization value for the `data_value` array, and leaks that at the API boundary.
- `humility sensors` apparently doesn't work on a released image, so the engineers were using `humility readvar`.

Fortunately this means it's a lower severity issue. I'd argue it's still a bug, or possibly three smaller bugs in a trenchcoat. The things this makes me want to go investigate are:
- Why does `humility sensors` not work on at least some released Gimlet images? (guess: it only tries udprpc)

`humility sensors`, for the record, uses the hiffy generic Idol call interface. So it should work on a release image with a dongle attached, and is not expected to work on a release image over IP. To get sensor data over the management network, we'd either need to use a control-plane-oriented service or add to the `gimlet-inspector` for debugging.
This specific problem was fixed with a `net`-friendly backend in https://github.com/oxidecomputer/humility/pull/491, but the baseline issue of sensors not existing / being polled remains!
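To make the NaN leak concrete, here is a minimal sketch of the pattern described earlier in the thread. It is not the actual `sensors` task code: `NanStore`, `CheckedStore`, `NoData`, and `NUM_SENSORS` are hypothetical names, and the `Option`-based variant is only one possible way to keep "never posted" distinct from a real reading.

```rust
// Hypothetical sketch; names and layout are assumptions, not the Hubris code.
const NUM_SENSORS: usize = 3;

/// Why a slot has no usable value.
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum NoData {
    /// Space was allocated for the sensor, but nothing ever posted to it.
    NeverPosted,
}

/// Mimics the reported behavior: NaN doubles as "no reading yet".
pub struct NanStore {
    data_value: [f32; NUM_SENSORS],
}

impl NanStore {
    pub fn new() -> Self {
        // Every slot starts as NaN, including slots no driver will ever fill.
        Self { data_value: [f32::NAN; NUM_SENSORS] }
    }

    pub fn post(&mut self, id: usize, value: f32) {
        self.data_value[id] = value;
    }

    /// The initialization value escapes to the caller unchanged.
    pub fn get(&self, id: usize) -> f32 {
        self.data_value[id]
    }
}

/// Alternative: keep "never posted" as a separate state and report it
/// as an error at the API boundary.
pub struct CheckedStore {
    data_value: [Option<f32>; NUM_SENSORS],
}

impl CheckedStore {
    pub fn new() -> Self {
        Self { data_value: [None; NUM_SENSORS] }
    }

    pub fn post(&mut self, id: usize, value: f32) {
        self.data_value[id] = Some(value);
    }

    pub fn get(&self, id: usize) -> Result<f32, NoData> {
        self.data_value[id].ok_or(NoData::NeverPosted)
    }
}
```

With `NanStore`, a configured-but-never-polled sensor is indistinguishable from a device that genuinely reported NaN. With `CheckedStore`, the same slot returns `Err(NoData::NeverPosted)`, so a missing `power` channel would show up as an explicit error rather than as a strange reading.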
While investigating https://github.com/oxidecomputer/hardware-gimlet/issues/1988, we noticed that the BMR491 power sensor (`112 0x70 power 4 F - 0x67 bmr491 V12_SYS_A2`) is reading `NaN`. That seems weird!