Closed spinler closed 3 years ago
I think with all the one off things we need to do for the OCC, OCC sensors should just be in a standalone dbus-sensors application.
It would need to do things like:
Will just use this for OCC dbus-sensors functionality. If we end up sticking with phosphor-hwmon, we can drop it.
If someone picks this up, there is a bit more research I'll need to do first, so let me know.
@spinler IPS is interested to do this epic :)
I think the first thing is to switch to Entity-manager and Dbus-sensors repos.
But IPS does not have Rainier
machine, so how do we test it?
Also, Could you split this epic into multiple tasks? Thanks :)
We've actually changed positions on this a little bit, as this would still have us stuck hardcoding the 4 byte IDs that that OCC sends down in the poll response to know what the temperature values mean. The new idea is to just have hostboot use PLDM to send down those sensor IDs, and then modify the openpower-occ-control application, which already deals with PLDM calls for OCC active and also access the OCC sysfs, to start reading the sensors and putting them on D-Bus.
The issue is we haven't figured out model that in PLDM yet, though we have hostboot on board with the idea.
The plan now is that hostboot will put the PLDM entity ID and entity instance numbers into the 4 byte OCC ID fields for the temperature sensors, like 0xAAAABBBB.
DIMM sensors: entity ID 66 Core sensors: 135 The container ID for the DIMMs is 0x0 = system. For the cores, it is tentatively the container ID of the parent chip.
The OCC device driver makes the following hwmon sysfs files for each temperature:
tempX_input = temperature value, in degrees C tempX_label = The ID value tempX_fru_type: contains the FRU type field which is needed in the case of DIMMs to say which part of the DIMM the temp is for.
The possible fru types for the core and dimm sensors for P10, from the OCC spec: 0 = processor core 1 = internal mem ctlr 2 = DIMM 3 = mem ctlr + DIMM 7 = PMIC 8 = mem ctlr external sensor
So now, when the OCC becomes active and the hwmon driver is started, openpower-occ-control can
1 = dimm0_intmb_temp 2 = dimm0_dram_temp 7 = dimm0_dram_extmb_temp. 3 = dimm0_pmic_temp 8 = dimm0_extmb_temp c. entity ID 135, instance N = procX_coreN_temp, where X is the processor chip number, known because there is a separate hwmon directory for every processor(occ) and the code already uses that to find the directory in the first place.
There are also power sensors that need to be handled the same way. Those can be identified purely by their fru_type. The Power Sensors section in the OCC spec contains the full list, there's too many to put here. Witherspoon/mihawk extra already have examples of the names to use.
Other notes:
When the OCC isn't active, the sysfs files will not be present. So, before the first boot, no sensors will be on D-Bus. After the system is up with sensors created and it powers off and sysfs disappears, the sensors should remain on D-Bus. The sensor value can be set to NaN. In addition, there is an xyz.openbmc_project.State.Decorator.Availability d-bus interface with a bool Available property. This property can then track when the sensor value is valid, so it would be false when powered off or OCC otherwise not active.
When there is a file I/O reading something in sysfs, that means the device driver returns an error. The error cases are:
@spinler
All implementations are to replace phosphor-hwmon
, right?
Read all of the temps once per second, and put the results in the Value property a. /xyz/openbmc_project/sensors/temperature/dimm0_temp, etc
So, If the Dimm's fru_type value is 2, it is object path is the following:
/xyz/openbmc_project/sensors/temperature/dimm0_dram_temp
Right?
Also, Has this function been implemented on the hostboot
side(use PLDM model)?
For the service name, we need to use xyz.openbmc_project.Hwmon-1043099980.Hwmon1
instead of org.open_power.OCC.Control
?
Sorry, Maybe I am wrong, please correct me if I am wrong, Thanks a lot :)
@lxwinspur Yep, this will replace phosphor-hwmon. You're correct about the dimm0_dram_temp name, and keeping the same org.open_power busname is fine.
Hostboot has this targeted for April, and only on Rainier and probably Witherspoon Tacoma. To test this code before they are done, or if you don't have one of those systems, I think you could either a) update the MRW to change the IPMI_IDs on those OCC sensors to be what you want and build hostboot with it, or b) just make a throwaway internal function that maps the IPMI IDs coming down to the new ID values and call it right after you read the label files.
@spinler
About this issue, Can we start now?
@lxwinspur Sure. There is a commit in progress https://gerrit.openbmc-project.xyz/c/openbmc/openpower-occ-control/+/41176 that may be useful.
@spinler @mzipse
@ChicagoDuan of the IPS Team has worked on it and started coding.
Also, @spinler Could you kindly explain why use openpower-occ-control for OCC temps and power? Because I remembered that we control for OCC temps and power with phosphor-hwmon before, compared with the previous function, what are the benefits? Maybe we missed something.
Thanks a lot :)
If you're asking why we're using these PLDM IDs instead of standard hardcoded phosphor-hwmon config file IDs, it's because we're now supporting 4 different systems in the same flash image. The largest one having something like 500 different sensor values that someone would need to type into both our config files and the MRW XML that hostboot would get the values from. And, the people typing them in would have to make sure they are the same across all the systems since the config files would all be at the same spot based on the FSI locations.
If you're asking why we want them in the OCC app, aside from it totally changing the behavior of phosphor-hwmon just for the OCC, we've seen failures both in the labs and the field of getting read failure error logs because phosphor-hwmon can't quite keep up with the OCC changing states. Moving it into the OCC app allows it to keep better in sync, because the app is already keeping an eye on the OCC state and can start/stop sensor monitoring as it sees the states change.
Additionally, we've gotten complaints from the field about how the OCC sensors all go off of d-bus when the system powers off, instead of just showing not available.
I don't know if it will be useful, but @smccarney is currently writing sensor monitoring code for the VRMs that has much of the same behavior as the OCC monitoring will. Not that code would be literally shared, but he might have some design ideas.
@spinler
Thank you very much for providing us with a lot of useful information, that will be very helpful. Also, @ChicagoDuan is working on it, He will ping you if he has any questions.
Thanks again Matt
Now that the PLDM PDR design for cores is mostly defined, we know that:
So since we know from the fru_type anyway if the temp is for a core or a DIMM, we may need to rearrange the uint32 ID to instead have the DCM instance ID, parent processor instance ID, and core instance ID. This still needs to be run by hostboot.
@spinler
The plan now is that hostboot will put the PLDM entity ID and entity instance numbers into the 4 byte OCC ID fields for the temperature sensors, like 0xAAAABBBB.
We want to know which pldm command we need to monitor? sensorEvent? https://github.com/openbmc/openpower-occ-control/blob/master/pldm.hpp#L66
You won't need monitor any PLDM commands. The label file will contain the value that hostboot told the OCC to use for that sensor. The number in there will contain what you need to know to name the sensor.
That being said, we just had a meeting with the hostboot team today, and they proposed a new numbering scheme. I need to make sure I have it right and then I'll let you know.
@spinler Thanks for your mail. I am working on this task and have already started coding At present, I have a couple of questions that I need to confirm with you:
OCC Sensor Ids
- First byte is type
- Second byte is reserved
- Third+Fourth bytes are instance id
DIMM : D000iiii
- D0 = DIMM Sensor
- iiii = DIMM position relative to the backplane
= Equal to PLDM entity id within PDR
= Equal to Hostboot's ATTR_?? (TBD)
CORE : C000iiii
- C0 = Small Core DIMM Sensor
- iiii = Physical small core position relative to the parent processor chip
= Equal to PLDM entity id within PDR
= Equal to Hostboot's ATTR_CHIP_UNIT_POS
temp_label
file or temp_input
?dimm2
sensor is equal to 47
, should its value be like this? (D0020047)? Note that it is 47
instead of 47000
?temp_fru_type
, my understanding is to distinguish Core
& Dimm
, but for Dimm type
, where do you get it?
> 1 = dimm0_intmb_temp
> 2 = dimm0_dram_temp
> 7 = dimm0_dram_extmb_temp。
> 3 = dimm0_pmic_temp
> 8 = dimm0_extmb_temp
temp_fru_type
equal to 6?FRU Type - 1 byte. Indicates what type of FRU the temperature is for: 0x00: Processor (core temperature) 0x01: Internal Memory Controller 0x02: DIMM 0x03: Memory Controller + DIMM. Thermal sensor is located to cover both DRAM and MC. 0x04: GPU core 0x05: GPU memory 0x06: VRM Vdd 0x07: PMIC 0x08: Memory Controller external sensor
@spinler
temp_input holds the actual temperature values. temp_label is the file with the IDs
We assume that the temperature value of dimm2
is 47000
:
tempX_input = 47000
tempX_lable = D002iiii
Note:
D0 = DIMM Sensor
02 = instance ID
Correct?
Because:
OCC Sensor Ids
- First byte is type - D
- Second byte is reserved
- Third+Fourth bytes are instance id - 02
But I do not understand what the value of iiii
is?
Also, for the tempX_fru_type
:
FRU Type - 1 byte. Indicates what type of FRU the temperature is for:
0x00: Processor (core temperature)
0x01: Internal Memory Controller
0x02: DIMM
0x03: Memory Controller + DIMM. Thermal sensor is located to cover both DRAM and MC.
0x04: GPU core
0x05: GPU memory
0x06: VRM Vdd
0x07: PMIC
0x08: Memory Controller external sensor
I have some doubts here. We judge whether it is DImm
or Core
. It is judged by the first byte of tempX_label
(D or C)
So, if tempX_lable = D002iiii
and tempX_fru_type = 0
, what is the name of the D-Bus sensors?
Thanks a lot
I think I see the confusion because of something I didn't mention. This definition:
OCC Sensor Ids
- First byte is type
- Second byte is reserved
- Third+Fourth bytes are instance id
DIMM : D000iiii
- D0 = DIMM Sensor
- iiii = DIMM position relative to the backplane
= Equal to PLDM entity id within PDR
= Equal to Hostboot's ATTR_?? (TBD)
is for the uint32_t version of the ID, not a string. since that is what hostboot gives to the OCC, and what the OCC actually sends down in the poll response. The device driver then converts that into the string found in the label file.
So the uint32_t value is 0xD0000002, and the label would be "D00000002" (Though @eddiejames can confirm.)
For the other question, the reason the FRU type is embedded in the label is because the IDs are also what the OCC uses to callout FRUs for errors, so when hostboot gets those errors from the OCC it needs to be able to know the FRU type just from the ID since that is all it has, unlike for us where we also have the fru_type field of the poll response.
I think if the fru type embedded in the label field doesn't agree with the fru_type file, that probably implies that the label value is garbage and can't be trusted to build a sensor name out of. Maybe just trace that to the journal if it's a fru type we care about?
I don't think I mentioned this, but the OCC also sends down Vdd VRM temperatures in the poll response, and at this time those sensors do not need to be on d-bus, because the phosphor-regulators app is already handling that.
@spinler @mzipse
So the uint32_t value is 0xD0000002, and the label would be "D00000002" (Though @eddiejames can confirm.)
About this issue, I have pushed a new commit to verify the value of the tempX_label file https://gerrit.openbmc-project.xyz/c/openbmc/openpower-occ-control/+/43734
Could you review it? Thanks a lot :)
Get the temperature of processor and memory: ref: https://gerrit.openbmc-project.xyz/c/openbmc/openpower-occ-control/+/44256/4
@marthabroyles What all power sensor function IDs are valid in P10? All but the GPU ones?
Correct all but the GPU ones are valid.
Regarding the power sensors, they can also be identified by the label files. The values are in microwatts. There are two types:
power15_input:883000000
power15_label:system
<sensor id>_<function id>_<apss channel>
power5_input:6000000
power5_label:0_10_5
Function ID | Sensor Name |
---|---|
1 | p0_mem_power |
2 | p1_mem_power |
3 | p2_mem_power |
4 | p3_mem_power |
5 | p0_power |
6 | p1_power |
7 | p2_power |
8 | p3_power |
9 | p0_cache_power |
10 | p1_cache_power |
11 | p2_cache_power |
12 | p3_cache_power |
13 | io_a_power |
14 | io_b_power |
15 | io_c_power |
16 | fans_a_power |
17 | fans_b_power |
18 | storage_a_power |
19 | storage_b_power |
23 | mem_cache_power |
25 | p0_mem_0_power |
26 | p0_mem_1_power |
27 | p0_mem_2_power |
@marthabroyles @eddiejames Keep me honest.
I should add - the power values are only on one OCC, and not all systems will necessarily have all of them configured, so only a subset can come back in the poll response.
I should add - the power values are only on one OCC, and not all systems will necessarily have all of them configured, so only a subset can come back in the poll response.
Thanks Matt. How can we determine which OCC will have power sensors? Or is it always in the OCC1?
@marthabroyles Can you answer that?
It typically is the first OCC but does not need to be as this is hw dependent. The OCC connected to the APSS which is the chip that these power readings are read from is the OCC that will return the power readings in the poll response, this is the primary (aka controller) OCC. In the OCC poll response data the msb of the first byte of data indicates if this OCC is the "controller" if that bit is set then the power data will be present in that same poll response. I do not know how/if this is stored on the BMC. The BMC needs to know for other reasons which OCC is the "controller".
Thanks. @ChicagoDuan That OCC is referred to as the master in the code, and there is an API to check that.
@ChicagoDuan That OCC is referred to as the master in the code, and there is an API to check that.
Thanks.
Hi @ChicagoDuan, we just had a meeting this morning where it was requested the code put the Vdd VRM temps on D-Bus after all. Could you add that? Up above, it is listed as a 0x06.
Hi @ChicagoDuan, we just had a meeting this morning where it was requested the code put the Vdd VRM temps on D-Bus after all. Could you add that? Up above, it is listed as a 0x06.
Sure. I will add it.
Hi @ChicagoDuan, we just had a meeting this morning where it was requested the code put the Vdd VRM temps on D-Bus after all. Could you add that? Up above, it is listed as a 0x06.
What is the sensorID of VDD VRM temps ?
Sensor ID should not matter since there should only be a single Vdd VRM temp for an OCC
Thanks. I've added Vdd VRM temp to the code.
@spinler @mzipse Can we close this issue?
Merged.
This would have to be done after all of the other dbus-sensors changes we need are done