ibm-openbmc / dev

Product Development Project Mgmt and Tracking

FVT1060:Invalid power cap value is set via IPMI. #3639

Open yadlapati opened 3 months ago

yadlapati commented 3 months ago

Description

======================

Invalid power cap value is set via IPMI.

System details

=================

System : bonn008(service/0penBmc0)

BMC fw : fw1060.00-4.33-1060.2410.20240308a (NL1060_033)

Steps to recreate

======================

Step1 : Login to system

Step2: Execute the IPMI command to set the power cap to 600 and then to 500. The minimum and maximum values are 583 and 3982:

Value 600:

bash-4.2$ ipmitool -I lanplus -C 17 -N 3 -p 623 -U ipmi_admin -H bonn008.aus.stglabs.ibm.com dcmi power set_limit limit 600

Password:

    Current Limit State: Power Limit Active

    Exception actions:   Hard Power Off & Log Event to SEL

    Power Limit:         600 Watts

    Correction time:     0 milliseconds

    Sampling period:     0 seconds

Value 600 is set successfully

Step3:

Value 500:

bash-4.2$ ipmitool -I lanplus -C 17 -N 3 -p 623 -U ipmi_admin -H bonn008.aus.stglabs.ibm.com dcmi power set_limit limit 500

Password:

    Current Limit State: Power Limit Active

    Exception actions:   Hard Power Off & Log Event to SEL

    Power Limit:         500 Watts

    Correction time:     0 milliseconds

    Sampling period:     0 seconds

The value 500 is below the minimum value, but the system takes the value as 556.

Actual behaviour: An invalid value (below the minimum power cap value) is accepted with no error response, and a different value is set instead of the requested one.

Expected behaviour: The command should return an error when the requested power cap is below the minimum value.
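
One way to confirm what was actually applied is to compare the requested limit with the `Power Limit:` line of the tool output. The following is a minimal sketch, assuming the output layout shown in the steps above; the function names `extract_power_limit` and `limit_matches` are hypothetical helpers, not part of ipmitool.

```shell
#!/bin/sh
# Hypothetical helper: read ipmitool dcmi output on stdin and print the
# number from the first "Power Limit: N Watts" line.
extract_power_limit() {
    sed -n 's/^[[:space:]]*Power Limit:[[:space:]]*\([0-9][0-9]*\) Watts.*$/\1/p' | head -n 1
}

# Hypothetical helper: succeed only if the reported limit equals the request.
# $1 = requested watts, stdin = ipmitool dcmi output
limit_matches() {
    reported=$(extract_power_limit)
    [ -n "$reported" ] && [ "$reported" -eq "$1" ]
}
```

A possible use is to pipe a follow-up `dcmi power get_limit` call into `limit_matches 500` and treat a non-zero exit status as a mismatch. Note that the `set_limit` output above echoes 500 even though a different value was applied, so the check only makes sense against a separate readback; whether that readback reflects the corrected value immediately is an assumption.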

yadlapati commented 3 months ago

From Chris Cain:

The IPMI request ends up writing the power cap to the dbus. occ-control gets notified after the cap gets written, and sees that it is out of bounds so it corrects it to a legal value.
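
The correction step described here amounts to clamping the written cap into the supported range. The sketch below only illustrates that idea; it is not the occ-control source, and the function name `clamp_power_cap` and its argument order are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical illustration of "correct it to a legal value": clamp a
# requested cap into [lower, upper]. NOT taken from occ-control.
# $1 = requested watts, $2 = lower bound, $3 = upper bound
clamp_power_cap() {
    if [ "$1" -lt "$2" ]; then
        echo "$2"
    elif [ "$1" -gt "$3" ]; then
        echo "$3"
    else
        echo "$1"
    fi
}
```

With a lower bound of 556, `clamp_power_cap 500 556 3982` prints 556, which is consistent with the value the reporter saw after requesting 500.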

The supported power limits are on dbus:

busctl -l introspect xyz.openbmc_project.Settings /xyz/openbmc_project/control/host0/power_cap

NAME                                  TYPE      SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable   interface -         -            -
.Introspect                           method    -         s            -
org.freedesktop.DBus.Peer             interface -         -            -
.GetMachineId                         method    -         s            -
.Ping                                 method    -         -            -
org.freedesktop.DBus.Properties       interface -         -            -
.Get                                  method    ss        v            -
.GetAll                               method    s         a{sv}        -
.Set                                  method    ssv       -            -
.PropertiesChanged                    signal    sa{sv}as  -            -
xyz.openbmc_project.Control.Power.Cap interface -         -            -
.CorrectionTime                       property  t         0            emits-change writable
.ExceptionAction                      property  s         "xyz.openbmc_project.Control.Power.Cap.ExceptionActions.NoAction" emits-change writable
.MaxPowerCapValue                     property  u         2777         emits-change writable
.MinPowerCapValue                     property  u         1286         emits-change writable
.MinSoftPowerCapValue                 property  u         556          emits-change writable
.PowerCap                             property  u         2777         emits-change writable

occ-control could log an error when it has to modify the cap to bring it within range, but the IPMI command would still succeed, so the user would NOT know unless they looked for PELs after the set. I would expect the IPMI command to fail if the limit is not in the valid range (between MinSoftPowerCapValue and MaxPowerCapValue).
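
The range check expected here can be sketched on the client side. This is only an illustration under stated assumptions: it uses the service, object path, and property names from the busctl output above, the helper names `get_cap_property` and `validate_power_cap` are hypothetical, and it is a pre-check wrapper rather than the fix inside the IPMI handler itself.

```shell
#!/bin/sh
# Hypothetical helper: read one uint32 property of the Cap interface.
# `busctl get-property` prints "u <value>" for this type; keep the value.
# $1 = property name, e.g. MinSoftPowerCapValue
get_cap_property() {
    busctl get-property xyz.openbmc_project.Settings \
        /xyz/openbmc_project/control/host0/power_cap \
        xyz.openbmc_project.Control.Power.Cap "$1" | awk '{print $2}'
}

# Hypothetical check: succeed only if $1 lies within [$2, $3];
# otherwise report the problem on stderr and fail.
# $1 = requested watts, $2 = minimum, $3 = maximum
validate_power_cap() {
    if [ "$1" -lt "$2" ] || [ "$1" -gt "$3" ]; then
        echo "power cap $1 W is outside the valid range $2-$3 W" >&2
        return 1
    fi
}
```

On the BMC, the bounds could be read with `get_cap_property MinSoftPowerCapValue` and `get_cap_property MaxPowerCapValue`, and the `set_limit` command issued only when `validate_power_cap` succeeds. With the values shown above (556 and 2777), a request of 500 would be rejected before it ever reaches the dbus.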

I do see the following yaml file which should be establishing some limits for power cap: https://github.com/ibm-openbmc/openbmc/blob/1050/meta-ibm/recipes-phosphor/settings/phosphor-settings-read-settings-mrw-native/mrw-override-settings.yaml

Not sure how those settings would be enforced (obviously they are currently not enforced).

yadlapati commented 3 months ago

EWM defect is https://jazz07.rchland.ibm.com:13443/jazz/web/projects/CSSD#action=com.ibm.team.workitem.viewWorkItem&id=602199

mzipse commented 3 months ago

@lxwinspur , is this something you or someone on the IPS team could investigate?

lxwinspur commented 3 months ago

@mzipse Our new motherboard (FP5280G3) has just come back, and we are busy debugging the new motherboard during this period, so we have not paid attention to these issues yet. Could IBM please take a look first?

lxwinspur commented 3 months ago

> I do see the following yaml file which should be establishing some limits for power cap: https://github.com/ibm-openbmc/openbmc/blob/1050/meta-ibm/recipes-phosphor/settings/phosphor-settings-read-settings-mrw-native/mrw-override-settings.yaml

@yadlapati Which branch did you test on? I saw that branch 1110 is not adapted to Validation. https://github.com/ibm-openbmc/openbmc/blob/1110-public/meta-ibm/recipes-phosphor/settings/phosphor-settings-defaults-native/ibm_host_settings.override.yml

mzipse commented 2 months ago

@lxwinspur , testing found this from our 1060 branch.

mzipse commented 2 months ago

@lxwinspur , after further discussions with Chris Cain, our Power Mgmt/OCC Control expert, we are going to live with this problem. Our thinking is that even if an invalid power cap is set, the firmware will still correct it internally to a legal limit. Also, most of our IBM customers will be using Redfish and the GUI to set the power cap, which is handled correctly.

If you think this needs to be fixed, I'll let you investigate. But if you are ok that there is no error posted when an invalid power cap is set via IPMI, then you can go ahead and close this issue.

lxwinspur commented 2 months ago

> @lxwinspur , testing found this from our 1060 branch.

Hi @mzipse, I saw that branch 1060-public is not adapted to Validation. @yadlapati could you please check this yaml file? https://github.com/ibm-openbmc/openbmc/blob/1060-public/meta-ibm/recipes-phosphor/settings/phosphor-settings-manager/ibm_settings.override.yml

cjcain commented 2 months ago

@lxwinspur I am not exactly sure what you are saying. There was a commit that removed the default pcap values from that file: https://gerrit.openbmc.org/c/openbmc/openbmc/+/52251 The mrw limits were specified here: https://github.com/openbmc/openbmc/blob/master/meta-ibm/recipes-phosphor/settings/phosphor-settings-read-settings-mrw-native/mrw-override-settings.yaml

occ-control currently gets the cap limits directly from the OCC and writes them to the dbus: https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Control/Power/Cap.interface.yaml

occ-control does pcap validation for Redfish/boot here: https://github.com/openbmc/openpower-occ-control/commit/81c8343054f722b3a8200b5f818dfbd9b0292c46