arcress0 / ipmiutil

ipmiutil is an easy to use set of IPMI server management utilities. It can get/set sensor readings & thresholds, automate SEL management, do SOL console, etc. Supports Linux, Windows, BSD, Solaris, MacOSX. The only IPMI project tool that runs natively on Windows. See http://ipmiutil.sf.net for rpms, etc. (formerly called panicsel). It can run driverless in Linux for use on boot media or embedded environments.
BSD 3-Clause "New" or "Revised" License

Failed to set power limit and power limit action via ipmiutil dcmi power command #11

Open vien20010 opened 1 year ago

vien20010 commented 1 year ago

Steps to reproduce:

  1. Change the power limit value: ipmiutil dcmi power set_limit 1000
  2. Change the power limit action: ipmiutil dcmi power set_action power_off

Expected result: The power limit and the power limit action are set successfully.

Actual result: Setting the power limit and the power limit action fails with this error:

DCMI Power Limit Set error 204
ipmiutil dcmi, Invalid data field in request

arcress0 commented 1 year ago

I haven't seen that before. If you could send me the same command with -x added for debug, it should indicate both the firmware vendor/version and which field in the request it doesn't like.

vien20010 commented 1 year ago

The ipmiutil dcmi set_limit command sends the raw IPMI command DCMI Set Power Limit, but ipmiutil sends the request data in the wrong format, which causes the invalid data field error.

>> Sending IPMI command payload
>>    netfn   : 0x2c
>>    command : 0x04
>>    data_len: 15
>>    data    : 0xdc 0x00 0x00 0x01 0xd0 0xd0 0x07 0x03 0x00 0x00 0x00 0x00 0x05 0x00 0x00

See the DCMI v1.5 spec, https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/dcmi-v1-5-rev-spec.pdf, section 6.6.3, Set Power Limit. [image]

According to the spec, bytes 2:4 must be 0x00, but ipmiutil sends 0x01 as the 4th byte, which causes the error. The correct request data should be:

0xdc 0x00 0x00 0x00 0x01 0xd0 0x07 0xe8 0x03 0x00 0x00 0x00 0x00 0x05 0x00
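
For illustration, here is a minimal Python sketch of how those 15 bytes could be assembled. It assumes the Set Power Limit layout is: group extension ID, three reserved bytes, exception action, power limit (2 bytes), correction time (4 bytes), two reserved bytes, sampling period (2 bytes), all little-endian. That layout should be checked against section 6.6.3 of the spec. `build_set_power_limit` is a hypothetical helper, not an ipmiutil function.

```python
import struct

DCMI_GROUP_EXT_ID = 0xDC  # first byte of the request, as seen in the log


def build_set_power_limit(action, limit_watts, correction_ms, sampling_s):
    """Pack a 15-byte Set Power Limit request body (layout is an assumption)."""
    return struct.pack(
        "<B3xBHI2xH",       # little-endian, no padding: 1+3+1+2+4+2+2 = 15
        DCMI_GROUP_EXT_ID,  # byte 1
                            # bytes 2:4 reserved, emitted as 0x00 by "3x"
        action,             # byte 5: exception action
        limit_watts,        # bytes 6:7
        correction_ms,      # bytes 8:11
                            # bytes 12:13 reserved, emitted as 0x00 by "2x"
        sampling_s,         # bytes 14:15
    )


req = build_set_power_limit(0x01, 2000, 1000, 5)
print(len(req), req.hex(" "))
# 15 dc 00 00 00 01 d0 07 e8 03 00 00 00 00 05 00
```

With the values above, the output matches the byte string shown here, with all three reserved bytes present before the action byte.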

I have also attached the command log with the debug option below. Please review it and let me know what you think. Attachment: ipmiutil_set_limit.log

arcress0 commented 1 year ago

This shows a missing byte 4 in the implementation. I'm struggling to see how this got off by one, since that looks right to me in the code and it previously worked on some firmware. I'll add some debug to find out.

arcress0 commented 1 year ago

I looked, and from what I see, the new values are being put in the right positions.
All of the bytes in that buffer are filled in by the DCMI get function, then only the byte(s) that change are overwritten before doing the set function. If you can run this with debug (add -x to the end of the command), it will show what was read with a line like 'dcmi_get_power_limit(%d): ...'.

arcress0 commented 1 year ago

Here is that data from the log you provided:

dcmi_get_power_limit: rv = 0 rlen = 14
dc 00 00 01 d0 07 e8 03 00 00 00 00 05 00

This shows that the firmware returned only 14 bytes instead of 15, and was missing the 00 at the 4th byte. ipmiutil then sends this buffer back in the set request and gets the error.
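
The off-by-one can be checked offline. This small Python sketch uses only bytes quoted in this thread: the 14-byte buffer from the log and the 15-byte request given earlier. `pad_reserved` is a hypothetical helper for illustration, not part of ipmiutil.

```python
# 14 bytes read back by dcmi_get_power_limit, copied from the log
got = bytes.fromhex("dc 00 00 01 d0 07 e8 03 00 00 00 00 05 00")
# 15-byte request that the reporter expects for Set Power Limit
want = bytes.fromhex("dc 00 00 00 01 d0 07 e8 03 00 00 00 00 05 00")


def pad_reserved(buf, index=3):
    """Return buf with a 0x00 inserted at the 0-based index (the 4th byte)."""
    return buf[:index] + b"\x00" + buf[index:]


print(len(got), len(want))        # 14 15
print(pad_reserved(got) == want)  # True
```

Inserting a single 00 at the 4th byte turns the buffer from the log into the expected request, which is consistent with every later field being shifted by one position.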

arcress0 commented 1 year ago

So the root cause is a firmware bug in get_power_limit.
The only workaround for this is to use 'ipmiutil cmd' to write the correct bytes instead of using what the firmware returns. This firmware bug is with vendor=52538 (0xCD3A) prod=3