open-power / op-build

Buildroot overlay for Open Power
GNU General Public License v2.0

IPMI OEM NetFn used by opfw is wrong #2826

Open AlexanderAmelkin opened 5 years ago

AlexanderAmelkin commented 5 years ago

We at YADRO have stumbled upon the fact that in many places throughout multiple components of OpenPOWER Firmware (at least hostboot and skiboot), when OEM IPMI commands are sent to the BMC, the wrong Network Function code is used.

Namely, the code used is 0x3A. According to Table 5-1 of the IPMI 2.0 specification, this is a "Controller-specific OEM/Group" code and "The Manufacturer ID associated with the controller implementing the command identifies the vendor or group that specifies the command functionality."

While this may work for complete single-vendor solutions (like IBM's proprietary controllers with IBM's firmware on them), it certainly violates the specification for any ODM-manufactured OEM controller with AMI or OpenBMC firmware on board. Those controllers must report their vendor's ID (e.g. YADRO or IBM) in Get Device ID, while implementing multiple OEM command sets specific both to that equipment vendor and to the vendor of the firmware used (AMI or OpenBMC). Thus, a compliant IPMI client should check the Get Device ID and refuse to use IBM-specific protocol with NetFn 0x3A if the returned ID is not of IBM. OpenPOWER firmware does not do that. It blindly sends 0x3A commands regardless of the actual Manufacturer ID.
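To illustrate the check described above, here is a minimal sketch in Python (not taken from hostboot or skiboot). It assumes the Get Device ID response data is already available with the completion code stripped, so the Manufacturer ID occupies offsets 6 to 8, least significant byte first, with only the low 20 bits significant. The function names are hypothetical; `IBM_IANA = 2` is IBM's entry in the IANA enterprise numbers registry.

```python
IBM_IANA = 2  # IBM's IANA Private Enterprise Number


def manufacturer_id(get_device_id_data: bytes) -> int:
    """Return the 20-bit Manufacturer ID from Get Device ID response data.

    Assumes the completion code has already been removed, so the
    Manufacturer ID sits at offsets 6..8, least significant byte first.
    """
    if len(get_device_id_data) < 9:
        raise ValueError("Get Device ID response too short")
    raw = int.from_bytes(get_device_id_data[6:9], "little")
    return raw & 0x0FFFFF  # the upper 4 bits are reserved


def may_use_ibm_netfn_3a(get_device_id_data: bytes) -> bool:
    """Hypothetical policy: only use IBM's 0x3A command set on IBM controllers."""
    return manufacturer_id(get_device_id_data) == IBM_IANA


# Example response data with Manufacturer ID bytes 02 00 00 (IBM).
ibm_like = bytes([0x20, 0x01, 0x02, 0x10, 0x02, 0xBF, 0x02, 0x00, 0x00, 0x00, 0x00])
# Same layout with an arbitrary non-IBM Manufacturer ID (bytes 69 C2 00).
other = bytes([0x20, 0x01, 0x02, 0x10, 0x02, 0xBF, 0x69, 0xC2, 0x00, 0x00, 0x00])
```

With these sample buffers, `may_use_ibm_netfn_3a(ibm_like)` is true and `may_use_ibm_netfn_3a(other)` is false.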

We would like to propose using the "OEM/Non-IPMI group Requests and Responses" NetFn 0x2E/0x2F instead. Requests and responses under that NetFn carry the OEM's IANA Enterprise Number in their first 3 data bytes, and thus it can be used with BMCs implementing multiple OEM command sets.
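A rough sketch of what that framing could look like, in Python for brevity. The helper names and the command number `0x5A` are placeholders for illustration, not an existing opfw API; the only parts taken from the specification are the NetFn values and the 3-byte IANA prefix, sent least significant byte first.

```python
from typing import Tuple

NETFN_OEM_GROUP_REQ = 0x2E  # OEM/Group request
NETFN_OEM_GROUP_RSP = 0x2F  # OEM/Group response


def build_oem_group_request(iana: int, cmd: int, payload: bytes = b"") -> Tuple[int, int, bytes]:
    """Return (netfn, cmd, data) with the IANA number prepended to the payload."""
    if not 0 <= iana <= 0xFFFFFF:
        raise ValueError("IANA enterprise number must fit in 3 bytes")
    return NETFN_OEM_GROUP_REQ, cmd, iana.to_bytes(3, "little") + payload


def strip_oem_group_response(iana: int, data: bytes) -> bytes:
    """Check the IANA prefix of response data (completion code already removed)
    and return the remaining OEM-specific payload."""
    if len(data) < 3 or int.from_bytes(data[:3], "little") != iana:
        raise ValueError("response does not carry the expected IANA number")
    return data[3:]


# Hypothetical command 0x5A with one payload byte, addressed to OEM number 2 (IBM).
netfn, cmd, data = build_oem_group_request(2, 0x5A, b"\x01")
```

Here `data` is `b"\x02\x00\x00\x01"`: the three prefix bytes followed by the original payload. The same prefix is expected back in the response, which is what lets one BMC serve several OEM command sets without ambiguity.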

sammj commented 5 years ago

Right, Petitboot at least will send an AMI-specific request on boot to get alternate-side version information. We could be smarter and check what we're running on top of before asking, same for Skiboot/Hostboot if they're doing it too. It would also be worth making a list of all the places we make an OEM-specific request. That said, are there consequences of this beyond the BMC reporting errors for unsupported requests?

ghost commented 5 years ago

Sam Mendoza-Jonas notifications@github.com writes:

That said, are there consequences of this beyond the BMC reporting errors for unsupported requests?

I mean, I guess it would be "okay" for a BMC to implement an OEM command of "create fire" and we could accidentally call that. :)

Personally though, the months are getting colder here, so fire may be a feature :)

-- Stewart Smith OPAL Architect, IBM.

AlexanderAmelkin commented 5 years ago

I mean, I guess it would be "okay" for a BMC to implement an OEM command of "create fire" and we could accidentally call that. :)

That's exactly our point!

dcrowell77 commented 5 years ago

(IPMI ignoramus here) So we would change every IPMI command to sending these extra 3 bytes before each command?

How can we know if a given BMC implementation supports this? I doubt we'll be able to get our current BMC vendors to add more code and OpenBMC is transitioning away from IPMI as well.

and refuse to use IBM-specific protocol with NetFn 0x3A if the returned ID is not of IBM

What do we do in this case? Many of these functions are fundamental to our ability to boot, e.g. the hiomap functions that let us access pnor. If the BMC doesn't accept those then we can't boot.

Side note - Can someone point me to the list of OEM numbers? My google-fu struck out other than a few random files from sourceforge.

AlexanderAmelkin commented 5 years ago

Can someone point me to the list of OEM numbers?

They come from the IANA Private Enterprise Numbers (PEN) registry: https://www.iana.org/assignments/enterprise-numbers/enterprise-numbers

AlexanderAmelkin commented 5 years ago

So we would change every IPMI command to sending these extra 3 bytes before each command?

Well, at least that's what the IPMI specification would expect.

How can we know if a given BMC implementation supports this?

If it doesn't, it is expected to explicitly reply with the "Invalid command" completion code (0xC1). With the currently used NetFn 0x3A, you'll never know whether the BMC actually did what you think it did, even if it replied with OK (0x00).
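As a sketch of how a client might probe for this, assuming a hypothetical transport callable `send(netfn, cmd, data)` that returns `(completion_code, response_data)`; nothing here is taken from the hostboot or skiboot sources:

```python
CC_OK = 0x00
CC_INVALID_COMMAND = 0xC1
NETFN_OEM_GROUP_REQ = 0x2E


def oem_group_supported(send, iana: int, cmd: int) -> bool:
    """Probe whether the BMC implements `cmd` for the OEM identified by `iana`.

    `send` is a hypothetical transport: send(netfn, cmd, data) returns a
    (completion_code, response_data) pair.
    """
    prefix = iana.to_bytes(3, "little")
    cc, rsp = send(NETFN_OEM_GROUP_REQ, cmd, prefix)
    if cc == CC_INVALID_COMMAND:
        return False  # the BMC explicitly rejected the command
    # Treat it as supported only if the BMC echoes our IANA number back.
    return cc == CC_OK and rsp[:3] == prefix


# Two fake transports standing in for real BMCs.
def bmc_without_oem(netfn, cmd, data):
    return CC_INVALID_COMMAND, b""


def bmc_with_oem(netfn, cmd, data):
    return CC_OK, data[:3] + b"\x01"
```

With these fakes, `oem_group_supported(bmc_without_oem, 2, 0x5A)` is false and `oem_group_supported(bmc_with_oem, 2, 0x5A)` is true. A real client would also need to decide how to treat other completion codes, which this sketch simply reports as unsupported.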

I doubt we'll be able to get our current BMC vendors to add more code and OpenBMC is transitioning away from IPMI as well.

Well, since opfw is usually bundled with some BMC firmware that expects certain behavior from it, it may be safe to stay with 0x3A. But it's just not compliant with the IPMI spec.