tinkerbell / pbnj

Service for interacting with BMCs
Apache License 2.0

Want ability to send NMI #81

Open bahamat opened 3 years ago

bahamat commented 3 years ago

Expected Behaviour

A method to send a non-maskable interrupt (NMI) to a device (server).

Current Behaviour

This is currently not possible.

Possible Solution

An API call that will execute something like:

ipmitool <options> chassis power diag

Steps to Reproduce (for bugs)

N/A

Context

For some operating systems, sending an NMI will initiate a panic and crash dump, which can then be analyzed post-mortem. This is sometimes needed when the system is not responding to external input.

While we can initiate a reboot via the API, that does not allow for post-mortem debugging.


bahamat commented 3 years ago

This would be a precursor to enabling a public API for allowing end users to send an NMI.

https://feedback.equinixmetal.com/platform/p/want-nmi

jacobweinstock commented 3 years ago

Hey @bahamat, thanks for your interest in PBnJ and for opening this issue.

Are you looking for this kind of functionality in Equinix Metal? Given the link and the references to "the API", it seems like you might be.

We might need to approach this concern via other channels as this repo is just the open-source side. I'll ask around internally here at Equinix Metal and see what the options are.

jacobweinstock commented 3 years ago

@bahamat, it looks like you've done what is needed on the Equinix Metal side with https://feedback.equinixmetal.com/platform/p/want-nmi. We'll have to wait and see what happens there.

As for the open-source side, this could be useful. PBnJ uses bmclib under the hood for most of its BMC interactions, so I'd recommend starting there.

bahamat commented 3 years ago

@jacobweinstock Thanks for the pointer. I created bmc-toolbox/bmclib#233.

nshalman commented 2 years ago

No action yet on https://github.com/bmc-toolbox/bmclib/issues/233