oxidecomputer / hubris

A lightweight, memory-protected, message-passing kernel for deeply embedded systems.
Mozilla Public License 2.0

We need to be able to capture and report sequencer register state on units in the rack #1590

Open nathanaelhuffman opened 8 months ago

nathanaelhuffman commented 8 months ago

Right now we have no visibility into the sequencer FPGA registers. We would like some visibility here to facilitate debugging production issues where we don't have udprpc or other means.

I wonder if there is a way to do a special SP dump that could include a SPI dump from the sequencer, or something along those lines?

cbiffle commented 8 months ago

It'd be tricky to do this as part of a memory dump specifically, but we could have a different verb that asks the sequencer to pull them out. Assuming the sequencer task comes up -- if it decides anything has failed, it will refuse to start its RPC server, and we're sort of out of luck. (I would like to fix this but haven't gotten to it.)

If we did an explicit command for this, it'd probably go and interrogate the registers synchronously. Remind me how much register space we're talking?

nathanaelhuffman commented 8 months ago

Right now we have the address space 0x0-0x35 defined, so 54 bytes in total.
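
For reference, the 54-byte figure follows from treating 0x0-0x35 as an inclusive range of one-byte registers. A minimal sketch of that arithmetic; the constant names are hypothetical and not taken from the Hubris source:

```rust
/// First and last register addresses, as quoted in the comment above.
/// (Names are illustrative only, not from the actual sequencer code.)
const FIRST_REG: u8 = 0x00;
const LAST_REG: u8 = 0x35;

/// Number of registers in the inclusive range, assuming one byte per address.
const REG_COUNT: usize = (LAST_REG - FIRST_REG) as usize + 1;
```

Under that assumption `REG_COUNT` evaluates to 54, matching the total given above.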

cbiffle commented 8 months ago

Cool, so there's no need to have it be multiple requests for space reasons or anything. I would think we could do an IPC into the sequencer that'd retrieve the current contents in one message. We'd need to route that out to the network in a way that people can get to, probably through controlplaneagent, maybe through dumpagent.
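
To illustrate the "one message" idea, here is a rough sketch of reading the whole register file synchronously into a fixed-size snapshot. Everything in it is an assumption made for illustration: `RegisterBus`, `RegisterSnapshot`, `snapshot`, and `FakeBus` are hypothetical names, not the sequencer task's real IPC interface, and the register count comes from the 0x0-0x35 range mentioned earlier in the thread.

```rust
/// Assumed register count: one byte per address across 0x0-0x35 inclusive.
const REG_COUNT: usize = 0x35 + 1;

/// Hypothetical abstraction over whatever bus reaches the sequencer FPGA.
pub trait RegisterBus {
    type Error;
    /// Fills `buf` with consecutive register bytes starting at `addr`.
    fn read(&mut self, addr: u8, buf: &mut [u8]) -> Result<(), Self::Error>;
}

/// The full register file, small enough to return in a single reply.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RegisterSnapshot(pub [u8; REG_COUNT]);

/// Reads every register in one synchronous pass.
pub fn snapshot<B: RegisterBus>(bus: &mut B) -> Result<RegisterSnapshot, B::Error> {
    let mut regs = [0u8; REG_COUNT];
    bus.read(0x00, &mut regs)?;
    Ok(RegisterSnapshot(regs))
}

/// Stand-in bus for experimenting: each register reads back its own address.
pub struct FakeBus;

impl RegisterBus for FakeBus {
    type Error = ();
    fn read(&mut self, addr: u8, buf: &mut [u8]) -> Result<(), ()> {
        for (i, b) in buf.iter_mut().enumerate() {
            *b = addr.wrapping_add(i as u8);
        }
        Ok(())
    }
}
```

Calling `snapshot(&mut FakeBus)` yields a 54-byte array, which is why a single request/response pair would be enough and no chunking is needed.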

cbiffle commented 8 months ago

Alright, #1597 added a first version of this. There's a separate UDP port for it, and Humility support will be coming soon. The code made it in for the freeze and should be in the next customer release of the SP firmware (or anything built since about an hour ago, on dev systems).

cbiffle commented 8 months ago

Humility support is merged, btw, but I haven't bumped the version yet because there are a couple other things I'm expecting to have in very shortly.