open-power / hostboot

System initialization firmware for Power systems
Apache License 2.0

What is the purpose of calling the 'platCreateGardRecord' function to create ephemeral gard records for BMC systems? #238

Closed Theo0208 closed 8 months ago

Theo0208 commented 9 months ago

When applying gard records, the 'applyGardRecord' function calls the 'platCreateGardRecord' function to create ephemeral gard records for BMC systems. What is the purpose of creating ephemeral gard records for BMC systems? Thanks.

dcrowell77 commented 9 months ago

The ephemeral records are used as semi-persistent deconfiguration records for Hostboot. We need our deconfigs to survive reconfig loops but not survive a system reboot. The BMC code treats the records differently.

Typically these would be used for non-guarding errors. However, we have to create "duplicate" records for guards to handle something called "resource recovery". This is a feature where Hostboot will ignore predictive guards if the application of them will result in not having enough resources to boot. There are some complex scenarios that require us to have this duplicate deconfig record along with the guard record in order to survive.

| Category | Guard Type | Applied By BMC | Applied By HB | Cleared |
| -- | -- | -- | -- | -- |
| Persistent | Manual | On CleanBoot | If BSD==0 | Manual action from BMC |
| Persistent | Fatal, Unrecoverable | On CleanBoot | Always | HB: Part replacement |
| Persistent | Predictive | On CleanBoot | If BSD==0 | HB: Part replacement |
| Ephemeral | Reconfig | Never | If BSD==0 | BMC: On CleanBoot; HB: After step16 |
| Ephemeral | Sticky | Never | Always | HB: After step16; HB: Replacement of any part; BMC: On CleanBoot |

BSD = ATTR_BLOCK_SPEC_DECONFIG = Resource Recovery
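
As a rough illustration of how the two "Applied" columns in the table read, here is a minimal C++ sketch. All names in it (`RecordType`, `isEphemeral`, `bmcApplies`, `hostbootApplies`) are hypothetical and chosen for this example; they are not Hostboot's or the BMC's actual types or functions, and "On CleanBoot" is interpreted literally as "only on a clean boot", which is an assumption.

```cpp
#include <cassert>

// Hypothetical record types mirroring the "Guard Type" column of the table.
// These are NOT the real Hostboot enums.
enum class RecordType { Manual, Fatal, Predictive, Reconfig, Sticky };

// The two ephemeral categories in the table are Reconfig and Sticky.
inline bool isEphemeral(RecordType t) {
    return t == RecordType::Reconfig || t == RecordType::Sticky;
}

// "Applied By BMC" column: persistent records on a clean boot,
// ephemeral records never (assumption: "On CleanBoot" means only then).
inline bool bmcApplies(RecordType t, bool cleanBoot) {
    return cleanBoot && !isEphemeral(t);
}

// "Applied By HB" column: Fatal and Sticky always; Manual, Predictive,
// and Reconfig only when BSD (ATTR_BLOCK_SPEC_DECONFIG) is 0.
inline bool hostbootApplies(RecordType t, unsigned blockSpecDeconfig) {
    switch (t) {
        case RecordType::Fatal:
        case RecordType::Sticky:
            return true;
        case RecordType::Manual:
        case RecordType::Predictive:
        case RecordType::Reconfig:
            return blockSpecDeconfig == 0;
    }
    return false;  // not reached for valid enum values
}

// A few spot checks taken directly from rows of the table.
inline void checkTableRows() {
    assert(hostbootApplies(RecordType::Sticky, 1));       // Always
    assert(!hostbootApplies(RecordType::Predictive, 1));  // If BSD==0
    assert(!bmcApplies(RecordType::Reconfig, true));      // Never
}
```

The sketch only encodes the table; it says nothing about when each record is cleared, which depends on the events listed in the "Cleared" column.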

Note that we recently dropped support for resource recovery but we haven't cleaned up all of the extra pieces that we added to support it yet.

Theo0208 commented 9 months ago

Okay, thanks a lot!