openbmc / phosphor-state-manager

Provide mechanism to synchronize services when a bmc reset occurs with chassis power on #19

Open geissonator opened 2 years ago

geissonator commented 2 years ago

IBM recently ran into an issue where a service got out of sync due to a certain BMC reboot scenario. The fsi-scan and cfam-reset services (both specific to IBM systems) have to be run in both the chassis power-on target and the host-start target. This is to meet the requirements of IBM's cronus debug tool.

During a normal power on, they run in the chassis power-on target and are then not run again in the host-start target. But if you power on only the chassis, then reboot the BMC, and then initiate a host power on, you hit an odd scenario where cfam-reset is run but fsi-scan is not. This is due to how we support BMC resets with the host running (only fsi-scan is run, not cfam-reset).

Since this is a very IBM-specific issue and only affects this one service, a quick fix was put into a downstream fork (https://github.com/ibm-openbmc/phosphor-state-manager/pull/4), but that's not acceptable for upstream. Use this issue to track a real solution, which I think is going to be a separate target that is started to indicate this scenario has occurred. That will allow services to declare an appropriate `Conflicts=` if needed.
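
A rough sketch of what the separate-target idea might look like in systemd terms. Every unit name here (`obmc-chassis-on-bmc-reset@.target`, `example-ibm@.service`) is hypothetical and not taken from the repository, and how the target would be started (presumably by the state manager, once it sees the chassis already on after a BMC reboot) is an assumption.

```ini
# --- Hypothetical file: obmc-chassis-on-bmc-reset@.target ---
# Assumed to be started when a BMC reboot finds the chassis already powered on.
[Unit]
Description=Chassis%i was already on when the BMC rebooted (hypothetical)

# --- Hypothetical drop-in: example-ibm@.service.d/bmc-reset.conf ---
# A service that should not coexist with the scenario above declares a
# conflict against the marker target.
[Unit]
Conflicts=obmc-chassis-on-bmc-reset@%i.target
```

Note that `Conflicts=` on its own means that starting one of the two units stops the other. If both are pulled into the same transaction, systemd either drops the job that is not required or fails the transaction. Whether that is enough here depends on how the services are pulled into the power-on and host-start targets, so ordering or condition directives might also be needed.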