Simplification of Booting Procedure (Boot Level).

Phase 2: Simplification of Booting Procedure (Boot Level).

The whole “boot level” concept is unnecessarily complex. I feel the entire code around “bmInitialize”, “bmStart”, etc. can be removed and replaced with a thread that simply does the following:

A. Start safplus_logd and safplus_gms if not already started (see above). Also start the AMF threads (not sure when this happens today).

B. Start the rest of the SAFplus services, based on the XML/database definition. Today these services require that a cluster master exists. This is a chicken-and-egg problem, because this node is not actually ready to become master until these very services come up, so a single-SC cluster theoretically cannot boot. Today this is resolved by the master falsely claiming it can become master when in fact it cannot. This can cause nasty issues where the cluster database is lost (it never got synchronized) during back-to-back failovers, which should be accounted for as a catastrophic fault but are not.

C. At the same time, all SAFplus services are already tolerant of brief intervals where no master exists (during failover). So please modify the SAFplus services to also tolerate having no master at startup.

D. Wait until they come up. (This should not even be needed: user services can already tolerate failover, so why not a service outage because a service has not started yet?)

E. Does the system issue a “node is ready” event today? If so, it goes here. If not, we need to add one.

F. Let the cluster master AMF start bringing up the user’s services.
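Steps A through E could be sketched roughly as below. This is only an illustration in Python, not SAFplus code: the `Service` class, `boot_node`, `run_boot_thread`, the `master_exists` callback and the event strings are all hypothetical names made up for the sketch, and the real implementation would start actual processes rather than flip a flag.

```python
import threading
import time


class Service:
    """Hypothetical stand-in for one SAFplus service process."""

    def __init__(self, name):
        self.name = name
        self.running = False
        self.waiting_for_master = False

    def start(self, master_exists):
        # Step C: come up even when no master exists; the service just
        # records that it is waiting instead of refusing to start.
        self.running = True
        self.waiting_for_master = not master_exists


def boot_node(core, services, master_exists, events, timeout=5.0):
    """Run steps A to E once; step F is left to the cluster master AMF."""
    # A. Start the core daemons only if they are not already running.
    for svc in core:
        if not svc.running:
            svc.start(master_exists())
            events.append("started " + svc.name)
    # B. Start the remaining services in their configured order.
    for svc in services:
        svc.start(master_exists())
        events.append("started " + svc.name)
    # D. Wait (bounded) until every service reports that it is up.
    deadline = time.monotonic() + timeout
    while not all(s.running for s in core + services):
        if time.monotonic() > deadline:
            events.append("boot timed out")
            return False
        time.sleep(0.01)
    # E. Announce readiness. F happens outside this thread.
    events.append("node ready")
    return True


def run_boot_thread(core, services, master_exists):
    """Run the boot sequence in its own thread and return the event log."""
    events = []
    thread = threading.Thread(
        target=boot_node, args=(core, services, master_exists, events)
    )
    thread.start()
    thread.join()
    return events
```

The point of the sketch is that the sequence is linear: there are no boot levels, and no step blocks on a master being present. A service started without a master only records that fact, which mirrors how it would behave during a failover.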

Simplification of Booting Procedure (Boot Level). #117