OpenCHAMI / roadmap

Public Roadmap Project for Ochami
MIT License

[RFD] Defining what is out of scope for OpenCHAMI #51

Open alexlovelltroy opened 2 months ago

alexlovelltroy commented 2 months ago

CSM and other HPC System Management tools are expansive in what they provide. OpenCHAMI seeks to be more compact and composable, allowing sites to make different choices for different use cases and workflows. This RFD seeks to be a clearing house for discussing what OpenCHAMI will not do and how sites may choose to handle these items.

Image Build Pipelines

OpenCHAMI provides APIs to indicate remote URLs for system images, kernels, and initrds. It provides APIs for updating these fields across groups of nodes based on selectors in the inventory system. It does not have a built-in capacity for building system images and storing them.
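As a rough illustration of the division of labor described above, the sketch below selects nodes from an inventory by group and builds a single update that points them at remotely hosted boot artifacts. Everything in it is an assumption made for illustration: the inventory record shape, the `select_nodes` and `build_boot_update` helpers, the payload field names, and the example URLs are hypothetical and are not taken from the OpenCHAMI API. The point is only that the images are built and stored elsewhere, and OpenCHAMI records where to find them.

```python
from typing import Dict, Iterable, List

# Hypothetical inventory records; a real site would query its inventory service.
INVENTORY = [
    {"id": "node001", "groups": ["compute", "gpu"]},
    {"id": "node002", "groups": ["compute"]},
    {"id": "login01", "groups": ["login"]},
]


def select_nodes(inventory: Iterable[Dict], group: str) -> List[str]:
    """Return the sorted IDs of all nodes that carry the given group label."""
    return sorted(n["id"] for n in inventory if group in n.get("groups", []))


def build_boot_update(hosts: List[str], kernel: str, initrd: str, image: str) -> Dict:
    """Build one update body pointing a set of nodes at remote boot artifacts.

    The field names here are illustrative assumptions, not a documented schema.
    """
    artifacts = {"kernel": kernel, "initrd": initrd, "image": image}
    for name, url in artifacts.items():
        # Artifacts are hosted outside OpenCHAMI, so only remote URLs make sense.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"{name} must be a remote URL, got {url!r}")
    if not hosts:
        raise ValueError("selector matched no nodes")
    return {"hosts": hosts, **artifacts}


# Example: repoint every node in the "compute" group at a newly published image.
update = build_boot_update(
    select_nodes(INVENTORY, "compute"),
    kernel="https://images.example.org/rocky9/vmlinuz",
    initrd="https://images.example.org/rocky9/initramfs.img",
    image="https://images.example.org/rocky9/rootfs.squashfs",
)
```

A site's own image pipeline would publish the artifacts to the web server or object store of its choice and then submit an update like this one through whatever client it uses to talk to OpenCHAMI.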

Compute Node Operating System

OpenCHAMI seeks to be as OS agnostic as possible. We have seen successful deployments of TOSS, AlmaLinux, RHEL, and Rocky Linux with the OpenCHAMI stack so far, and there is no reason to expect that other HPC operating systems would be incompatible with OpenCHAMI.

Considerations

Bootstrapping Secure Attestation and using cloud-init with OpenCHAMI may require specific versions of the related client software. If the image doesn't include the relevant versions, those features will not be available. As OpenCHAMI develops, we may introduce additional software, delivered as binaries, containers, and/or RPMs, that enables advanced features of OpenCHAMI. Because OpenCHAMI remains OS agnostic, it is incumbent on the OpenCHAMI development team to provide client applications in a way that can be easily adapted to any HPC operating system, with an obvious preference for those in use by partners.

Logging Infrastructure

Sites generally have their own preferred tooling for collecting and analyzing logs and metrics. OpenCHAMI will not choose a standard tool for either of these activities. However, the APIs of OpenCHAMI should make it easy to configure the compute nodes to use whatever upstream tool is appropriate by providing configuration information through cloud-init instance meta-data.
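To make the idea concrete, here is a minimal sketch of how a site might render a cloud-config fragment that points compute nodes at its own log aggregator. It assumes the nodes run rsyslog and that the site's cloud-init version supports the `rsyslog` module's `remotes` key. The `render_logging_cloud_config` helper, the remote name, and the hostname are hypothetical, and how OpenCHAMI would serve such a document to a node is not specified here.

```python
import json
from typing import Dict


def render_logging_cloud_config(remotes: Dict[str, str]) -> str:
    """Render a cloud-config document that forwards syslog to site-chosen targets.

    `remotes` maps a short name to an rsyslog forwarding rule, for example
    "*.* @@host:514" to forward everything over TCP.
    """
    lines = ["#cloud-config", "rsyslog:", "  remotes:"]
    for name, target in sorted(remotes.items()):
        # Keep keys simple so they are safe as unquoted YAML keys.
        if not name.isidentifier():
            raise ValueError(f"remote name must be a simple identifier: {name!r}")
        # json.dumps yields a double-quoted string, which is also valid YAML.
        lines.append(f"    {name}: {json.dumps(target)}")
    return "\n".join(lines) + "\n"


# Example: the site, not OpenCHAMI, decides where logs go.
document = render_logging_cloud_config(
    {"site_aggregator": "*.* @@logs.example.org:514"}
)
```

The same pattern would apply to a metrics agent: the site picks the tool, and the per-node configuration only carries the endpoint and whatever settings that tool needs.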