access-ci-org / Jetstream_Cluster

Scripts and Ansible Playbooks for building an HPC-style resource in Jetstream
MIT License

Scripts to turn existing instance into Slurm head node #6

Closed: julianpistorius closed this 2 years ago

julianpistorius commented 2 years ago

Background

Jetstream2 will provide push-button clusters. Exosphere is one of the supported graphical user interfaces for Jetstream2. Exosphere is a 'pure client' application: it requires no persistent services to interact with the OpenStack API. Currently, the scripts in this repository only work when executed on a machine that has the OpenStack command-line tools installed. This is a problem for a tool like Exosphere, whose instance orchestration logic runs entirely in the browser, where those tools are not available. See the following issue on the Exosphere repository: https://gitlab.com/exosphere/exosphere/-/issues/636

The Exosphere developers considered the following three options:

A. Exosphere could do all this directly with the OpenStack API using the Exosphere orchestration engine
B. Exosphere could create a throw-away instance which runs cluster_create.sh to create the head node, and then deletes the throw-away instance
C. We could create a modified version of the cluster_create.sh script which runs on the head node itself, after it's been created by Exosphere

We decided to implement option C, with help from XSEDE Cyberinfrastructure Resource Integration staff. This pull request contains the resulting modifications, which make it possible to launch elastic Slurm clusters using Exosphere.

Some notable changes

Note: These scripts are also useful outside of the Exosphere client. They can be used from Horizon or any OpenStack client by adding the following shell snippet to the cloud-init user data of a new OpenStack instance:

su - centos -c "git clone --branch cluster-create-local --single-branch https://github.com/julianpistorius/CRI_Jetstream_Cluster.git; cd CRI_Jetstream_Cluster; ./cluster_create_local.sh"
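
For anyone who wants to paste this into a user-data field as a complete script, here is a minimal sketch that prints one. It makes some assumptions that are not part of this PR: the image has a `centos` login user and `git` installed, and the names `REPO_URL`, `BRANCH`, `LOGIN_USER` and `render_user_data` are purely illustrative. It also chains the steps with `&&` rather than `;`, so a failed clone stops the chain instead of running the script from the wrong directory.

```shell
#!/bin/bash
# Sketch only: print a cloud-init user-data script that bootstraps the head node.
# REPO_URL and BRANCH come from the snippet above; LOGIN_USER assumes a CentOS
# image whose default account is "centos". All variable and function names here
# are hypothetical helpers, not part of the repository.
REPO_URL="https://github.com/julianpistorius/CRI_Jetstream_Cluster.git"
BRANCH="cluster-create-local"
LOGIN_USER="centos"

render_user_data() {
  # Derive the checkout directory name from the repository URL.
  local repo_dir
  repo_dir="$(basename "$REPO_URL" .git)"
  # Unquoted heredoc: the variables are expanded now, so the printed script
  # contains the literal repository, branch and user names.
  cat <<EOF
#!/bin/bash
# Run as ${LOGIN_USER}; '&&' stops the chain if the clone or cd fails.
su - ${LOGIN_USER} -c "git clone --branch ${BRANCH} --single-branch ${REPO_URL} && cd ${repo_dir} && ./cluster_create_local.sh"
EOF
}

render_user_data
```

Running the sketch writes the user-data script to standard output, so it can be redirected to a file or copied into the instance's user-data field. Nothing is cloned or executed until the new instance boots and cloud-init runs it.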

To test this using Exosphere, go to https://exosphere.jetstream-cloud.org and follow these instructions: https://gitlab.com/exosphere/exosphere/-/merge_requests/587#how-to-test

Once this PR is merged we will change the Exosphere code to reference this repository instead of my fork, and the main branch instead of the cluster-create-local branch.

DImuthuUpe commented 2 years ago

Awesome. Thanks a lot for the PR @julianpistorius. I am going to merge it.