ULHPC / puppet-slurm

A Puppet module designed to configure and manage SLURM (see https://slurm.schedmd.com/), an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
Apache License 2.0

Consolidate vagrant setup for module validation on a virtual cluster #17

Closed Falkor closed 5 years ago

Falkor commented 5 years ago

The Vagrant-based deployment currently focuses on a single VM and quickly runs a few test manifests, but it does not really allow testing the slurm module in an environment as close as possible to a real cluster. In particular, the flexibility that Hiera could offer for custom checks in a multi-VM deployment (each VM with a dedicated role) is not yet addressed.

Objectives

  1. We should allow for a full virtual cluster setup by default, including:

    • 1 Slurm controller (including the Slurm accounting DB)
    • 1 login node
    • 2 or more compute nodes
  2. Provisioning of each VM should be handled by Vagrant/Puppet using the current module and sample profile classes which, coupled with Hiera, would illustrate how the module is used in the ULHPC control repo.

  3. Modifying or customizing the default deployment setup (number of compute nodes, hosting the Slurm accounting DB on a separate host, etc.) should be made flexible through changes in a single config.yaml file that overrides the default settings.

  4. A Hiera hierarchy with default settings for the module should be proposed (to illustrate, again, the way we use this module). More importantly, it should also allow quick custom tests to be performed by placing hieradata/custom.yaml (not tracked in the Git repository) at the highest level of the hierarchy, so that changes can be applied quickly across all or a subset of the deployed VMs with vagrant provision --provision-with puppet.
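
To make objectives 1 and 3 more concrete, here is a minimal Ruby sketch of how a Vagrantfile could load default settings, overlay an optional config.yaml, and derive the list of VMs to define. The setting keys (`compute_nodes`, `separate_db`), the VM names, and the helper functions are hypothetical and chosen for illustration only; they are not taken from the module.

```ruby
require 'yaml'

# Hypothetical defaults: 1 controller (hosting the accounting DB),
# 1 login node and 2 compute nodes, as listed in objective 1.
DEFAULTS = {
  'compute_nodes' => 2,     # number of compute VMs
  'separate_db'   => false  # true: accounting DB on its own VM
}.freeze

# Overlay the (optional) config.yaml on top of the defaults.
# A missing or empty file leaves the defaults untouched.
def load_settings(path = 'config.yaml')
  overrides = File.exist?(path) ? (YAML.load_file(path) || {}) : {}
  DEFAULTS.merge(overrides)
end

# Derive the list of VMs (name and role) from the merged settings.
# Names and role labels are assumptions made for this sketch.
def cluster_nodes(settings)
  nodes = [{ name: 'controller', role: 'controller' }]
  nodes << { name: 'slurmdbd', role: 'accounting_db' } if settings['separate_db']
  nodes << { name: 'login', role: 'login' }
  (1..settings['compute_nodes']).each do |i|
    nodes << { name: "compute-#{i}", role: 'compute' }
  end
  nodes
end
```

In a Vagrantfile, the result of `cluster_nodes(load_settings)` could then drive one `config.vm.define` call per entry, with the role passed to Puppet (for instance as a fact) so that Hiera can select role-specific data.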