This PR adds support for the U. Tokyo cluster, Wisteria, which is a Fujitsu-brand cluster with its own workload management software.
Changelog:
system.Cluster:
Makes the locations of the submit and run scripts class variables that can be overridden by sub-classes.
This does not affect the existing Workflow sub-modules, but allows the Fujitsu sub-module to override this and point to a different script.
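The override mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the attribute names `submit_script` and `run_script`, the method `submit_command`, and all paths are hypothetical placeholders.

```python
class Cluster:
    """Stand-in for the base cluster class (attribute names are hypothetical)."""

    # Class-level defaults, shared by every sub-class that does not override them
    submit_script = "scripts/submit"
    run_script = "scripts/run"

    def submit_command(self):
        # The path is looked up through the instance, so sub-class overrides apply
        return ["python", self.submit_script]


class Slurm(Cluster):
    """Existing sub-module: inherits the default script locations unchanged."""


class Fujitsu(Cluster):
    """New sub-module: points at different scripts (placeholder paths)."""

    submit_script = "scripts/custom_submit"
    run_script = "scripts/custom_run"
```

Because the lookup goes through `self`, existing sub-modules keep the defaults without any change, while a sub-class only needs to redefine the attribute to redirect the call.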
system.Fujitsu:
Creates a Fujitsu system sub-module which inherits from Cluster.
Acts as the general system-interaction framework, similar to Slurm or PBS.
Allows for Fujitsu-specific sub-arguments like rscgrp (resource group).
Also contains the architecture for submitting jobs (using pjsub) and monitoring the queue (using pjstat).
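As a rough sketch of how such a sub-module might assemble a submission call, the helper below builds a `pjsub` argument list with resource options passed through `-L`. The function name and the example values are hypothetical, and the exact option names (`rscgrp`, `node`, `elapse`, `-g`) should be checked against the Wisteria manual rather than taken from this sketch.

```python
def build_pjsub_command(script, rscgrp, nodes, elapse, group=None):
    """Return a pjsub argument list (hypothetical helper, not the PR's code).

    rscgrp: resource group name, nodes: node count, elapse: wall time "HH:MM:SS"
    """
    cmd = [
        "pjsub",
        "-L", f"rscgrp={rscgrp}",
        "-L", f"node={nodes}",
        "-L", f"elapse={elapse}",
    ]
    if group is not None:
        # Assumption: '-g' names the project/group charged for the job
        cmd += ["-g", group]
    cmd.append(script)
    return cmd


# Placeholder values only; real resource group names are site-specific
example = build_pjsub_command("custom_run", rscgrp="example-rscgrp",
                              nodes=1, elapse="00:10:00")
```

Returning a list rather than a single string keeps the command safe to hand to `subprocess.run` without shell quoting.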
system.Wisteria:
Inherits from Fujitsu to provide more specific arguments for the Wisteria system.
Custom run and submit scripts for Wisteria:
These custom scripts are required because on Wisteria the compute nodes do not inherit the login node's environment, so everything (e.g., modules, Conda environment) must be re-loaded before submitting a workflow or running a job.
Paths, environment and loaded modules are hard-coded for the Kyoto group and are not generalized.
If we wanted to make this more general, these scripts might have to be generated on the fly or created from paths/parameters defined in the parameter file.
For now we leave this somewhat hard-coded to get research problems going.
Notes from the system.Wisteria docstring:
Wisteria Caveat 1: On Wisteria you cannot submit batch jobs from compute nodes and you cannot SSH from compute nodes (Manual 5.13), so the master job must be run from the login node or the pre-post node (Manual 5.2.3).
Wisteria Caveat 2: On Wisteria, the login node's Conda environment is not inherited by compute nodes, so the system requires custom submit and run scripts which first load the correct modules and then run the corresponding script.
Wisteria Caveat 3: On Wisteria, command line arguments for the submit and run scripts, normally given as '--key value', interfere with the batch submission command pjsub. Instead we use the pjsub '-x' flag, which sets environment variables, and read those variables in place of command line arguments.
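The approach in Caveat 3 can be sketched as two small helpers: one formats keyword arguments for the pjsub '-x' flag on the submitting side, and one reads them back from the environment inside the run script. The helper names and the example keys are hypothetical, and the comma-separated `KEY=value` form is an assumption about the '-x' syntax that should be verified against the pjsub documentation.

```python
import os


def format_pjsub_env(**kwargs):
    """Turn keyword arguments into a '-x' option (hypothetical helper).

    Assumes pjsub accepts comma-separated KEY=value pairs after '-x'.
    Values that themselves contain commas are not handled by this sketch.
    """
    pairs = ",".join(f"{key}={value}" for key, value in kwargs.items())
    return ["-x", pairs]


def read_job_arg(name, environ=None):
    """Read a value inside the run script in place of a '--key value' argument."""
    environ = os.environ if environ is None else environ
    if name not in environ:
        raise KeyError(f"expected environment variable '{name}' to be set by pjsub -x")
    return environ[name]


# Submitting side: placeholder names and values
option = format_pjsub_env(FUNCS_NAME="forward_simulation", TASK_ID=3)

# Job side: simulate the environment the job would see
task_id = int(read_job_arg("TASK_ID", environ={"TASK_ID": "3"}))
```

Environment variables arrive as strings, so the run script has to convert types itself, as done for `TASK_ID` above.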
Relevant Issue: #161