Slurm TUI
MIT License

SlurmCommander

News:

Slurm 23.02.0

Slurm 23 has been released and, as already reported here, scom does not work with it.

The issue contains details and will track any progress on Slurm 23 support.


Discussions are open!

Wishlist discussion thread: here

Description

SlurmCommander is a simple, lightweight, dependency-free text-based user interface (TUI) for your cluster. It ties together multiple Slurm commands to provide a single, efficient point of interaction with Slurm.

Installation does not require any special privileges or environment. Simply download the binary, fill out a small config file, and it's ready to run.

You can view, search, analyze and interact with:

Job Queue

Job Queue shows jobs currently in the queue; additional information and breakdowns can be toggled with the s, c and i keys: Job Queue main window

The Enter key opens a menu window with different actions available based on the job state (RUNNING, PENDING, etc.): Job Queue actions window

\ turns on filtering. Filtering works by concatenating multiple job columns into a single string and matching it against a golang re2 regular expression, allowing you to build some very imaginative filters.

Example: grid.*alice.*(RUN|PEND) matches jobs from account grid, user alice, in RUNNING or PENDING state: Job Queue filtering

Job history

Browse, filter and inspect past jobs Job History tab

Job Details tab

Edit and submit jobs from predefined templates

Job from Template tab Job from Template tab

Examine state of cluster nodes and partitions

Cluster tab

As with the Job Queue and Job History tabs, filtering is available here. Cluster tab filtering

So if we would like to see only nodes whose name contains clip-c that are idle and POWERED_DOWN, we can easily filter them with: clip-c.*idle.*POWER Cluster tab filter on

Example Job Queue tab demo:

demo

Installation

SlurmCommander does not require any special privileges to be installed, see instructions below.

Hard requirement: json-output capable slurm commands
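You can check this requirement up front. The sketch below probes the standard Slurm tools for --json support; which exact commands scom invokes is an assumption here, and --json is generally available from Slurm 21.08 onward.

```shell
#!/bin/sh
# Probe common slurm commands for --json output support (sketch).
check_json() {
  for cmd in squeue sacct sinfo; do
    if command -v "$cmd" >/dev/null 2>&1 && "$cmd" --json </dev/null >/dev/null 2>&1; then
      echo "$cmd: json OK"
    else
      echo "$cmd: missing or no --json support"
    fi
  done
}
check_json
```

If any command reports missing json support, scom will not be able to fetch that data on your cluster.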

Regular users

  1. Download the pre-built binary
  2. Download the annotated config file
  3. Edit the config file, follow instructions inside
  4. Create scom directory in your $HOME and place the edited config there: mkdir $HOME/scom
  5. Run

Site administrators

Instructions are the same as for regular users, with one minor perk: place the config file in one of the following locations and it will be used as the global configuration source for all scom instances on that machine.

  1. /etc/scom/scom.conf
  2. Any location, provided users have the environment variable SCOM_CONF containing the path to the config file
  3. Users' $XDG_CONFIG_HOME/scom/scom.conf

NOTE: Users can still override global configuration options by changing config stanzas in their local $HOME/scom/scom.conf

Usage tips

SlurmCommander is developed for 256-color terminals (black background) and requires at least 185x43 (columns x rows) to work.

[pja@ SlurmCommander-dev]$ [DEBUG=1] [TERM=xterm-256color] ./scom -h
Welcome to Slurm Commander!

Usage of ./scom:
  -d uint
        Jobs history fetch last N days (default 7)
  -t uint
        Job history fetch timeout, seconds (default 30)
  -v    Display version

To run in debug mode, set the DEBUG environment variable. You will see an extra debug-message line in the user interface, and scom will record a scdebug.log file with (lots of) internal log messages.

Tested on:

  • slurm 21.08.8
  • slurm 22.05.5

Feedback

Is most welcome. Of any kind.

Acknowledgments

Powered by the amazing Bubble Tea framework/ecosystem. Kudos to the glamurous Charm developers and community.