snakemake / snakemake-executor-plugin-slurm

A Snakemake executor plugin for submitting jobs to a SLURM cluster
MIT License

Make the `slurm_account` parameter optional, because it is in fact optional. #76

Closed: xapple closed this 6 months ago

xapple commented 7 months ago

Some SLURM installations don't require specifying an account when using sbatch or salloc.

Currently, if you don't specify an account, the literal value (null) ends up in the sbatch call as -A (null):

rule xxxxxxxxxxxx:
    output: xxxxxxxxxxxxxx
    jobid: 4
    reason: Missing output files: xxxxxxxxxxxxxx
    wildcards: dataset=xxxxxxxxxxxxxx, sample=xxxxxxxxxxxxxx
    resources: mem_mb=9193, mem_mib=8768, disk_mb=1000, disk_mib=954, tmpdir=<TBD>, slurm_partition=cpu, nodes=1, cpus_per_task=1, runtime=60

No SLURM account given, trying to guess.
Guessed SLURM account: (null)
WorkflowError:
SLURM job submission failed. The error message was /bin/sh: 1: Syntax error: "(" unexpected

And if you set it to some placeholder value such as "default", this happens:

rule xxxxxxxxxxxx:
    output: xxxxxxxxxxxxxx
    jobid: 4
    reason: Missing output files: xxxxxxxxxxxxxx
    wildcards: dataset=xxxxxxxxxxxxxx, sample=xxxxxxxxxxxxxx
    resources: mem_mb=9193, mem_mib=8768, disk_mb=1000, disk_mib=954, tmpdir=<TBD>, slurm_partition=cpu, nodes=1, cpus_per_task=1, runtime=60

WorkflowError:
Unable to test the validity of the given or guessed SLURM account 'default' with sacctmgr: You are not running a supported accounting_storage plugin
(accounting_storage/filetxt).
Only 'accounting_storage/slurmdbd' and 'accounting_storage/mysql' are supported.
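
The first failure above is a shell quoting problem: the guessed value (null) is pasted unquoted into the sbatch command line, and /bin/sh rejects the bare parenthesis. A minimal sketch of how an optional account could be handled follows. It is not the plugin's actual code; the function name `build_sbatch_call`, its parameters, and the decision to treat "(null)" as "no account" are assumptions made for illustration.

```python
import shlex
from typing import Optional

# Values treated as "no account" in this sketch (an assumption, not plugin behavior).
NO_ACCOUNT_VALUES = {"", "(null)"}


def build_sbatch_call(
    script: str,
    account: Optional[str] = None,
    partition: Optional[str] = None,
) -> str:
    """Assemble an sbatch command line, omitting -A when no usable account is given."""
    parts = ["sbatch"]
    if account is not None and account.strip() not in NO_ACCOUNT_VALUES:
        # shlex.quote keeps characters such as "(" from being parsed by the shell.
        parts += ["-A", shlex.quote(account.strip())]
    if partition:
        parts += ["-p", shlex.quote(partition)]
    parts.append(shlex.quote(script))
    return " ".join(parts)


if __name__ == "__main__":
    print(build_sbatch_call("job.sh", account="(null)", partition="cpu"))
```

With this approach a missing or placeholder account simply leaves -A out of the call, and any account that is passed reaches the shell safely quoted.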

cmeesters commented 7 months ago

I just taught a course on a cluster with a default (that is, no) account. It worked like a charm.

So, I wonder, what is the output of sacct -nu $USER -o Account%256 for you?

xapple commented 7 months ago

Here is the output:

                                                                                                                                                                                                                                                          (null)
                                                                                                                                                                                                                                                          (null)
                                                                                                                                                                                                                                                          (null)
                                                                                                                                                                                                                                                          (null)
                                                                                                                                                                                                                                                          (null)

This pattern repeats for 702 lines.

cmeesters commented 6 months ago

That is really weird. But I might be able to work with that, under the assumption that it is “only” some sort of whitespace.

cmeesters commented 6 months ago

I just copy-pasted your literal feedback (thank you!) into an editor. It indeed contains a hidden (null), which is interpreted by Python. If it were just whitespace (an empty string, since the line feeds get stripped away before being passed to Python), it would work already.

Now, I wonder: What is this (null)-character exactly, and why is it produced in the first place? Please run this snippet and attach the resulting file.

#!/usr/bin/env python3

import os
import subprocess

cmd = f'sacct -nu "{os.environ["USER"]}" -o Account%256 | head -n1'

sacct_out = subprocess.check_output(
                cmd, shell=True, text=True, stderr=subprocess.PIPE
            )

with open('sacct.out', 'wb') as outfile:
    outfile.write(sacct_out)

PS What is the output of the locale command for you?

xapple commented 6 months ago

The string (null) is "hidden" in the output above because the command specifies a field width of 256 characters, and sacct right-aligns by default. With sacct -nu $USER -o Account%16 it would be easier to read.
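
The padding effect is easy to reproduce in plain Python, and it also shows how such output could be reduced to a usable account name. The following is only a sketch under assumptions: `parse_sacct_account` is a hypothetical helper, not a function of the plugin, and treating the literal string "(null)" as "no account" is a choice made here.

```python
from typing import Optional


def parse_sacct_account(sacct_output: str) -> Optional[str]:
    """Return the first real account name in sacct output, or None if there is none."""
    for line in sacct_output.splitlines():
        candidate = line.strip()  # drops the padding added by the %256 field width
        if candidate and candidate != "(null)":
            return candidate
    return None


# Simulate one line of `sacct -o Account%256`: right-aligned in a 256-character field.
padded_null = f"{'(null)':>256}\n"
padded_real = f"{'myproject':>256}\n"  # 'myproject' is a made-up account name

if __name__ == "__main__":
    print(repr(padded_null.strip()))
    print(parse_sacct_account(padded_null * 3))
    print(parse_sacct_account(padded_null + padded_real))
```

Stripping alone is not enough here, because what remains after stripping is the non-empty string "(null)" rather than an empty string.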

The script you provided raises a TypeError, because the file is opened in bytes mode and not text mode. Also, the .out extension is forbidden for GitHub issue uploads. After fixing both, the output is attached: a single line with 256 whitespace characters and the string (null) at the end, as expected.

sacct.txt

cmeesters commented 6 months ago

> The script you provided raises a TypeError

My bad, I should have included encode(). However, please test the code in PR #81. I am not sure whether it will work for you. It should not break any other code, though.

If it does not work, we need to check more details.
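
For reference, the TypeError discussed above arises because subprocess.check_output(..., text=True) returns a str, while a file opened with 'wb' only accepts bytes. A corrected variant of the diagnostic snippet might look like the sketch below. The helper name `dump_command_output` is introduced here for illustration, and the sketch opens the file in text mode rather than calling encode(); either fix works.

```python
#!/usr/bin/env python3

import os
import subprocess


def dump_command_output(cmd: str, path: str) -> str:
    """Run cmd through the shell and write its stdout to path as text."""
    output = subprocess.check_output(
        cmd, shell=True, text=True, stderr=subprocess.PIPE
    )
    # text=True yields str, so the file is opened in text mode ("w"), not "wb".
    with open(path, "w") as outfile:
        outfile.write(output)
    return output


if __name__ == "__main__":
    user = os.environ.get("USER", "")
    # A .txt extension is used because GitHub rejects .out attachments.
    dump_command_output(f'sacct -nu "{user}" -o Account%256 | head -n1', "sacct.txt")
```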

xapple commented 6 months ago

Thanks for looking into this.

I tested this unmerged pull request like this:

conda activate myproj
conda uninstall -y snakemake-executor-plugin-slurm
git clone https://github.com/snakemake/snakemake-executor-plugin-slurm.git
cd snakemake-executor-plugin-slurm
git fetch origin pull/81/head:pr-81
git checkout pr-81
pip install .

And at the end:

pip uninstall snakemake-executor-plugin-slurm
conda install -y snakemake-executor-plugin-slurm
conda deactivate

The result is the following being printed:

 No SLURM account given, trying to guess.
 Unable to guess SLURM account. Trying to proceed without.

So it worked perfectly 👌🏻