snakemake / snakemake-executor-plugin-slurm

A Snakemake executor plugin for submitting jobs to a SLURM cluster

sacctmgr error when running in job context #164

Open jprnz opened 2 weeks ago

jprnz commented 2 weeks ago

Two things prevent the use of this plugin on our cluster:

  1. The slurm.cfg file used for the cluster is in a non-standard location
  2. Our admins prefer that we not run anything on the login node

To make many of the common SLURM tools work, users of our cluster need to have SLURM_CONFIG set in their environment. Since the plugin wipes all environment variables prefixed with SLURM_ whenever it sees SLURM_JOB_ID, sacctmgr and sinfo exit with an error:

WorkflowError:
Unable to test the validity of the given or guessed SLURM account 'xyz' with sacctmgr: sacctmgr: error: resolve_ctls_from_dns_srv: res_nsearch error: Unknown host
sacctmgr: error: fetch_config: DNS SRV lookup failed
sacctmgr: error: _establish_config_source: failed to fetch config
sacctmgr: fatal: Could not establish a configuration source

This seems like an unintended consequence and could easily be fixed by not removing SLURM_CONFIG. The issue can be avoided by running:

unset SLURM_JOB_ID
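
For illustration, here is a minimal sketch of what "not removing SLURM_CONFIG" could look like: strip the SLURM_* job-context variables before calling the query tools, but keep the variables that only point at the cluster configuration. The kept names and the sacctmgr invocation are assumptions for this sketch, not the plugin's actual code.

```python
import os
import subprocess

# Variables that only locate the cluster configuration and should be safe to keep.
# SLURM_CONFIG is the name used on our cluster; SLURM_CONF is the name from the
# SchedMD documentation. Treat this set as an assumption.
KEEP = {"SLURM_CONF", "SLURM_CONFIG"}


def slurm_env_without_job_context() -> dict:
    """Copy of the environment with SLURM_* job variables removed,
    except for the configuration-location variables in KEEP."""
    return {
        k: v
        for k, v in os.environ.items()
        if not k.startswith("SLURM_") or k in KEEP
    }


# Example: query accounts without inheriting the surrounding job's context.
result = subprocess.run(
    ["sacctmgr", "-nP", "show", "assoc", "format=account"],
    env=slurm_env_without_job_context(),
    capture_output=True,
    text=True,
)
print(result.stdout)
```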

Personally, I think it would be nice to be able to set slurm_account / slurm_partition via environment variables (as srun / sbatch do); to me, this seems like a sensible way to determine a default value. A rough sketch of what I mean follows.
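
A minimal sketch of that idea, assuming the variable names that srun (SLURM_ACCOUNT, SLURM_PARTITION) and sbatch (SBATCH_ACCOUNT, SBATCH_PARTITION) honour; the precedence shown is my own suggestion, not existing plugin behaviour:

```python
import os


def default_slurm_account() -> str | None:
    # Fall back to the env vars srun/sbatch themselves read when no
    # account is given explicitly (assumed precedence).
    return os.environ.get("SLURM_ACCOUNT") or os.environ.get("SBATCH_ACCOUNT")


def default_slurm_partition() -> str | None:
    return os.environ.get("SLURM_PARTITION") or os.environ.get("SBATCH_PARTITION")
```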

Thanks for your work and continuing to help the community!

$ snakemake --version
8.25.1

$ mamba list | grep "snakemake-executor-plugin-slurm"
snakemake-executor-plugin-slurm 0.11.1             pyhdfd78af_0    bioconda
snakemake-executor-plugin-slurm-jobstep 0.2.1              pyhdfd78af_0    bioconda

$ sinfo --version
slurm 23.02.7
cmeesters commented 2 weeks ago

Thanks for reporting: I will look into it. Thankfully, you provided the solution along with the report. Meanwhile, I think that unsetting the SLURM variables will not help at all: according to SchedMD's documentation, the built-in variables are always exported. The solution must be to set the job parameters explicitly, always. Could you test code over a few iterations? I'm afraid I might only get to it on Thursday or Friday, though.
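
A minimal sketch of what "setting job parameters explicitly, always" could look like: account and partition go on the sbatch command line instead of being inherited from SLURM_* variables. The flags are standard sbatch options; how the plugin actually assembles its call may differ.

```python
def build_sbatch_call(account: str, partition: str, jobscript: str) -> list[str]:
    # Pass account and partition explicitly so the submission does not
    # depend on environment variables inherited from a surrounding job.
    return [
        "sbatch",
        f"--account={account}",
        f"--partition={partition}",
        jobscript,
    ]


# e.g. build_sbatch_call("xyz", "somepartition", "jobscript.sh")
```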

NB:

Our admins prefer ...

Yes, I have gotten this nonsense a lot. As if it hurt anyone when someone produces a plot within a few seconds on a login node. (Or runs a workflow manager, which consumes about as much CPU power over the course of a workflow.)