-
Version: 6725424
Environment: ha2-controllers
To reproduce:
On a working HA cluster, unplug the power from the active controller and plug it back in after the other one has taken over (this went smoot…
-
We are running a canu job with
1. scheduling selectType=select/cons_res
```
# SCHEDULING
#SchedulerType=sched/backfill
#SchedulerAuth=
#SchedulerPort=
#SchedulerRootFilter=
#SelectType=sele…
-
Hi,
Using the command
```
docker run -it -h ernie giovtorres/docker-centos7-slurm:17.02.9
```
the program 'slurmctld' is not running:
```
[root@ernie /]# supervisorctl status
munged …
-
I'm using the slurm container for various tests and would like to monitor the status of jobs using the sacct command. I fire up the container:
`
docker run -it -h ernie giovtorres/docker-centos7-s…
-
I am trying to bootstrap a SLURM cluster in AWS with the latest master branch of elasticluster (`1.2.0-567-g22f2499`) and the following configuration:
```
[cloud/amazon-eu-west-1]
provider=ec2_boto
…
-
As of last week, when creating a new cluster, the scripts run for a while, but at some point the SSH connection stops working and the cluster fails.
The full log is attached as well as the config f…
-
Available at: http://slurm.schedmd.com/download.html
-
Job id 435266 on the test server ran for 13 hours without finishing. When I looked in the sandbox, it looked as if it had finished, but Slurm still had it in the running state. I ran `bpsh 1 scontrol …
-
Slurm cluster down or not allowing new users:
./create_slurm_account.sh test3
sacctmgr: error: slurmdbd: Sending DbdInit msg: Unspecified error
sacctmgr: error: Problem talking to the database: …
-
After rebuilding the cluster on CentOS 7, we found that some jobs were appearing with NODE_FAIL state, then getting restarted by Slurm on another node. More details are recorded on issue #569.
The pl…