OSC / ondemand

Supercomputing. Seamlessly. Open, Interactive HPC Via the Web
https://openondemand.org/

Duplicate jobs in 'active jobs'. #3668

gcpmendez opened this issue 1 month ago

gcpmendez commented 1 month ago

We noticed that in the Jobs -> Active Jobs tab every job shows up twice, once per cluster, because both cluster definitions share the same Slurm configuration and Slurm itself is configured with a single cluster:

$ _cpu1r
$ sacctmgr show cluster
   Cluster     ControlHost  ControlPort   RPC     Share GrpJobs       GrpTRES GrpSubmit MaxJobs       MaxTRES MaxSubmit     MaxWall                  QOS   Def QOS 
---------- --------------- ------------ ----- --------- ------- ------------- --------- ------- ------------- --------- ----------- -------------------- --------- 
     teide      10.0.22.24         6817 10240         1                                                                                           normal    

and in OnDemand we can verify the cluster configurations:

$ ssh root@ondemand.hpc.iter.es
$ cd /etc/ood/config/clusters.d
$ cat anaga.yml
---
v2:
  metadata:
    title: "Anaga"
  login:
    host: "10.5.22.101"
  job:
    adapter: "slurm"
    cluster: "teide"
    bin: "/usr/bin"
    conf: "/etc/slurm/slurm.conf"
    #bin_overrides:
      # sbatch: "/usr/local/bin/sbatch"
      # squeue: "/usr/bin/squeue"
      # scontrol: "/usr/bin/scontrol"
      # scancel: ""
    copy_environment: false
    partitions: ["gpu"]
  batch_connect:
    basic:
      script_wrapper: |
        ml purge
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"
    vnc:
      script_wrapper: |
        ml purge
        ml load TurboVNC
        #export PATH="/usr/local/turbovnc/bin:$PATH"
        #export WEBSOCKIFY_CMD="/usr/local/websockify/run"
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"
$ cat teide.yml
---
v2:
  metadata:
    title: "Teide"
  login:
    host: "10.5.22.100"
  job:
    adapter: "slurm"
    cluster: "teide"
    bin: "/usr/bin"
    conf: "/etc/slurm/slurm.conf"
    #bin_overrides:
      # sbatch: "/usr/local/bin/sbatch"
      # squeue: "/usr/bin/squeue"
      # scontrol: "/usr/bin/scontrol"
      # scancel: ""
    copy_environment: false
  batch_connect:
    basic:
      script_wrapper: |
        ml purge
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"
    vnc:
      script_wrapper: |
        ml purge
        ml load TurboVNC
        #export PATH="/usr/local/turbovnc/bin:$PATH"
        #export WEBSOCKIFY_CMD="/usr/local/websockify/run"
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"
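
Since both definitions point at the same Slurm controller (the same cluster: "teide" and the same slurm.conf), each cluster tab ends up listing the same jobs. Just to illustrate the situation (this is not necessarily the exact call OnDemand makes), listing the queue from a login node with the partition column shows the single job set both tabs draw from, and the partition is the only field that could tell the two "virtual" clusters apart:

$ squeue --format="%i %P %u %T %j"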

According to the following thread, https://discourse.openondemand.org/t/configure-partitions-as-clusters/701/2, we could try to create an initializer to filter the jobs.

Ideally we would filter the jobs by partition: assign jobs in the "gpu" partition to the "anaga" cluster and all remaining jobs to the "teide" cluster.

$ _cpu1r
$ scontrol show partition | grep PartitionName
PartitionName=main
PartitionName=batch
PartitionName=express
PartitionName=long
PartitionName=gpu
PartitionName=fatnodes
PartitionName=ondemand
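
Besides the initializer idea, another possible workaround, sketched here only and not tested, would be to use the bin_overrides section that is already commented out in our cluster files and point squeue at a small wrapper that restricts the listing to the gpu partition for the "anaga" cluster. The path /usr/local/bin/squeue_gpu below is just a hypothetical name:

#!/bin/bash
# Hypothetical squeue wrapper for the "anaga" cluster definition: forward
# whatever arguments OnDemand passes, but limit the output to the gpu
# partition so Active Jobs under Anaga only shows GPU jobs.
exec /usr/bin/squeue --partition=gpu "$@"

It would be referenced from anaga.yml with bin_overrides: squeue: "/usr/local/bin/squeue_gpu", and teide.yml could use the inverse wrapper with --partition=main,batch,express,long,fatnodes,ondemand (squeue accepts a comma-separated partition list). We have not checked whether this interferes with the other places OnDemand calls squeue, for example status checks of interactive sessions.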

Any help is welcome so that we can correctly view the jobs associated with each virtual cluster while keeping a single cluster configured in Slurm. Thanks in advance.

johrstrom commented 1 month ago

I'm not sure what the solution is here. Sure, you can define an initializer to filter based on the cluster if you don't already have one, but if you've defined 2 clusters, OnDemand will act as if they're actually two clusters.

I'm not aware of the virtual cluster pattern here where teide is actually just a partition on anaga (or vice versa), but I guess I'd ask if you actually need the two separate cluster definitions.