cpnr / computing


Update squeue_summary.sh #23

Closed jhgoh closed 6 months ago

jhgoh commented 6 months ago

Fixed squeue_summary.sh, which was found to misbehave when a Slurm job script is submitted with arguments.

Problem: when the run script is followed by arguments, as in the lines below, awk's $6 and $7 no longer hold the partition name and the hostname, respectively.

phortion RUNNING positron_5.2MeV 1:02:20 /users/phortion/test/run_sim.sh 1000 5.2 normal ho-oh
phortion RUNNING positron_5.2MeV 1:02:20 /users/phortion/test/run_sim.sh 1000 5.2 normal ho-oh
phortion RUNNING positron_5.2MeV 1:02:20 /users/phortion/test/run_sim.sh 1000 5.2 normal ho-oh
djkim3583 RUNNING p_1 51:54 ./pbs/posi_1.sh normal ho-oh
djkim3583 RUNNING p_2 51:54 ./pbs/posi_2.sh normal ho-oh
djkim3583 RUNNING p_3 51:51 ./pbs/posi_3.sh normal ho-oh

As a result, jobs are misclassified as follows:

$ squeue_summary.sh 
=== Number of Jobs by User ===
   djkim3583: 200 total, 200 running,   0 pending,   0 hold,   0 completing
    phortion:  20 total,  20 running,   0 pending,   0 hold,   0 completing

=== Number of Jobs by Partition ===
        1000:  20 total,  20 running,   0 pending,   0 hold,   0 completing
      normal: 200 total, 200 running,   0 pending,   0 hold,   0 completing

=== Number of Jobs by Host ===
      raikou: 128 total, 128 running,   0 pending,   0 hold,   0 completing
     suicune:  38 total,  38 running,   0 pending,   0 hold,   0 completing
         4.2:  10 total,  10 running,   0 pending,   0 hold,   0 completing
       ho-oh:  34 total,  34 running,   0 pending,   0 hold,   0 completing
         5.2:  10 total,  10 running,   0 pending,   0 hold,   0 completing

Fix: changed the awk field references to count from the right end of the line using NF, so the partition is read from $(NF-1) and the host from $NF.

Before:

  ++nJobsUser[$1];
  ++nJobsPart[$6];
  if ( !match($7, /\(.*\)/) )
    ++nJobsHost[$7];

After:

  ++nJobsUser[$1];
  ++nJobsPart[$(NF-1)];
  if ( !match($NF, /\(.*\)/) )
    ++nJobsHost[$NF];
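
The difference between the two field-selection schemes can be reproduced in isolation. The following is a minimal sketch, not part of squeue_summary.sh: the variable `sample` and the functions `by_fixed` and `by_nf` are hypothetical names introduced for illustration, and the two input lines are copied from the squeue output quoted above.

```shell
#!/bin/sh
# Hypothetical sample input: two lines taken from the squeue output above.
# The first job's command carries two arguments (1000 and 5.2).
sample='phortion RUNNING positron_5.2MeV 1:02:20 /users/phortion/test/run_sim.sh 1000 5.2 normal ho-oh
djkim3583 RUNNING p_1 51:54 ./pbs/posi_1.sh normal ho-oh'

# Old scheme: fixed positions counted from the left.
# Extra command arguments shift the partition and host out of $6 and $7.
by_fixed() {
  printf '%s\n' "$sample" | awk '{ print $6, $7 }'
}

# New scheme: positions counted from the right with NF (number of fields).
# $(NF-1) is the second-to-last field, $NF is the last field.
by_nf() {
  printf '%s\n' "$sample" | awk '{ print $(NF-1), $NF }'
}

echo "fixed positions:"
by_fixed
echo "counted from the right:"
by_nf
```

With fixed positions, the first line yields the script arguments (1000 and 5.2) instead of the partition and host, while counting from the right yields normal and ho-oh for both lines. This relies on the partition and host being the last two whitespace-separated fields, as they are in the output shown in this PR.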