TACC / launcher

A simple utility for executing multiple sequential or multi-threaded applications in a single multi-node batch job
MIT License

Using multiple nodes runs extra jobs #16

Closed: schristley closed this issue 7 years ago

schristley commented 7 years ago

I have a job with 4 nodes and 12 processes per node, but only 26 commands in the joblist file. Launcher seems to fill out the total process count (48 in this case) by running the last command multiple times.
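To make the expected behavior concrete, here is a minimal sketch of an interleaved assignment that leaves surplus tasks idle instead of repeating the last command. It is not taken from the launcher source: the function name `jobs_for_task` and the stride-by-task-count rule are assumptions made for illustration, and only the numbers 48 and 26 come from this report.

```python
def jobs_for_task(task_id, total_tasks, total_jobs):
    """Return the 1-based job numbers one task would run under interleaving.

    Hypothetical helper for illustration; not the launcher's own code.
    """
    # Task t starts at job t + 1 and strides by the number of tasks.
    # range() stops at total_jobs, so surplus tasks get an empty list.
    return list(range(task_id + 1, total_jobs + 1, total_tasks))


total_tasks, total_jobs = 48, 26  # values from the job summary in this report
assignment = {t: jobs_for_task(t, total_tasks, total_jobs) for t in range(total_tasks)}
idle = [t for t, jobs in assignment.items() if not jobs]
print(len(idle), "tasks have no work")  # prints: 22 tasks have no work
```

Under these assumptions tasks 0 through 25 each run one command and tasks 26 through 47 run nothing, so every joblist line runs exactly once.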

------------- SUMMARY ---------------
Number of hosts: 4
Working directory: /scratch/01114/vdj/vdj/job-524787072520547865-242ac11a-0001-007-repcalc_bcr_test
Processes per host: 12
Total processes: 48
Total jobs: 26
Scheduling method: interleaved


Launcher: Starting parallel tasks...
Launcher: Task 7 running job 8 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.26.igblast_db-pass.tab Ig human)
Launcher: Task 3 running job 4 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.03.igblast_db-pass.tab Ig human)
Launcher: Task 4 running job 5 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.25.igblast_db-pass.tab Ig human)
Launcher: Task 9 running job 10 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.19.igblast_db-pass.tab Ig human)
Launcher: Task 2 running job 3 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.20.igblast_db-pass.tab Ig human)
Launcher: Task 11 running job 12 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.27.igblast_db-pass.tab Ig human)
Launcher: Task 10 running job 11 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.28.igblast_db-pass.tab Ig human)
Launcher: Task 5 running job 6 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.21.igblast_db-pass.tab Ig human)
Launcher: Task 8 running job 9 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.15.igblast_db-pass.tab Ig human)
Launcher: Task 6 running job 7 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.17.igblast_db-pass.tab Ig human)
Launcher: Task 1 running job 2 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.22.igblast_db-pass.tab Ig human)
Launcher: Task 0 running job 1 on c548-704.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.23.igblast_db-pass.tab Ig human)
Launcher: Task 22 running job 23 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.16.igblast_db-pass.tab Ig human)
Launcher: Task 13 running job 14 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.05.igblast_db-pass.tab Ig human)
Launcher: Task 12 running job 13 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.02.igblast_db-pass.tab Ig human)
Launcher: Task 21 running job 22 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.13.igblast_db-pass.tab Ig human)
Launcher: Task 23 running job 24 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.10.igblast_db-pass.tab Ig human)
Launcher: Task 14 running job 15 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.08.igblast_db-pass.tab Ig human)
Launcher: Task 20 running job 21 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.12.igblast_db-pass.tab Ig human)
Launcher: Task 17 running job 18 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.07.igblast_db-pass.tab Ig human)
Launcher: Task 18 running job 19 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.31.igblast_db-pass.tab Ig human)
Launcher: Task 15 running job 16 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.09.igblast_db-pass.tab Ig human)
Launcher: Task 16 running job 17 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.06.igblast_db-pass.tab Ig human)
Launcher: Task 19 running job 20 on c548-801.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.30.igblast_db-pass.tab Ig human)
Launcher: Task 28 running job 29 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 26 running job 27 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 27 running job 28 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 34 running job 35 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 29 running job 30 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 30 running job 31 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 33 running job 34 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 25 running job 26 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 24 running job 25 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.11.igblast_db-pass.tab Ig human)
Launcher: Task 35 running job 36 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 32 running job 33 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 31 running job 32 on c548-804.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 37 running job 38 on c548-901.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 36 running job 37 on c548-901.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 47 running job 48 on c548-901.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 39 running job 40 on c548-901.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 42 running job 43 on c548-901.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/production/R_libs" && bash ./do_clones.sh B1.01.igblast_db-pass.tab Ig human)
Launcher: Task 45 running job 46 on c548-901.stampede.tacc.utexas.edu (export VDJ_DB_ROOT="/work/01114/vdj/common/igblast-db/db/10_05_2016/" && export R_LIBS="/work/01114/vdj/stampede/producti
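For anyone checking their own output for the same symptom, a rough sketch of a log check follows. It assumes the `Launcher: Task N running job M on HOST` line format seen in the output above; the function name `jobs_beyond_limit` is made up for this example, and the sample lines abbreviate the command text to `(...)`.

```python
import re

# Pattern for the task lines shown in the launcher output above.
TASK_LINE = re.compile(r"Launcher: Task (\d+) running job (\d+) on (\S+)")


def jobs_beyond_limit(log_text, total_jobs):
    """Return (task, job, host) tuples whose job number exceeds total_jobs."""
    extra = []
    for match in TASK_LINE.finditer(log_text):
        task, job, host = int(match.group(1)), int(match.group(2)), match.group(3)
        if job > total_jobs:
            extra.append((task, job, host))
    return extra


# Abbreviated sample; the real lines carry the full command in parentheses.
sample = (
    "Launcher: Task 24 running job 25 on c548-804.stampede.tacc.utexas.edu (...)\n"
    "Launcher: Task 25 running job 26 on c548-804.stampede.tacc.utexas.edu (...)\n"
    "Launcher: Task 26 running job 27 on c548-804.stampede.tacc.utexas.edu (...)\n"
)
print(jobs_beyond_limit(sample, total_jobs=26))
# prints: [(26, 27, 'c548-804.stampede.tacc.utexas.edu')]
```

Any entry returned here is a task that was handed a job number past the end of the joblist, which is what the output above shows for jobs 27 and up.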

lwilson commented 7 years ago

I just pushed a fix for this (https://github.com/TACC/launcher/commit/ddd9d284225d23234ec2112f0d311e1849500825). Try doing a pull and let me know if the problem is still occurring.

schristley commented 7 years ago

yep, seems to work!