OSC / osc-machete

High level interface to submitting and checking the status of batch jobs (currently OSC specific)
MIT License

Investigate support for job arrays #17

Open ericfranz opened 10 years ago

ericfranz commented 10 years ago

qsub offers a job array option to submit multiple jobs from a single script with a single qsub call. The jobs are parameterized using the PBS_ARRAYID environment variable. What is interesting about this is that the whole job array has a single job id.
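As a rough illustration of the above, here is a minimal Python sketch that builds (but does not run) a job array submission and shows how a sub-job might pick its parameter from PBS_ARRAYID. It assumes TORQUE's `qsub -t` index-range option as described in the linked documentation; the function names and the script name `sweep.sh` are hypothetical and not part of osc-machete.

```python
import os


def array_qsub_command(script, first, last):
    """Build the argv for submitting `script` as a TORQUE job array.

    Assumption: `-t first-last` requests one sub-job per index in the range.
    """
    if first < 0 or last < first:
        raise ValueError("expected 0 <= first <= last")
    return ["qsub", "-t", f"{first}-{last}", script]


def sweep_parameter(values, environ=os.environ):
    """Inside a sub-job: select this sub-job's parameter by its array index."""
    # Each sub-job of the array sees its own index in PBS_ARRAYID.
    index = int(environ["PBS_ARRAYID"])
    return values[index]


if __name__ == "__main__":
    # One qsub call covers the whole 10-point sweep.
    print(" ".join(array_qsub_command("sweep.sh", 0, 9)))
```

The command list could be handed to `subprocess.run` on a login node; it is only printed here so the sketch runs anywhere.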

For certain problem sets this could be very useful, for example in a multi-job workflow similar to:

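[image] https://cloud.githubusercontent.com/assets/512333/4135739/e78ac432-3378-11e4-8705-fdd673942fd7.png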

If we use qsub's job array for the middle stage (the parameter sweep) of the workflow, qsub treats that sweep as one job group with one job id, which means we can turn the above into this:

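[image] https://cloud.githubusercontent.com/assets/512333/4135768/28b15f2a-3379-11e4-8d9a-d497284f51d3.png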

It is much easier to set up job dependencies for this second form, since the final stage only has to depend on one job id. This could be another option.
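To make the dependency point concrete, here is a small Python sketch contrasting the two cases. It assumes TORQUE's `-W depend=afterok:...` syntax for individual jobs and `depend=afterokarray:...` for a job array; the job ids and the script name `post.sh` are made up for illustration, and the exact id format returned by a given scheduler may differ.

```python
def depend_on_jobs(job_ids):
    """Dependency string when every sweep job was submitted separately."""
    # Every individual job id has to be tracked and listed.
    return "depend=afterok:" + ":".join(job_ids)


def depend_on_array(array_id):
    """Dependency string when the sweep was submitted as one job array."""
    # Assumption: afterokarray waits for all sub-jobs of the array to succeed.
    return f"depend=afterokarray:{array_id}"


def post_stage_command(script, dependency):
    """Build the argv for the final stage, held until `dependency` is met."""
    return ["qsub", "-W", dependency, script]


if __name__ == "__main__":
    # Hypothetical ids: three separate sweep jobs versus one array.
    print(post_stage_command("post.sh", depend_on_jobs(["101", "102", "103"])))
    print(post_stage_command("post.sh", depend_on_array("104[]")))
```

With separate jobs the dependency string grows with the size of the sweep, while the array form stays a single id regardless of how many sub-jobs it contains.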

http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/2-jobs/multiJobSubmission.htm

dhudak-osc commented 10 years ago

Please check with Judy regarding job arrays. I think she had some issues using them in the past. I am in favor of using them, but want to make sure we have all the facts.

Thanks, Dave

ericfranz commented 10 years ago

Note: using job arrays is a refactoring question. Whether or not we use job arrays to implement parameter sweeps is an implementation detail and should not change the interface the developer uses to build them. See https://github.com/AweSim-OSC/osc-machete/issues/11 for this discussion.