CoBrALab / qbatch

The Unlicense

Handle walltime units uniformly. #209

Open gdevenyi opened 4 years ago

gdevenyi commented 4 years ago

SLURM decided to be brain-dead and defaults to minutes for an unformatted numerical --time value, instead of seconds like SGE/PBS/Torque.

This means we need to be explicit about the supported input formats for walltime.

Torque says: http://docs.adaptivecomputing.com/torque/3-0-5/2.1jobsubmission.php

walltime    seconds, or [[HH:]MM:]SS    Maximum amount of real time during which the job can be in the running state.

SGE says: https://linux.die.net/man/1/sge_types

time_specifier

A time specifier either consists of a positive decimal, hexadecimal or octal integer constant, in which case the value is interpreted to be in seconds, or is built by 3 decimal integer numbers separated by colon signs where the first number counts the hours, the second the minutes and the third the seconds. If a number would be zero it can be left out but the separating colon must remain (e.g. 1:0:1 = 1::1 means 1 hours and 1 second).

SLURM says: https://slurm.schedmd.com/sbatch.html

-t, --time=<time>
Set a limit on the total run time of the job allocation. If the requested time limit exceeds the partition's time limit, the job will be left in a PENDING state (possibly indefinitely). The default time limit is the partition's default time limit. When the time limit is reached, each task in each job step is sent SIGTERM followed by SIGKILL. The interval between signals is specified by the Slurm configuration parameter KillWait. The OverTimeLimit configuration parameter may permit the job to run longer than scheduled. Time resolution is one minute and second values are rounded up to the next minute.
A time limit of zero requests that no time limit be imposed. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".
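
To make the mismatch in the three excerpts above concrete, here is a minimal Python sketch of how the same bare number is interpreted by each scheduler. The function name and the scheduler labels are hypothetical, chosen only for illustration; the units come from the documentation quoted above.

```python
# Seconds represented by one unit of a bare (unformatted) walltime number,
# per the Torque, SGE and SLURM documentation excerpts quoted above.
# The dictionary keys are illustrative labels, not qbatch identifiers.
BARE_NUMBER_UNIT_SECONDS = {"torque": 1, "sge": 1, "slurm": 60}


def bare_walltime_seconds(value, system):
    """Return how many seconds a bare walltime number means on `system`."""
    return int(value) * BARE_NUMBER_UNIT_SECONDS[system]


for system in ("torque", "sge", "slurm"):
    print(system, bare_walltime_seconds("3600", system))
```

A walltime of `3600` is one hour on Torque and SGE, but 216000 seconds (60 hours) on SLURM.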

gdevenyi commented 4 years ago

Proposed solution

  1. Take the input and check whether it is a valid format for the specified job submission system; if so, pass it through unchanged.
  2. Also support individual "h", "m" and "s" suffixes on numbers.
  3. If a bare number is provided, assume seconds. For SLURM, convert it to minutes and print a warning. Document this behavior explicitly.
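
The proposal could be sketched roughly as follows. This is not qbatch's actual implementation: `normalize_walltime`, the scheduler labels (`"slurm"`, `"sge"`, `"pbs"`) and the warning text are all assumptions made for illustration, and the per-scheduler format validation from step 1 is omitted, so anything that is not a bare or suffixed number is simply passed through.

```python
import re
import warnings

# Seconds per supported suffix (step 2 of the proposal).
_SUFFIX_SECONDS = {"h": 3600, "m": 60, "s": 1}
_SUFFIXED = re.compile(r"(\d+)([hms])")


def normalize_walltime(value, system):
    """Return a walltime string for `system` (hypothetical helper).

    Bare numbers and single-suffix values such as "2h" are treated as a
    duration in seconds and re-emitted in the scheduler's native unit.
    Everything else is passed through unchanged, without validation.
    """
    value = value.strip()
    seconds = None

    match = _SUFFIXED.fullmatch(value)
    if match:
        # Step 2: number with an "h", "m" or "s" suffix.
        seconds = int(match.group(1)) * _SUFFIX_SECONDS[match.group(2)]
    elif value.isdigit():
        # Step 3: a bare number is assumed to be seconds.
        seconds = int(value)
        if system == "slurm":
            warnings.warn(
                "bare walltime is interpreted as seconds and converted "
                "to minutes for SLURM"
            )

    if seconds is None:
        # Step 1: formatted input (e.g. "1:00:00") goes through as-is.
        return value

    if system == "slurm":
        # SLURM's bare-number unit is minutes; round up to a whole minute.
        return str(-(-seconds // 60))
    return str(seconds)
```

For example, `normalize_walltime("3600", "slurm")` gives `"60"` and emits a warning, while `normalize_walltime("3600", "sge")` gives `"3600"`. Rounding up mirrors SLURM's documented one-minute resolution, so `"90s"` becomes `"2"` on SLURM.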