CoBrALab / qbatch

Add transformation of memory input parameters #145

Open gdevenyi opened 6 years ago

gdevenyi commented 6 years ago

As per https://github.com/Mouse-Imaging-Centre/pydpiper/issues/350

gdevenyi commented 6 years ago

Slurm:

Specify the real memory required per node. Default units are megabytes unless the 
SchedulerParameters configuration parameter includes the "default_gbytes" option 
for gigabytes. Different units can be specified using the suffix [K|M|G|T]. Default 
value is DefMemPerNode and the maximum value is MaxMemPerNode. If configured, both 
parameters can be seen using the scontrol show config command. This parameter would 
generally be used if whole nodes are allocated to jobs (SelectType=select/linear). 
Also see --mem-per-cpu. --mem and --mem-per-cpu are mutually exclusive.

SGE:

Memory specifiers are positive decimal, hexadecimal or octal integer constants 
which may be followed by a multiplier letter. Valid multiplier letters are k, K, m,
 M, g, G, t, and T, where k means multiply the value by 1000, K multiply by 1024, m 
multiply by 1000×1000, M  multiply  by  1024×1024,  g multiply by 1000×1000×1000, G 
multiply by 1024×1024×1024, t multiply by 1000×1000×1000×1000, and T multiply by 
1024×1024×1024×1024.  If no multiplier is present, the value is just counted in 
bytes.  Whether memory values above the 32-bit limit are representable on 32-bit 
systems, even for disk space, is system-dependent.
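
To make the case sensitivity concrete, here is a minimal Python sketch of the SGE multiplier rule quoted above. `sge_to_bytes` is a hypothetical helper, not part of SGE or qbatch, and it only handles decimal integers (the hexadecimal and octal forms are left out for brevity).

```python
# Multipliers as described in the SGE text quoted above:
# lowercase letters are powers of 1000, uppercase letters are powers of 1024.
SGE_MULTIPLIERS = {
    "k": 1000, "K": 1024,
    "m": 1000**2, "M": 1024**2,
    "g": 1000**3, "G": 1024**3,
    "t": 1000**4, "T": 1024**4,
}


def sge_to_bytes(spec):
    """Convert an SGE-style memory specifier (decimal only) to bytes.

    Hypothetical helper for illustration; not an SGE or qbatch function.
    """
    if spec and spec[-1] in SGE_MULTIPLIERS:
        return int(spec[:-1]) * SGE_MULTIPLIERS[spec[-1]]
    return int(spec)  # no multiplier letter: the value is counted in bytes


print(sge_to_bytes("4g"))  # 4000000000
print(sge_to_bytes("4G"))  # 4294967296
```

The same digit with `g` and `G` differs by roughly 7%, which is why the case of the suffix cannot simply be normalized away for SGE.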

PBS

       size    specifies the maximum amount in terms of bytes or words. It is expressed in the form integer[suffix]. The suffix is a multiplier defined in the following table. The size of a word is the word size on the execution host.

                b or  w    bytes or words.

               kb or kw    Kilo (1024) bytes or words.

               mb or mw    Mega (1,048,576) bytes or words.

               gb or gw    Giga (1,073,741,824) bytes or words.

LSF

Scaling the units for resource usage limits
The default unit for the following resource usage limits is KB:

Core limit (-C and CORELIMIT)
Memory limit (-M and MEMLIMIT)
Stack limit (-S and STACKLIMIT)
Swap limit (-v and SWAPLIMIT)
This default may be too small for some environments that make use of very large resource usage limits, for example, GB or TB.

LSF_UNIT_FOR_LIMITS in lsf.conf specifies larger units for the resource usage limits with default unit of KB.

The unit for the resource usage limit can be one of:

KB (kilobytes)
MB (megabytes)
GB (gigabytes)
TB (terabytes)
PB (petabytes)
EB (exabytes)
LSF_UNIT_FOR_LIMITS applies cluster-wide to limits at the job level (bsub), queue level (lsb.queues), and application level (lsb.applications).

gdevenyi commented 6 years ago

Lovely inconsistencies, especially Slurm which makes it dependent on a configuration parameter.

gdevenyi commented 6 years ago

Method:

  1. Detect any suffix on the memory input.
  2. If there is no suffix, PBS and SGE are okay; print a warning to stderr for Slurm, where the unit of a bare number depends on the cluster configuration.
  3. Check that the input is valid for parsing.
  4. Depending on cluster type, munge the memory specification into an acceptable format, and warn to stderr if it was changed.

bcdarwin commented 6 years ago

For Slurm, why not just explicitly set the suffix?

bcdarwin commented 6 years ago

I guess you're talking about inputs ... do you prefer specifications with no suffix to behave consistently across clusters or to mimic the local cluster?

gdevenyi commented 6 years ago

@bcdarwin good question.

Given the design features of qbatch it makes more sense to enforce a units specification and then transform it to the cluster specification, therefore:

  1. Check for usable suffix
  2. Depending on cluster type, munge memory specification to acceptable format.
gdevenyi commented 6 years ago

Usable suffixes

k, m, g, t, p and K, M, G, T, P, each with or without a trailing B.

Always assume powers of 1024 (this matters for SGE, where the case of the suffix selects 1000 or 1024).
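
A minimal sketch of that suffix check in Python, assuming integer values only and rejecting input without a unit, since the plan at this point in the thread is to enforce a units specification. `MEMORY_RE` and `memory_to_bytes` are hypothetical names, not existing qbatch code.

```python
import re

# An integer, then one of k/m/g/t/p in either case, then an optional B.
MEMORY_RE = re.compile(r"^(\d+)([kmgtp])b?$", re.IGNORECASE)

# Exponent of 1024 for each unit letter (always 1024-based, regardless of case).
POWERS = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5}


def memory_to_bytes(spec):
    """Parse strings such as '4G', '4gb' or '512MB' into a byte count.

    Hypothetical helper for illustration; raises ValueError when the
    suffix is missing or not one of the usable suffixes.
    """
    match = MEMORY_RE.match(spec.strip())
    if match is None:
        raise ValueError(f"unusable memory specification: {spec!r}")
    number, unit = match.groups()
    return int(number) * 1024 ** POWERS[unit.lower()]


print(memory_to_bytes("4G"))     # 4294967296
print(memory_to_bytes("512mb"))  # 536870912
```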

gdevenyi commented 5 years ago

Doing some dev work and reading this thread over again, I'd make one adjustment. I think I would make qbatch always assume GB units if no units are specified, to provide a uniform interface.

gdevenyi commented 5 years ago

Update to pseudocode:

  1. Assume a value given on the command line without units is in GB.
  2. Check for acceptable suffixes [k,m,g,t,p,K,M,G,T,P], with or without B.
  3. Given QBATCH_SYSTEM, transform the value into a job submission with the correct suffix for that system. When no suffix was provided, add an explicit one so behaviour is consistent across platforms.
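
The three steps above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the qbatch implementation: `format_memory`, `MEMORY_RE`, `POWERS` and `MAX_UNIT` are hypothetical names, the system names "slurm", "sge" and "pbs" are assumed values for QBATCH_SYSTEM, and the largest unit per scheduler is taken only from the documentation quoted earlier in this thread. LSF is left out because its unit depends on LSF_UNIT_FOR_LIMITS.

```python
import re

# Integer, then optionally a unit letter (k/m/g/t/p, any case) with optional B.
MEMORY_RE = re.compile(r"^(\d+)(?:([kmgtp])b?)?$", re.IGNORECASE)
POWERS = {"K": 1, "M": 2, "G": 3, "T": 4, "P": 5}

# Assumption: largest unit listed in the scheduler docs quoted in this thread.
MAX_UNIT = {"slurm": "T", "sge": "T", "pbs": "G"}


def format_memory(spec, system):
    """Rewrite a user memory string into a scheduler-specific one."""
    if system not in MAX_UNIT:
        raise ValueError(f"unsupported system: {system!r}")
    match = MEMORY_RE.match(spec.strip())
    if match is None:
        raise ValueError(f"unusable memory specification: {spec!r}")
    number = int(match.group(1))
    unit = (match.group(2) or "G").upper()  # step 1: bare numbers are GB

    # Scale down (by powers of 1024) if the scheduler lacks this unit.
    largest = MAX_UNIT[system]
    if POWERS[unit] > POWERS[largest]:
        number *= 1024 ** (POWERS[unit] - POWERS[largest])
        unit = largest

    if system == "pbs":
        return f"{number}{unit.lower()}b"  # PBS style: kb, mb, gb
    return f"{number}{unit}"  # Slurm and SGE: one uppercase letter


print(format_memory("4", "slurm"))   # 4G
print(format_memory("4gb", "sge"))   # 4G
print(format_memory("2T", "pbs"))    # 2048gb
```

Always emitting an explicit suffix sidesteps the Slurm `default_gbytes` ambiguity, and emitting uppercase letters for SGE keeps the value 1024-based.
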
pipitone commented 5 years ago

I know it adds a dependency, but could consider using something like the module humanfriendly to parse units.

gdevenyi commented 5 years ago

hrm an interesting idea @pipitone. In that case we would be actually parsing the number and then recasting it. Sadly I think this adds even more complexity, as SGE can't accept ISO style proper capitalization + B that humanfriendly.format_size supports. In addition, humanfriendly also space-pads the output. I think direct munging the strings with a regex is probably a cleaner thing to do.

pipitone commented 5 years ago

> SGE can't accept ISO style proper capitalization

Colour me surprised ;-)

Perhaps it might save some effort on the parsing side to use an external library, no? And then save the regexes for converting the output from ISO to an SGE-friendly format.

I guess it’s that eternal battle between rolling your own vs adding an external dependency.
