papaemmelab / toil_container

:whale: Toil + Docker and Singularity.
MIT License

Reduce queries to bparams and lsfadmin #33

Closed juanesarango closed 2 years ago

juanesarango commented 2 years ago

Problem: too many unnecessary queries to LSF

To specify the memory requirements for every job, toil.lsf gets the default LSF units from the LSF configuration. It does this by querying bparams, then falling back to lsfadmin and the LSF config files. This lookup happens twice per job: once for -R select[mem>{memory}] and once for -R rusage[mem={memory}]. Additionally, it tries to get the per-core reservation configuration through the same queries to bparams, lsfadmin and the LSF config files.

Basically, for every job, there's a minimum of 3 queries to bparams and more if the setting is not found there but in lsfadmin or a config file. So if we have 4K jobs running, we're doing at least 12K queries to lsf to get the default lsf units and the per core reservation config, which is the same for every job in the cluster.
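The count above can be reproduced with a small simulation. This is only an illustrative sketch: `fake_lsf_query` and `build_job_resources` are hypothetical stand-ins that count lookups, not functions from toil or LSF, and the returned values are placeholders.

```python
from collections import Counter

calls = Counter()

def fake_lsf_query(setting):
    """Hypothetical stand-in for one bparams lookup; it only counts calls."""
    calls[setting] += 1
    # Placeholder return values, not real LSF output.
    return "MB" if setting == "units" else "N"

def build_job_resources(memory):
    """Mimic the per-job pattern described above."""
    select_units = fake_lsf_query("units")   # for -R select[mem>{memory}]
    rusage_units = fake_lsf_query("units")   # for -R rusage[mem={memory}]
    per_core = fake_lsf_query("per_core")    # per-core reservation setting
    return select_units, rusage_units, per_core

n_jobs = 4000
for _ in range(n_jobs):
    build_job_resources(memory=8)

total_queries = sum(calls.values())
print(total_queries)  # 12000
```

Every one of those lookups returns the same cluster-wide answer, which is why resolving the values once (or not querying at all) removes the load.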

Solution: add an env variable for the per-core config and use bytes as the default LSF units

This PR allows passing TOIL_CONTAINER_PER_CORE with values Y/N to define whether LSF total memory is defined per job or per core, thus avoiding any LSF query for this information. It also removes the dynamic query for the LSF units: it always assumes that if an integer is passed as memory=<int>, the units are bytes. If another unit is desired, it can be specified as a string, e.g. memory="8Gb".
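A minimal sketch of the behavior described above, not the code in this PR. The helper names `memory_to_bytes` and `per_core_reservation` are hypothetical, and two details are assumptions: units are treated as 1024-based, and an unset TOIL_CONTAINER_PER_CORE is treated as N.

```python
import os
import re

# Assumption for this sketch: binary (1024-based) multipliers.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

def memory_to_bytes(memory):
    """Integers are taken as bytes; strings such as "8Gb" carry their own unit."""
    if isinstance(memory, int):
        return memory
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([kmgt]?b)\s*", memory.lower())
    if match is None:
        raise ValueError(f"Unrecognized memory value: {memory!r}")
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])

def per_core_reservation(environ=os.environ):
    """Read TOIL_CONTAINER_PER_CORE (Y/N) from the environment, with no LSF query."""
    # Assumption: a missing variable means memory is reserved per job ("N").
    value = environ.get("TOIL_CONTAINER_PER_CORE", "N").strip().upper()
    if value not in ("Y", "N"):
        raise ValueError(f"TOIL_CONTAINER_PER_CORE must be Y or N, got {value!r}")
    return value == "Y"

print(memory_to_bytes(1024))
print(memory_to_bytes("8Gb"))
print(per_core_reservation({"TOIL_CONTAINER_PER_CORE": "Y"}))
```

Both helpers are pure lookups, so building the resource string for each job no longer depends on any call to the cluster.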

codecov[bot] commented 2 years ago

Codecov Report

Merging #33 (ddec4ca) into master (10249b0) will increase coverage by 0.40%. The diff coverage is 98.36%.

@@            Coverage Diff             @@
##           master      #33      +/-   ##
==========================================
+ Coverage   95.41%   95.82%   +0.40%     
==========================================
  Files           7        8       +1     
  Lines         349      407      +58     
==========================================
+ Hits          333      390      +57     
- Misses         16       17       +1     

| Impacted Files | Coverage Δ | |
|---|---|---|
| toil_container/lsf_helper.py | 98.24% <98.24%> (ø) | |
| toil_container/jobs.py | 95.71% <100.00%> (+0.06%) | :arrow_up: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 10249b0...ddec4ca.

ddomenico commented 2 years ago

Looks good 👍