easybuilders / easybuild-framework

EasyBuild is a software installation framework in Python that allows you to install software in a structured and robust way.
https://easybuild.io
GNU General Public License v2.0

support alternative naming schemes for modules #173

Closed boegel closed 9 years ago

boegel commented 12 years ago

(old internal ticket 284, 324)

We should provide a way to specify a custom naming scheme for modules, instead of the default that EasyBuild currently uses, i.e. name-prefix-version-toolkit-suffix.

This is badly needed for most big HPC sites.

Notes, as discussed during the 1st EasyBuild hackathon:

The list below gives an overview of what this custom naming scheme should support (compiled by @fgeorgatos):

  ---------------------------------------------------------- --------------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Example name                                               WHO                   Reference URL & Comments
  ---------------------------------------------------------- --------------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  netcdf/3.6.2                                               TACC                  [[http://services.tacc.utexas.edu/index.php/using-modules][]] \\ Lmod: automatic reloading of an entire module hierarchy when a single module anywhere in the hierarchy is changed.
  hpc/netcdf-3.6.3_intel-11.0.083                            HARVARD               [[http://rc.fas.harvard.edu/faq/modulelist][]] \\ very long and complete list
  misc-libs/netcdf/3.6.3_intel                               CLUMEQ                [[https://www.clumeq.ca/wiki/index.php/ModulesDisponiblesSurColosse][]] \\ nested layout
  netcdf/intel/64/4.0                                        SARA                  [[http://www.sara.nl/systems/shared/modules/][]] \\ [[http://www.sara.nl/systems/lisa/news/modules-ng][]] \\  [[http://www.sara.nl/systems/lisa/news/amd64-phase1][]] \\  [[http://www.sara.nl/systems/lisa/news/Modules-deprecated-2011-04-04][]]
  netcdf/gnu/64/4.0                                          SARA                  [[http://www.sara.nl/systems/shared/modules/][]] \\ [[http://www.sara.nl/systems/lisa/news/modules-ng][]] \\  [[http://www.sara.nl/systems/lisa/news/amd64-phase1][]] \\  [[http://www.sara.nl/systems/lisa/news/Modules-deprecated-2011-04-04][]]
  netcdf/4.0.1                                               UIBK                  [[http://www.uibk.ac.at/zid/systeme/hpc-systeme/common/tutorials/modules-howto.html][]] \\ PREFERRED_MC
  netcdf/X.Y.Z/gnu-4.1.2                                     UIBK, too             [[http://www.uibk.ac.at/th-physik/howto/hpc/modules.html][]]
  netcdf/X.Y.Z-gnu                                           CSC                   [[http://www.csc.fi/english/pages/hippu_guide/using_hippu/modules/index_html][]]
  ofed/qlogic/gcc/64/1.2.7                                   UCL                   [[http://www.ucl.ac.uk/isd/common/research-computing/services/legion-upgrade/userguide/userenvironment][]]
  netcdf/3.6.2                                               MIT                   [[http://www.darwinproject.mit.edu/wiki/index.php/Compute_cluster_hardware/software_overview][]]
  netcdf/X.Y.Z                                               MIT, too              [[http://coyote.mit.edu/mediawiki/index.php/Modules][]]
  netcdf/3.6.2                                               CAM                   [[http://www.hpc.cam.ac.uk/user/software.html][]]
  netcdf/4.0.1_nc3                                           UTORONTO              [[https://support.scinet.utoronto.ca/wiki/index.php/Software_and_Libraries][]]
  blas/?                                                     VLSCI@AU              [[http://www.vlsci.org.au/documentation/software-applications][]]
  fluent/?                                                   EPFL                  [[http://pleiades.epfl.ch/index.php?option=com_content&task=view&id=15&Itemid=30][]]
  netcdf/4.0.0.3.1jg64                                       CSCS                  [[http://user.cscs.ch/software_and_programming_environment/compilers_and_programming/rosa_cray_xt5/modules_framework/index.html][]]
  netcdf/4.1.1/pgi/10.3/64                                   TAMU                  [[http://brazos.tamu.edu/software/modules.html][]]
  sles10.1_gnu4.1.2_shared                                   UTENNESSEE            [[http://www.nics.tennessee.edu/computing-resources/kraken/software][]]
  netcdf/4.0-pgi, /4.0-gcc , /4.1.1                          UMICH                 [[http://cac.engin.umich.edu/resources/software/][]]
  netcdf-4.0.1                                               ARSC                  [[http://www.arsc.edu/support/news/systemnews/][]]
  NetCDF/4.1.3-gnu                                           Griffith University   [[http://confluence.rcs.griffith.edu.au:8080/display/GHPC/netcdf][]]
  HMMER                                                      iCER/HPCC             [[https://wiki.hpcc.msu.edu/display/Bioinfo/Module+Files][]]
  HMMER                                                      UPPNEX                [[https://www.uppnex.uu.se/installed-software][]]
  uberftp-client-2.6                                         NCSA                  [[http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/CommonDoc/module.html][]]
  netcdf                                                     LRZ                   [[http://www.lrz.de/services/software/utilities/modules/deisa_details.html][]] \\ Ref. on DEISA setup: [[http://www.lrz.de/services/software/utilities/modules/deisa_details.html][]]
  globus                                                     PRACE sites           [[http://www.prace-ri.eu/Interactive-Access-to-HPC][]]
  globus/5.0.4 & GLOBUS-5.0                                  TACC                  [[http://www.underworldproject.org/BuildRecipes/ranger.rst][]] See Teragrid below
  netcdf/4.1.2-gnu, /4.1.2-intel, /3.6.3-gnu, /3.6.3-intel   cyi/euclid@ls2        [[http://eniac.cyi.ac.cy/display/UserDoc/HPC+Baseline+Configuration]]
  netcdf/3.6.3-gcc, netcdf-4/4.0.1-gcclf                     cyi/planck            [[http://eniac.cyi.ac.cy/display/UserDoc/HPC+Baseline+Configuration]]
  netcdf/3.6.2-intel, /3.6.2-gnu                             ba@ls2                [[http://eniac.cyi.ac.cy/display/UserDoc/HPC+Baseline+Configuration]]
  netcdf/4.1.3-gcc(default)                                  cytera@ls2            [[http://eniac.cyi.ac.cy/display/UserDoc/HPC+Baseline+Configuration]]
  netcdf/1.4.3-mpich2-intel-64bit                            FZK                   [[http://www.google.com]]
  ---------------------------------------------------------- --------------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
fgeorgatos commented 12 years ago

As regards this one, I can see 3 possible directions:

1) cultivate a symlink farm (yet consider the dependency resolution issues inside the modulefiles)
2) use aliases, i.e. dependent modules would come up with the EasyBuild namespace in their version strings
3) define a mapping function, which requires a perfectly unambiguous 1-1 mapping to function properly

I find that the 1st one is trivial to implement outside of the eb mechanisms as long as the deps are ignored; with the deps, it will result in complex modulefiles. So, not too attractive.

The 2nd is a kind of middle ground that allows pretty good compatibility for sites making the transition from legacy module namespace schemes to EasyBuild's (which is far more rigorous).

The 3rd one is a more generic approach and should cover multiple scenarios, in effect putting the responsibility on the sysadmins to provide a consistent namespace.

The mapping function could also include retrieving the info from a "module mapping file", which allows infinite freedom and the possibility to open any can of worms you fancy...
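
To make the "module mapping file" idea a bit more concrete, here is a minimal sketch in Python. It assumes a plain-text file with two whitespace-separated columns (EasyBuild module name, site module name); the function name `load_module_mapping` and the example entries are invented for illustration and are not part of EasyBuild. It rejects any entry that would break the 1-1 property required by option 3.

```python
def load_module_mapping(lines):
    """Parse 'easybuild_name site_name' pairs into a dict.

    Hypothetical helper: raises ValueError when the mapping is not 1-1.
    """
    mapping = {}   # EasyBuild name -> site name
    reverse = {}   # site name -> EasyBuild name, used to detect collisions
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != 2:
            raise ValueError("line %d: expected 2 columns, got %d" % (lineno, len(parts)))
        source, target = parts
        if source in mapping:
            raise ValueError("line %d: %r is mapped twice" % (lineno, source))
        if target in reverse:
            raise ValueError("line %d: %r is the target of two modules" % (lineno, target))
        mapping[source] = target
        reverse[target] = source
    return mapping


# Invented example entries, only to show the expected file layout.
example_lines = [
    "# easybuild name          site name",
    "netCDF/4.1.3-gcc-4.6.3    netcdf/4.1.3-gnu",
    "HDF5/1.8.7-gcc-4.6.3      hdf5/1.8.7-gnu",
]
mapping = load_module_mapping(example_lines)
```

Keeping the reverse dictionary around is what makes the ambiguity check cheap; it would also serve for translating site names back to EasyBuild names.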

fgeorgatos commented 11 years ago

Hi there,

after pondering a bit more on this one, here's a summary of what I think is the big picture:

category1/category2/.../categoryN/package/version/spec1/spec2/.../specN

Because we cannot have it all before v1.0 (due to the time needed to implement the "best" solution), one tractable approach could be a "mapping" file, i.e. a table of 2 columns. This becomes a Python dictionary that assists with producing the alternative namespace (though I'm not sure whether this should rather be invoked as a separate, secondary step).

Hopefully that covers the following scenarios relatively easily (read: before v1.0), and until we understand what the perfect solution is really about:

module load netcdf/intel/64/4.0 
module load netcdf/4.1.1/pgi/10.3/64 
module load netcdf/1.4.3-mpich2-intel-64bit      # doable right now with suffix?
module load misc-libs/netcdf/3.6.3_intel             # doable right now with suffix & moduleclass?
module load package/version/arch/fpaccuracy

In the ideal (long-term) situation, the full flexibility of modules should be available, with multiple categories being possible, for example:

bioinformatics/tools/MrBayes/3.2/intel/12.0/64
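
The general pattern (categories, then package and version, then specs) can be sketched as a small Python function. This is only an illustration under the assumption that every component is a plain string without slashes; `build_module_path` is a made-up name, not an EasyBuild function.

```python
def build_module_path(package, version, categories=(), specs=()):
    """Join category1/.../categoryN/package/version/spec1/.../specN.

    Hypothetical helper illustrating the layout discussed in this thread.
    """
    parts = list(categories) + [package, version] + list(specs)
    for part in parts:
        # An empty or slash-containing component would silently change the depth.
        if not part or '/' in part:
            raise ValueError("invalid module path component: %r" % (part,))
    return '/'.join(parts)


long_term_example = build_module_path(
    'MrBayes', '3.2',
    categories=('bioinformatics', 'tools'),
    specs=('intel', '12.0', '64'),
)
```

With no categories and no specs, the same function degenerates to the plain package/version layout used by several sites in the table above.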

I have to admit I cannot easily grasp all the potential different specializations of environment modules configuration, therefore I suggest going with the file-mapping approach described above for now, which should be the most future-proof and saves us from bothering with all the fancy (and often crippled) namespace schemes.

In essence, the proposed idea is option (3) of the previous email in this thread.

Fotis

itkovian commented 11 years ago

Considering my remark on #346, I do not think that the splitting should be done in the easyconfigs. That would mean each site with a different naming scheme would have to change all the easyconfigs they want. I hardly see the problem with offering a class that splits the module name into some format EasyBuild can handle, and allowing sites to offer a subclass that changes this to their needs.
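
A rough sketch of what such a class and a site subclass might look like is given below. All class names and the returned dictionary keys are assumptions made for illustration; this is not EasyBuild's actual interface. The base class handles a 'name/version-toolchain' layout, the subclass a nested 'name/toolchain/bits/version' layout such as netcdf/intel/64/4.0.

```python
class DefaultModuleNameSplitter:
    """Hypothetical base class: splits 'name/version-toolchain' module names."""

    def split(self, module_name):
        name, rest = module_name.split('/', 1)
        # Naive: treats everything after the first '-' as the toolchain,
        # so versions that contain a hyphen would need smarter handling.
        version, _, toolchain = rest.partition('-')
        return {'name': name, 'version': version,
                'toolchain': toolchain or None, 'bits': None}


class NestedSiteSplitter(DefaultModuleNameSplitter):
    """Hypothetical site subclass for 'name/toolchain/bits/version' names."""

    def split(self, module_name):
        name, toolchain, bits, version = module_name.split('/')
        return {'name': name, 'version': version,
                'toolchain': toolchain, 'bits': bits}


default_parts = DefaultModuleNameSplitter().split('netcdf/4.1.3-gnu')
site_parts = NestedSiteSplitter().split('netcdf/intel/64/4.0')
```

Both classes return the same set of keys, so the code consuming the result would not need to know which site scheme is active.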

boegel commented 11 years ago

No, not in the easyconfigs, that would be madness.

In the EasyBuild configuration file, where you also specify things like log_format and install_path, see https://github.com/hpcugent/easybuild/wiki/Configuration.

My current plan is to support a function like construct_module_name or something, that a user can define. That function would then take an EasyConfig instance, which has all the possible info, and can spit out a string/list/dictionary (not sure yet), that defines the corresponding module name. If such a function is defined, EasyBuild would use it when generating modules.

The current default would then be equivalent to something like (untested):

def construct_module_name(ec):
    return [ec.name, ec.get_installversion()]

You'd probably also need a function that does the reverse, i.e. deconstruct a given module name in parts and make sense of them.
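
To illustrate how the pieces could fit together, here is a self-contained sketch. It does not use the real EasyConfig class: `FakeEasyConfig` is a stand-in with an `installversion` attribute instead of the `get_installversion()` method, and `module_name_for` and `deconstruct_module_name` are hypothetical names. It shows a user-supplied function being used when defined, a fallback to the default otherwise, and a reverse function for the default scheme.

```python
from collections import namedtuple

# Stand-in for an EasyConfig instance; only the two fields needed here.
FakeEasyConfig = namedtuple('FakeEasyConfig', ['name', 'installversion'])


def default_construct_module_name(ec):
    """Default scheme: module name parts are [name, installversion]."""
    return [ec.name, ec.installversion]


def module_name_for(ec, custom_func=None):
    """Use the user-defined function if one is given, else the default."""
    func = custom_func if custom_func is not None else default_construct_module_name
    return '/'.join(func(ec))


def deconstruct_module_name(module_name):
    """Reverse of the default scheme: split on the first '/' only."""
    name, installversion = module_name.split('/', 1)
    return FakeEasyConfig(name, installversion)


ec = FakeEasyConfig('netCDF', '4.1.3-gcc-4.6.3')  # invented example values
default_name = module_name_for(ec)


def lowercase_scheme(ec):
    """Example of a site-defined alternative: lowercase software name."""
    return [ec.name.lower(), ec.installversion]


custom_name = module_name_for(ec, custom_func=lowercase_scheme)
```

Note that the lowercase scheme cannot be reversed without extra information (the original capitalization is lost), which is exactly why a custom scheme would need its own reverse function.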

fgeorgatos commented 10 years ago

shouldn't we close this one now that v1.8.0 is out?

boegel commented 10 years ago

I'll consider this fully fixed once #687 is handled, so let's leave it open for now.

boegel commented 9 years ago

#687 is fixed, so closing this.