NVIDIA / ngc-container-environment-modules

Environment modules for NGC containers
MIT License
29 stars 7 forks source link

Use try_load() instead of load() for Singularity module #2

Closed lgorenstein closed 3 years ago

lgorenstein commented 3 years ago

For sites that don't have a separate Singularity module, current modules will throw an error in

if not (isloaded("Singularity")) then
    load("Singularity")
end

(i.e. sites would have to manually comment out this snippet in every modulefile every time they grab an updated copy of this repository).

Doing a try_load("Singularity") would make it a soft fail instead, and the module would proceed quietly.

samcmill commented 3 years ago

Thanks for the feedback @lgorenstein.

Module behavior and naming is site specific. For the sites that do require loading a module to use Singularity and the module isn't named Singularity, try_load() would silently fail. I think it's better for it to error out in that case.

One option could be to check an environment variable. If the variable is defined, then skip load("Singularity"). Thoughts?

lgorenstein commented 3 years ago

Yes, I like the environment variable idea! The variable may then be either a boolean flag (NGCCEM_LOAD_SINGULARITY=0), or even a name of the module to load (NGCCEM_SINGULARITY_MODULE="myCustomSingularity/1.2.3" or empty to skip).

This way sites can define a core module ngc/default.lua module along the lines of

setenv("NGC_IMAGE_DIR", "/opt/ngc/images")
setenv("NGCCEM_SINGULARITY_MODULE", "")
if (mode() == "load" or mode() == "unload") then
        prepend_path("MODULEPATH", "/opt/ngc/modulefiles")
end

and have an easy way to load and customize things.

lgorenstein commented 3 years ago

Along the same vein, I was thinking a lot about ways to make one place for site-specific changes that could be effectively ingested/sourced by all NGC modules. So that half a year from now when I pull a new copy of this repo to re-deploy on our clusters, I would not have to re-edit all 41+ modulefiles :)

samcmill commented 3 years ago

This is my first pass at an environment variable approach. Thoughts?

local singularity_module = "Singularity"
if not (os.getenv("NGC_SINGULARITY_MODULE") == "none") then
        if (os.getenv("NGC_SINGULARITY_MODULE")) then
                singularity_module = os.getenv("NGC_SINGULARITY_MODULE")
        end
        if not (isloaded(singularity_module)) then
                load(singularity_module)
        end
end
lgorenstein commented 3 years ago

Yep, your snippet worked well for me. And here's my attempt to trim it:

local singularity_module = os.getenv("NGC_SINGULARITY_MODULE") or "Singularity"
if singularity_module ~= '' then
        if not (isloaded(singularity_module)) then
                load(singularity_module)
        end
end

Seems like a reasonable behavior:

NGC_SINGULARITY_MODULE singularity_module Action
unset "Singularity" gets loaded
set, non-empty "XXXXX" gets loaded
set but empty '' not loaded
lgorenstein commented 3 years ago

I personally would advocate an empty value as a more conventional way instead of 'none' value, but at the end both would solve the problem at hand.

samcmill commented 3 years ago

An empty value is ambiguous. To wit, it implies "use the default" to me, while to you it means "none". I'd rather be explicit.