flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0
167 stars 50 forks source link

Spectrum MPI personality does not support 10.4 #3855

Open SteVwonder opened 3 years ago

SteVwonder commented 3 years ago

In Spectrum 10.4, we need to use the ucx pml plugin as opposed to yalla (which is used in 10.3). So we need A) someway for the user to optionally specify which version of Spectrum they are running and B) someway to key off of that within the personality to change the behavior.

On the ☕ call today, @grondo suggested extending the -o mpi= syntax to support something like -o mpi=spectrum@10.4 or -o mpi=spectrum=10.4. The spectrum personality just needs to split based on the chosen delimiter (if it exists) and modify its behavior based on that version.

dongahn commented 3 years ago

I like the version augmented personality idea.

grondo commented 3 years ago

We could change the current test in MPI plugins to use if not shell.options.mpi:find("^spectrum") (I think this is the equivalent of startswith in Lua), e.g.

diff --git a/src/shell/lua.d/spectrum.lua b/src/shell/lua.d/spectrum.lua
index 78b21b524..c8f819a2a 100644
--- a/src/shell/lua.d/spectrum.lua
+++ b/src/shell/lua.d/spectrum.lua
@@ -8,9 +8,11 @@
 -- SPDX-License-Identifier: LGPL-3.0
 -------------------------------------------------------------

+
 -- Set environment specific to spectrum_mpi (derived from openmpi)
 --
-if shell.options.mpi ~= "spectrum" then return end
+
+if not string.find(shell.options.mpi or "", "^spectrum") then return end

 local posix = require 'posix'

Then to test for a specific version later, you can use an exact match, i.e. if shell.options.mpi == "spectrum@10.4"

At some point we may want a way to add some convenience functions to these Lua plugins, e.g. string:startswith and perhaps a way to get the version number and do <= or >= comparison, etc.

Edit: Updated patch to protect against shell.options.mpi not set.

grondo commented 3 years ago

Hm, one thing we could easily do to collect a "library" of functions for the shell lua plugins is to make a 00-lib.lua plugin. E.g. I pulled some of the reusable code out of spectrum.lua and added some functions that could be useful:


function string.startswith (str, start)
    if not str or str == "" then
        return false
    end
    return string.sub(str,1,string.len(start)) == start
end

function shell.get_mpi_option ()
    local mpi = shell.options.mpi
    if mpi then
        return mpi:match("^[^@]+"), mpi:match("@(.+)$")
    end
    return nil
end

function prepend_path (env_var, path)
    local val = shell.getenv (env_var)

    -- If path is already in env_var, do nothing. We stick ":" on both
    -- ends of the existing value so we can easily match exact paths
    -- instead of possibly matching substrings of paths when trying
    -- to match "zero or more" colons.
    --
    if val and ((":"..val..":"):match (":"..path..":")) then return end

    if val == nil then
       suffix = ''
    else
       suffix = ':'..val
    end
    shell.setenv (env_var, path..suffix)
end

function strip_env_by_prefix (env, prefix)
    --
    -- Have to call env:get() to translate env object to Lua table
    --  in order to use pairs() to iterate environment keys:
    --
    for k,v in pairs (env) do
        if k:match("^"..prefix) then
            shell.unsetenv (k)
        end
    end
end

Then the spectrum.lua plugin could be updated (in addition to removing the functions already defined in the library)


local mpi, version = shell.get_mpi_option()
if mpi ~= "spectrum" then return end
$ flux mini run -o mpi=spectrum@10.4 hostname
0.053s: flux-shell[0]: stderr: mpi=spectrum, version=10.4
grondo commented 3 years ago

Or maybe to be more generalized the function should be shell.getopt_with_version (name)?