flux-framework / flux-pmix

flux shell plugin to bootstrap openmpi v5+
GNU Lesser General Public License v3.0
2 stars 4 forks source link

Spack Build Error for pmix TASKMAP_ENCODE_RAW_DERANGED undeclared #83

Closed vsoch closed 1 year ago

vsoch commented 1 year ago

Hiya! Were there any recent changes that might cause this error?

 main.c:126:42: error: 'TASKMAP_ENCODE_RAW_DERANGED' undeclared (fir
            st use in this function); did you mean 'TASKMAP_ENCODE_WRAPPED'?
     171      126 |     if (!(raw = taskmap_encode (taskmap, TASKMAP_ENCODE_RAW
            _DERANGED)))
     172          |                                          ^~~~~~~~~~~~~~~~~~
            ~~~~~~~~~
     173          |                                          TASKMAP_ENCODE_WRA
            PPED

Seen here: https://github.com/flux-framework/spack/actions/runs/4624820726/jobs/8180020114#step:28:674

Sometimes the build caches in actions get old, so I can try clearing and running again, but I also know there was a recent release so there could be some issue with the new build. I'll try a rebuild with a refreshed cache and see if I hit the issue (and update here).

And geez - poor task map. Is he really deranged? Does he just need someone to talk to? If he needs a friend I can offer. :wink:

grondo commented 1 year ago

That deranged flag wasn't added until flux-core v0.47.0, so we have to make sure flux-pmix is built from at least that version of flux-core, if not v0.48.0 (though I think this is only required for make check)

garlick commented 1 year ago

Right but also pmix 0.3.0 needs core 0.49.0 at runtime since the pmix shell plugin can't be enabled without the changes to -o pmi that just went in.

vsoch commented 1 year ago

Ok - so order of operations it sounds like we need https://github.com/spack/spack/pull/36676 merged (adding 0.49.0 to spack) and then for the pmix recipe, to add a constraint for the new version that it needs at least that. Then the build should work (and we can update the flux-pmix recipe in spack, which is the build currently failing. The automation works to test the build of the new version before trying to update the recipe for PR to spack).

garlick commented 1 year ago

Sounds right, sorry for the hickup.

I just realized I didn't raise the minimum flux-core version required by configure in the flux-pmix build system. It thinks 0.46.0 is good enough. If I'd have done that right, the requirement would have been more obvious. Will fix for the next release.

vsoch commented 1 year ago

okay build is testing here: https://github.com/flux-framework/spack/pull/88.

I'm assuming that spack (when there is a when=<version> also applies that install via a branch, but if not, we should add a directive for that.

vsoch commented 1 year ago

Is this something new too?

==> flux-pmix: Executing phase: 'install'
==> Error: FileNotFoundError: [Errno 2] No such file or directory: '/opt/spack/opt/spack/linux-ubuntu20.04-x86_64_v4/gcc-10.3.0/flux-pmix-0.3.0-difj763luryeyvhrwrdsl46luvweuyv7/etc/flux/shell/lua.d/mpi/openmpi@5.lua'

/home/runner/work/spack/spack/packages/flux-pmix/package.py:43, in add_pluginpath:
         41    def add_pluginpath(self):
         42        rcfile = join_path(self.prefix.etc, "flux/shell/lua.d/mpi/openmpi@5.lua")
  >>     43        filter_file(
         44            r"pmix/pmix.so", join_path(self.prefix.lib, "flux/shell/plugins/pmix/pmix.so"), rcfile
         45        )

See build log for details:
  /tmp/runner/spack-stage/spack-stage-flux-pmix-0.3.0-difj763luryeyvhrwrdsl46luvweuyv7/spack-build-out.txt
Traceback (most recent call last):
  File "/home/runner/.local/bin/pakages", line 8, in <module>
    sys.exit(run_main())
  File "/home/runner/.local/lib/python3.8/site-packages/pakages/cli/__init__.py", line 243, in run_main
    main(args=args, parser=parser, extra=extra, subparser=helper)
  File "/home/runner/.local/lib/python3.8/site-packages/pakages/cli/install.py", line 13, in main
    cli.install(args.packages, registry=args.registry, tag=args.tag, **kwargs)
  File "/home/runner/.local/lib/python3.8/site-packages/pakages/builders/spack/client.py", line 158, in install
    for line in pakages.utils.stream_command(command):
  File "/home/runner/.local/lib/python3.8/site-packages/pakages/utils/terminal.py", line 103, in stream_command
    raise subprocess.CalledProcessError(return_code, cmd, error)
subprocess.CalledProcessError: Command '['spack', 'install', 'flux-pmix']' returned non-zero exit status 1.

The path looks a little weird - not sure why there is an AT in it!

garlick commented 1 year ago

That file openmpi@5.lua (yup that was its name) was dropped from this release. Maybe the spack package is still referencing it?

vsoch commented 1 year ago

Yes indeed! So should both of these sections be conditional on less than 0.3.0?

    @run_after("install")
    def add_pluginpath(self):
        rcfile = join_path(self.prefix.etc, "flux/shell/lua.d/mpi/openmpi@5.lua")
        filter_file(
            r"pmix/pmix.so", join_path(self.prefix.lib, "flux/shell/plugins/pmix/pmix.so"), rcfile
        )

    def setup_run_environment(self, env):
        env.prepend_path("FLUX_SHELL_RC_PATH", join_path(self.prefix, "etc/flux/shell/lua.d"))
garlick commented 1 year ago

Yes, except :facepalm: I got rid of the lua script and now I don't think there is any way to extend flux-shell(1)'s plugin search path so it can find the pmix.so plugin on its own when it's spacked off the beaten path. @grondo can you confirm? Maybe we'll need to provide another lua.d script that just loads the plugin unconditionally and leave the FLUX_SHELL_RC_PATH as above so that script can be found.

We also should prepend prefix.lib + "flux/upmi/plugins" to the FLUX_PMI_CLIENT_SEARCHPATH env var as of this release.

@vsoch - stand by I guess!

p.s. 0.1 and 0.2 were very early releases so it's probably overkill to support them.

vsoch commented 1 year ago

I think Vosch is not the droid that you seek! :robot:

This vsoch is standing by!

grondo commented 1 year ago

From an offline discussion, probably the best thing to do here is to have the spack package generate a pmix.lua file that loads the plugin explicitly and place that pmix.lua into the directory it is now prepending to FLUX_SHELL_RC_PATH. This mimics installation of the plugin in a default shell plugin path so it will always be loaded.

vsoch commented 1 year ago

Yep I can do that! What should be the content of that file?

garlick commented 1 year ago

I think something like plugin.load ("pmix.so") except the full path to the shell plugin.

vsoch commented 1 year ago

We could also just wait for this to be fixed in flux-pmix so the change to spack only has to go in once. I'm not sure I'm motivated enough to want to figure out writing this custom logic in spack.

grondo commented 1 year ago

I'm not sure there is anything to fix in flux-pmix. The file we're adding could be thought of as a config file that is required because there is no way to install the .so in the default search path of the flux-core package in spack.

I can make an attempt.

vsoch commented 1 year ago

okay! if you want to tweak my PR, that works. Or if you copy pasta a suggestion here I'll get it into the PR. Thank you!

grondo commented 1 year ago

Ok, I'm not sure what I'm doing, but this is along the lines of what I'm trying. I'm not sure I even remember how to test this locally (without waiting a day for everything to rebuild):



````diff --git a/packages/flux-pmix/package.py b/packages/flux-pmix/package.py
index 593d25d636..afbe6d8416 100644
--- a/packages/flux-pmix/package.py
+++ b/packages/flux-pmix/package.py
@@ -40,11 +40,16 @@ class FluxPmix(AutotoolsPackage):
     @run_after("install")
     def add_pluginpath(self):
         spec = self.spec
+        pluginpath = join_path(self.prefix.lib, "flux/shell/plugins/pmix.so")
         if spec.satisfies("@:0.3.0"):
             rcfile = join_path(self.prefix.etc, "flux/shell/lua.d/mpi/openmpi@5.lua")
-            filter_file(
-                r"pmix/pmix.so", join_path(self.prefix.lib, "flux/shell/plugins/pmix/pmix.so"), rcfile
-            )
+            filter_file( r"pmix/pmix.so", pluginpath)
+        else:
+            rcdir = join_path(self.prefix.etc, "flux/shell/lua.d")
+            rcfile = join_path(rcdir, "pmix.lua")
+            mkdirp(rcdir)
+            with open(rcfile, "w") as fp:
+                fp.write("plugin.load(\"" + pluginpath + "\")")

     def setup_run_environment(self, env):
         env.prepend_path("FLUX_SHELL_RC_PATH", join_path(self.prefix, "etc/flux/shell/lua.d"))
grondo commented 1 year ago

We also should prepend prefix.lib + "flux/upmi/plugins" to the FLUX_PMI_CLIENT_SEARCHPATH env var as of this release.

Oh, I also forgot to do this. Updated diff:

diff --git a/packages/flux-pmix/package.py b/packages/flux-pmix/package.py
index 593d25d636..131030203b 100644
--- a/packages/flux-pmix/package.py
+++ b/packages/flux-pmix/package.py
@@ -40,11 +40,18 @@ class FluxPmix(AutotoolsPackage):
     @run_after("install")
     def add_pluginpath(self):
         spec = self.spec
+        pluginpath = join_path(self.prefix.lib, "flux/shell/plugins/pmix.so")
         if spec.satisfies("@:0.3.0"):
             rcfile = join_path(self.prefix.etc, "flux/shell/lua.d/mpi/openmpi@5.lua")
-            filter_file(
-                r"pmix/pmix.so", join_path(self.prefix.lib, "flux/shell/plugins/pmix/pmix.so"), rcfile
-            )
+            filter_file( r"pmix/pmix.so", pluginpath)
+        else:
+            rcdir = join_path(self.prefix.etc, "flux/shell/lua.d")
+            rcfile = join_path(rcdir, "pmix.lua")
+            mkdirp(rcdir)
+            with open(rcfile, "w") as fp:
+                fp.write("plugin.load(\"" + pluginpath + "\")")

     def setup_run_environment(self, env):
         env.prepend_path("FLUX_SHELL_RC_PATH", join_path(self.prefix, "etc/flux/shell/lua.d"))
+        if spec.satisfies("@0.3.0:"):
+            env.prepend_path("FLUX_PMI_CLIENT_SEARCHPATH", join_path(self.prefix, "flux/upmi/plugins"))
vsoch commented 1 year ago

Okay - build was a success! I'll get it merged in, and then built -> PR to spack. Thanks for the help!