art-framework-suite / art-root-io

0 stars 2 forks source link

file_info_dumper using spack-built art on AL9 fails #17

Closed kutschke closed 1 month ago

kutschke commented 3 months ago

Describe your request or issue

This is an FYI:

file_info_dumper cannot find the Mu2e dictionary libraries.

A solution is: export LD_LIBRARY_PATH=$CET_PLUGIN_PATH

As best I can tell this requires action on the part of each experiment and cannot be solved generally within art itself. But the solution should be solialized.

Some background: Mu2e builds all of our AL9 code with relative RPATHS - so developers can continue to use the grid tarball method for running grid jobs. So we do not need to set LD_LIBRARY_PATH for our art jobs. This is the first time we have encountered the need to set it.

Is this a high-priority issue? No

kutschke commented 3 months ago

After discussion with Ray, he asks that we not have to routinely set LD_LIBRARY_PATH because that has casued some trouble recently with our AL9 builds.

It occurs to me that POSIX has

     #include <stdlib.h>
     int setenv(const char *name, const char *value, int overwrite);

So file_info_dumper could internally forward CET_PLUGIN_PATH to LD_LIBRRAY_PATH. This would solve the immediate problem without polluting the environment of the caller.

knoepfel commented 2 months ago

@kutschke, ROOT requires LD_LIBRARY_PATH to be set for loading dictionaries--this is not an art requirement. When setting up a UPS product, LD_LIBRARY_PATH is automatically updated and not in the face of the user. You should see similar behavior for any Spack package that directly depends on root. We have verified this when loading art-root-io:

knoepfel@scisoftbuild02 ~ $ echo $LD_LIBRARY_PATH

knoepfel@scisoftbuild02 ~ $ spack load art-root-io/xlat6e4
knoepfel@scisoftbuild02 ~ $ echo $LD_LIBRARY_PATH | tr : \\n
/cvmfs/scisoft.opensciencegrid.org/spack-v0.22.0-fermi/opt/spack/linux-almalinux9-x86_64_v3/gcc-13.3.0/canvas-root-io-1.14.00-hki22nvpgbkzyifcww2fkgu5iiuj7aul/lib
/cvmfs/scisoft.opensciencegrid.org/spack-v0.22.0-fermi/opt/spack/linux-almalinux9-x86_64_v3/gcc-13.3.0/intel-oneapi-tbb-2021.12.0-42csb3eneseva45asyp763b5r56e5bce/tbb/2021.12/lib/intel64/gcc4.8

For each Mu2e package that contains dictionaries, please make sure the corresponding Spack recipe has a depends_on("root ...") clause. That will ensure that LD_LIBRARY_PATH is correctly updated to point to the directories with the dictionary libraries.

rlcee commented 2 months ago

Here is our situation with LD_LIBRARY_PATH. In our environments, both in phase 2 (scons+spack) and in phase 3 (spack only) we do not have LD_LIBRARY_PATH set. We do not see errors or complaints, so maybe CET_PLUGIN_PATH, ROOT_INCLUDE_PATH and RPATH is doing the work. Maybe we are get getting too much from autoload and ROOT_INCLUDE_PATH? Let me know.

But this implies that file_info_dumper is unique, or we are using it in a way we don't usually use art and root.

I found only canvas-root-io sets LD_LIBRARY_PATH. I demonstrated this by activating a mu manifest env, find it set, then comment out the one line in canvas-root-io recipe, and then it is not set. Why is this line here and only here?

I would like to continue to not set LD_LIBRARY_PATH for the following reason. When a package sets it in an environment with a view, it points it to the environment view link directory, which effectively adds all the libraries in the view to the path. This can cause incorrect loading when mixing executables built in spack with RPATH and ones built outside spack. I believe it would probably be OK, if in an env, LD_LIBRARY_PATH was set to point to only the root install lib directory, and not the env, or maybe find a subdirectory in the env, something like that. But that's the general root case. In the file_info_dumper case, we not only need the root libs in the path, we also need our offline libs (at least to suppress warnings). Using CET_PLUGIN_PATH here would be perfect, since it has all needed dictionaries and won't interfere with anything outside art/root.

For the moment, we are wrapping file_info_dumper with a script that sets LD_LIBRARY_PATH, and I think we could live like that.

knoepfel commented 2 months ago

@rlcee, thank you for the explanation. We will be exploring these issues and likely coming back to you with questions.

knoepfel commented 2 months ago

@kutschke and @rlcee, the problem is understood. art’s plugin-library manager prepends the value of CET_PLUGIN_PATH to LD_LIBRARY_PATH (or assigns it if LD_LIBRARY_PATH is empty) during the running of the framework job. This makes it possible for ROOT to find the Mu2e dictionaries during the job while restoring LD_LIBRARY_PATH to its original value once the job completes. So LD_LIBRARY_PATH always appears empty in the shell, but it is in fact adjusted in a framework job.

The file_info_dumper executable, however, does not use a plugin-library manager (no plugins to load) and it thus leaves LD_LIBRARY_PATH alone/empty. Thus ROOT cannot find the dictionaries.

Your caution @rlcee about not setting LD_LIBRARY_PATH is a good one. Fortunately, ROOT allows the setting of a different environment variable for locating dictionaries: ROOT_LIBRARY_PATH.

Our suggestion:

  1. Mu2e should ensure that ROOT_LIBRARY_PATH is set appropriately in its Spack environments. This could be achieved by adjusting the appropriate setup_run_environment(...) or setup_dependent_run_environment(...) methods in Spack recipes, and we will look at the ones we maintain to see what can be achieved on our end. For the time being, I suggest adjusting the file_info_dumper wrapper script to set ROOT_LIBRARY_PATH instead of LD_LIBRARY_PATH.
  2. We will remove the automatic adjustment of LD_LIBRARY_PATH from the art framework. That's not desirable in a Spack context. You will then get consistent behavior when using art vs. using file_info_dumper.
rlcee commented 2 months ago

This sounds great. One question - will root look in ROOT_LIBRARY_PATH to resolve all its shared object libs or just dictionaries? If the former, then we still have to be careful to not put random libraries in this path. If the latter, then we can put all libraries in this path (simpler and more automatic) and there is no danger of confusion. Thanks

knoepfel commented 2 months ago

@rlcee, ROOT uses ROOT_LIBRARY_PATH (and LD_LIBRARY_PATH) to locate the .so files it needs to explicitly load, such as dictionary libraries or whatever. Symbols in those libraries, though, are located/loaded using the standard library-loading mechanisms wrt. rpath, LD_LIBRARY_PATH, etc.

The main benefit of this change is that the "normal" library-loading mechanisms of using rpath and LD_LIBRARY_PATH no longer need to be polluted by entries for ROOT's use. I think this addresses the issue you were concerned about...?

knoepfel commented 2 months ago

This issue is partly addressed with https://github.com/art-framework-suite/cetlib/commit/24f9ba8ab96da3aaa780025eb2d92817841d6627. This change now ensures that LD_LIBRARY_PATH will not be adjusted for either the art executable case or file_info_dumper (etc.). IOW, in order for art and file_info_dumper to use the Mu2e dictionaries, the ROOT_LIBRARY_PATH environment variable must be specified.

To the question now is whether ROOT_LIBRARY_PATH can/should be adjusted by the ROOT-dependent recipes. We'll take a look at that.

knoepfel commented 1 month ago

The directories for ROOT-dependent Spack packages (i.e. those recipes that specify depends_on("root")) are now automatically added to the ROOT_LIBRARY_PATH environment variable whenever the ROOT-dependent package is loaded. See https://github.com/spack/spack/pull/45109. These changes have been merged into Fermilab's Spack fork.