E3SM-Project / spack

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
https://spack.io

Spack build failing on multiple platforms due to export_albany.in error #6

Closed by ikalash 1 year ago

ikalash commented 1 year ago

Steps to reproduce

TBD

Error message

The following error is showing up in the Albany spack build on multiple systems:

cwd: /pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-build-ae6poiv/dummy
Traceback (most recent call last):
  File "/pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-src/cmake/CreateExportAlbany", line 185, in <module>
    _main_func(__doc__)
  File "/pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-src/cmake/CreateExportAlbany", line 178, in _main_func
    run (bin_dir,install_lib_dir,generator)
  File "/pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-src/cmake/CreateExportAlbany", line 149, in run
    raise ValueError
ValueError
make[2]: *** [src/CMakeFiles/create_export_albany.dir/build.make:76: export_albany.in] Error 1
make[2]: Leaving directory '/pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-build-ae6poiv'
make[1]: *** [CMakeFiles/Makefile2:1316: src/CMakeFiles/create_export_albany.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 99%] Linking CXX executable dummy

@bartgol has agreed to look at this. I will post instructions for how to reproduce the error on one of the machines we all have access to in the comments on this issue shortly.

Information on your system

TBD

ikalash commented 1 year ago

@bartgol: I think the easiest place to reproduce this is on Perlmutter - I hope this is OK. Here is how you can reproduce the error:

git clone git@github.com:E3SM-Project/spack.git
cd spack
git checkout develop
mkdir -p ~/.spack/cray
curl -L -o ~/.spack/cray/compilers.yaml https://raw.githubusercontent.com/sandialabs/Albany/master/doc/LandIce/perlmutter/spack/compilers.yaml
curl -L -o ~/.spack/packages.yaml https://raw.githubusercontent.com/sandialabs/Albany/master/doc/LandIce/perlmutter/spack/packages.yaml
Edit line 62 of etc/spack/defaults/config.yaml to have /pscratch/sd/i/$USER/spackAlbany
. share/spack/setup-env.sh
spack --insecure install --dirty --keep-stage albany%gcc@11.1.0+mpas

This will take a long time to run. At the end, you should see the error. The good news is that if you run the command again, it will take much less time because it will rebuild only Albany, not all the TPLs. Let me know if you have any questions or issues.

xylar commented 1 year ago

@ikalash and @bartgol, I think the error log you have is missing one critical line:

could not locate lib: /pscratch/sd/x/xylar/spack_temp/spack-stage/spack-stage-albany-develop-bwtwzp5yszkspqzd33etbchwgygtgafi/spack-build-bwtwzp5/dummy/-L/opt/cray/pe/gcc/11.2.0/snos/lib64
cwd: /pscratch/sd/x/xylar/spack_temp/spack-stage/spack-stage-albany-develop-bwtwzp5yszkspqzd33etbchwgygtgafi/spack-build-bwtwzp5/dummy
Traceback (most recent call last):
  File "/pscratch/sd/x/xylar/spack_temp/spack-stage/spack-stage-albany-develop-bwtwzp5yszkspqzd33etbchwgygtgafi/spack-src/cmake/CreateExportAlbany", line 185, in <module>
    _main_func(__doc__)
  File "/pscratch/sd/x/xylar/spack_temp/spack-stage/spack-stage-albany-develop-bwtwzp5yszkspqzd33etbchwgygtgafi/spack-src/cmake/CreateExportAlbany", line 178, in _main_func
    run (bin_dir,install_lib_dir,generator)
  File "/pscratch/sd/x/xylar/spack_temp/spack-stage/spack-stage-albany-develop-bwtwzp5yszkspqzd33etbchwgygtgafi/spack-src/cmake/CreateExportAlbany", line 149, in run
    raise ValueError
ValueError
make[2]: *** [src/CMakeFiles/create_export_albany.dir/build.make:76: export_albany.in] Error 1
make[2]: Leaving directory '/pscratch/sd/x/xylar/spack_temp/spack-stage/spack-stage-albany-develop-bwtwzp5yszkspqzd33etbchwgygtgafi/spack-build-bwtwzp5'
make[1]: *** [CMakeFiles/Makefile2:1316: src/CMakeFiles/create_export_albany.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

The code is:

            # We want to get an abs path, with symlinks resolved *except* for symlinks
            # in the file name, to avoid an error we're seeing where libdl.so points
            # to the file libdl-2.28.so (an odd name: usually we see libdl.so.2.28)
            lib_file_full = pathlib.Path(item).parent.resolve() / pathlib.Path(item).name
            if not lib_file_full.exists():
                print (f"could not locate lib: {lib_file_full}")
                print (f"cwd: {os.getcwd()}")
                raise ValueError

Note that -L/opt/cray/pe/gcc/11.2.0/snos/lib64 is getting turned into an absolute path for some reason:

/pscratch/sd/x/xylar/spack_temp/spack-stage/spack-stage-albany-develop-bwtwzp5yszkspqzd33etbchwgygtgafi/spack-build-bwtwzp5/dummy/-L/opt/cray/pe/gcc/11.2.0/snos/lib64
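A minimal sketch of why this happens, assuming the item reaches the snippet as the raw string `-L/opt/cray/pe/gcc/11.2.0/snos/lib64`: pathlib has no notion of linker flags, so a string that does not start with `/` is treated as a relative path and `resolve()` anchors it at the current working directory. The helper name `resolve_like_script` is made up for illustration; only the expression inside it comes from the quoted code.

```python
import pathlib


def resolve_like_script(item):
    """Apply the same expression as the quoted CreateExportAlbany snippet."""
    p = pathlib.Path(item)
    # Resolve the parent directory only, then re-attach the file name.
    return p.parent.resolve() / p.name


# "-L/opt/..." does not start with "/", so pathlib sees a relative path
# whose first component is the directory name "-L".
attached = resolve_like_script("-L/opt/cray/pe/gcc/11.2.0/snos/lib64")

# A bare "-L" token is also treated as a relative file name.
detached = resolve_like_script("-L")

print(attached)  # <cwd>/-L/opt/cray/pe/gcc/11.2.0/snos/lib64
print(detached)  # <cwd>/-L
```

Since neither result exists on disk, the `exists()` check fails and the script raises.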

Also, that's a total abuse of raising a python error :-)
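For what it's worth, here is one way the check could carry its diagnostics in the exception itself instead of printing and then raising a bare ValueError. This is only a sketch: `require_library` is a hypothetical name, not something in CreateExportAlbany, and FileNotFoundError is my choice of exception.

```python
import pathlib


def require_library(item):
    """Return the resolved library path, or raise with a descriptive message."""
    p = pathlib.Path(item)
    lib_file_full = p.parent.resolve() / p.name
    if not lib_file_full.exists():
        # Put the path and cwd in the exception message so they survive even
        # if stdout is lost or interleaved with parallel make output.
        raise FileNotFoundError(
            f"could not locate lib: {lib_file_full} (cwd: {pathlib.Path.cwd()})"
        )
    return lib_file_full
```

With this shape the "could not locate lib" line would show up in the traceback itself, so it could not go missing from a pasted log.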

xylar commented 1 year ago

I'm happy to help debug further but it seems like -L also needs special treatment, similar to -l. But it may be that you actually want to pull it apart and turn the path into an abs path.
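To make the suggestion concrete, here is a rough sketch of what "special treatment" for `-L` could look like. It assumes the script walks a list of linker-line tokens. `classify_link_items` and its return shape are hypothetical and not taken from the Albany source.

```python
from pathlib import Path


def classify_link_items(items):
    """Split linker tokens into search dirs, -l names, and library files."""
    search_dirs, lib_names, lib_files = [], [], []
    tokens = iter(items)
    for token in tokens:
        if token == "-L":
            # Detached form ("-L /some/dir"): the directory is the next token.
            directory = next(tokens, None)
            if directory is not None:
                search_dirs.append(Path(directory).resolve())
        elif token.startswith("-L"):
            # Attached form ("-L/some/dir"): strip the flag, then resolve.
            search_dirs.append(Path(token[2:]).resolve())
        elif token.startswith("-l"):
            # Library name only; it is looked up in the search dirs at link time.
            lib_names.append(token[2:])
        else:
            # Plain path to a library file: resolve the directory but keep the
            # file name as given, as the existing snippet does.
            path = Path(token)
            lib_files.append(path.parent.resolve() / path.name)
    return search_dirs, lib_names, lib_files
```

Only the last branch would then need the `exists()` check. The `-L` directories become absolute paths rather than being mistaken for files.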

bartgol commented 1 year ago

@xylar thanks for the extra snippet. I will probably wait until after the spack talk today, then try to get on PM while it's all still fresh, and try it out. Hopefully it's just a quick fix.

xylar commented 1 year ago

@bartgol, I'm sure you don't need more examples but I'm seeing the same issue in one form or another on all of the other machines I tested on.

Cori-Haswell:

could not locate lib: /global/cscratch1/sd/xylar/spack_temp/spack-stage/spack-stage-albany-develop-jq7duh5mgf7cuu622g3guvyvxp7kvue7/spack-build-jq7duh5/dummy/-L

Anvil:

could not locate lib: /lcrc/group/e3sm/ac.xylar/spack_temp/ac.xasay-davis/spack-stage/spack-stage-albany-develop-vgm4audz5tnplhzdfgb5wmo5apy7sahk/spack-build-vgm4aud/dummy/-L/gpfs/fs1/software/centos7/spack-latest/opt/spack/linux-centos7-x86_64/gcc-6.5.0/gcc-8.2.0-xhxgy33/lib64

Compy:

could not locate lib: /compyfs/asay932/tmp_spack/spack-stage/spack-stage-albany-develop-hk3wcnkgmtco2rl3iu7inp35cvfcahjn/spack-build-hk3wcnk/dummy/-L/qfs/projects/ops/rh7/apps/gcc/10.2.0/lib64

Chrysalis:

could not locate lib: /lcrc/group/e3sm/ac.xylar/spack_temp/ac.xasay-davis/spack-stage/spack-stage-albany-develop-6worn4v5bmzdczst7vu4k7qd567te5xk/spack-build-6worn4v/dummy/-L/gpfs/fs1/soft/chrysalis/spack/opt/spack/linux-centos8-x86_64/gcc-9.3.0/gcc-9.2.0-ugetvbp/lib64

bartgol commented 1 year ago

Interesting. I wonder why we don't get this error in any of our nightly tests. I am hoping to get to this today.

xylar commented 1 year ago

I think this is what @ikalash was explaining yesterday. The tests build everything in spack and don't use any system libraries. Thus, there presumably aren't any -L flags, just full paths to the libraries themselves. When I build, I point to system modules for several libraries, which must get converted to -L flags. It seems like the code as written isn't handling -L.

bartgol commented 1 year ago

We do have -L stuff when "manually" building Albany (without spack), so it should handle lib paths. Anyhow, I'll check this today and report back.

bartgol commented 1 year ago

Edit line 62 of etc/spack/defaults/config.yaml to have /pscratch/sd/i/$USER/spackAlbany

Line 62 is a comment for me. I'm not sure if I need to add an entry to the build_stage list (and at what position), or replace one of them.

 61   # The build stage can be purged with `spack clean --stage` and
 62   # `spack clean -a`, so it is important that the specified directory uniquely
 63   # identifies Spack staging to avoid accidentally wiping out non-Spack work.
 64   build_stage:
 65     - $tempdir/$user/spack-stage
 66     - $user_cache_path/stage
 67   # - $spack/var/spack/stage

bartgol commented 1 year ago

Ok, I tried setting

config:
  build_stage:
    - /pscratch/sd/b/bartgol/spack-stage

in ~/.spack/config.yaml (from the config.yaml doc, it seemed like the proper place to add user-specific configs). Spack is currently running, but right off the bat I get some warnings:

==> Warning: detected deprecated properties in /global/homes/b/bartgol/.spack/packages.yaml
Activate the debug flag to have more information on the deprecated parts or run:

    $ spack config update packages

to update the file to the new format

==> Warning: using "zlib@1.2.11" which is a deprecated version
==> Warning: Missing a source id for albany@develop
==> Warning: Missing a source id for trilinos-for-albany@develop

Are these expected? Should I ignore them?

xylar commented 1 year ago

Yes, those are expected.

xylar commented 1 year ago

Thanks @bartgol! Because of your fix, I'm now able to install Albany on all of compass' supported machines (Cori-Haswell, Perlmutter-CPU, Chrysalis, Anvil and Compy).

ikalash commented 1 year ago

Excellent!