xylar closed this issue 1 year ago
I'll take a look at this. I had a few questions @xylar about what you tried:
@ikalash, were you able to find the attached bash and yaml files that I posted in the slack channel? That should answer all these questions. I use (and very much need to use) the system-built compilers, mpi, hdf5, netcdf, pnetcdf, etc.
Yes, I found them. Putting them here for easier reference. Thanks for clarifying.
Attachments: build_dev_compass_1_2_0-alpha_2_gnu_mpich_albany.bash.txt, dev_compass_1_2_0-alpha_2_gnu_mpich_albany.yaml.txt
In case it's useful, here you can find the modules and scripts we use to build MALI on Perlmutter: https://github.com/sandialabs/Albany/tree/master/doc/LandIce/perlmutter
Hmmm, so the regular nightly spack build I have is failing to build Albany. The build log (spack-build-out.txt) is attached. There is an MPI error:
/projects/albany/nightlySpackBuild/spack/opt/spack/linux-rhel7-ivybridge/gcc-9.2.0/trilinos-for-albany-develop-hajhfstqrqodd4jdqghcn52umc6552sw/lib/libgtest.so.13.5 -lgfortran
libalbany_ut_main.so: undefined reference to `ompi_mpi_cxx_op_intercept'
libalbany_ut_main.so: undefined reference to `MPI::Win::Free()'
libalbany_ut_main.so: undefined reference to `MPI::Datatype::Free()'
libalbany_ut_main.so: undefined reference to `MPI::Comm::Comm()'
collect2: error: ld returned 1 exit status
make[2]: *** [tests/unit/string_utils] Error 1
Please comment if you have any thoughts. I'll see if this shows up in any of the other nightlies. It is not related to the Perlmutter error, which is a configure error, but we should resolve it.
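For anyone triaging a link failure like this, one option is to pull the unresolved symbols out of the build log and check how many belong to the MPI C++ bindings. The sketch below is only illustrative: it writes a sample log to a temporary file using lines copied from the failure above, and the variable names (LOG, symbols, cxx_count) are made up for the example. In practice LOG would point at the real spack-build-out.txt.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sample log lines copied from the failure above; in practice point
# LOG at the real spack-build-out.txt instead of a temporary file.
LOG="$(mktemp)"
cat > "$LOG" <<'EOF'
libalbany_ut_main.so: undefined reference to `ompi_mpi_cxx_op_intercept'
libalbany_ut_main.so: undefined reference to `MPI::Win::Free()'
libalbany_ut_main.so: undefined reference to `MPI::Datatype::Free()'
libalbany_ut_main.so: undefined reference to `MPI::Comm::Comm()'
collect2: error: ld returned 1 exit status
EOF

# Extract the unique symbol names between the backtick and the closing quote.
symbols="$(sed -n "s/.*undefined reference to \`\(.*\)'.*/\1/p" "$LOG" | sort -u)"

# Count the symbols that look like MPI C++ bindings (MPI:: or ompi_mpi_cxx_).
cxx_count="$(printf '%s\n' "$symbols" | grep -c -E '^(MPI::|ompi_mpi_cxx_)' || true)"

printf '%s\n' "$symbols"
echo "MPI C++ binding symbols: $cxx_count"
rm -f "$LOG"
```

If every unresolved symbol falls in that group, as it does for the lines above, the MPI library found at link time may not provide the C++ bindings (they were removed from the standard in MPI-3.0, and Open MPI builds them only when asked to). That is a guess from the symbol names, not something this thread confirms.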
@xylar : could you also please share your packages.py file, so I can compare it with the one I am making on perlmutter?
@xylar: so for me, Trilinos compiled. I used this compiler module:
module load gcc/11.2.0
and no other module TPLs (I will try that next). I am attaching my compilers.yaml file. I built using the following command:
spack --insecure install --dirty --keep-stage albany%gcc@11.1.0+mpas
Now, unfortunately, my build of Albany was not successful due to that export_albany.in error that appeared/disappeared for no apparent reason. I am not sure what to make of it. The error log (spack-build-out.txt) is attached. I could try again tomorrow from scratch to see if it goes away.
cwd: /pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-build-ae6poiv/dummy
Traceback (most recent call last):
  File "/pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-src/cmake/CreateExportAlbany", line 185, in <module>
    _main_func(__doc__)
  File "/pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-src/cmake/CreateExportAlbany", line 178, in _main_func
    run (bin_dir,install_lib_dir,generator)
  File "/pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-src/cmake/CreateExportAlbany", line 149, in run
    raise ValueError
ValueError
make[2]: *** [src/CMakeFiles/create_export_albany.dir/build.make:76: export_albany.in] Error 1
make[2]: Leaving directory '/pscratch/sd/i/ikalash/spackAlbany/spack-stage-albany-develop-ae6poivqdwvmpt7ksamtaciq4nqwoijf/spack-build-ae6poiv'
make[1]: *** [CMakeFiles/Makefile2:1316: src/CMakeFiles/create_export_albany.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 99%] Linking CXX executable dummy
I hacked the build to get it to compile and ran the tests. All tests passed. So I think we can get a working build on perlmutter with spack modulo some of these strange hiccups...
@ikalash, that sounds promising.
> could you also please share your packages.py file, so I can compare it with the one I am making on perlmutter?
I never create one. I rely on the environment yaml file entirely. If this is something that spack generates automatically, I can dig it up.
> I used this compiler module load gcc/11.2.0 and not any other module TPLs (I will try that next).
I need to use the same compilers and libraries as E3SM unless there is a strong reason to break that connection. Otherwise, our testing doesn't reflect what we end up seeing when we run E3SM with the same components. I realize this is less of an issue for MALI than for other components right now, but it's also a very big pain for me to maintain one compiler/mpi/library stack for MALI and a separate one for MPASO.
> I am attaching my compilers.yaml file.
If that got attached, I missed it.
Sorry about this. It is attached now. I don't know how to get spack to use pre-installed TPLs without a packages.yaml file. Perhaps we can discuss this at the Nov. 8 meeting.
Thanks!
My understanding is that's one of the main points of having an environment.yaml file: it lets you work with external modules, whereas you don't really have the level of control you need to make them useful any other way. But the truth is I don't ever do anything outside of an environment.
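For reference, a packages.yaml entry that points spack at a pre-installed, module-provided library might look roughly like the sketch below. This is not the file attached in this thread. The package name cray-mpich, the X.Y.Z version placeholder, and the cfg_dir scratch directory are all assumptions for illustration, and the real values would come from the modules loaded on the target machine.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory used as a spack configuration scope.
cfg_dir="$(mktemp -d)"

# X.Y.Z is a placeholder, not a real version: substitute whatever
# `module avail` reports on the target machine.
cat > "$cfg_dir/packages.yaml" <<'EOF'
packages:
  all:
    providers:
      mpi: [cray-mpich]
  cray-mpich:
    buildable: false
    externals:
    - spec: cray-mpich@X.Y.Z
      modules:
      - cray-mpich/X.Y.Z
EOF

cat "$cfg_dir/packages.yaml"

# To try it, pass the directory as an extra configuration scope, e.g.:
#   spack -C "$cfg_dir" install albany+mpas
```

Setting buildable: false tells spack to use the external rather than build its own copy. The same packages section can also sit inside an environment's yaml file under the top-level spack key, which seems to be the approach described above.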
So I created a packages.yaml file for perlmutter that reflects the packages we want to use on perlmutter for Albany (https://github.com/sandialabs/Albany/blob/master/doc/LandIce/perlmutter/pm_cpu_gnu_modules.sh), which is attached, and was able to build Albany with one caveat - there is again the error about export_albany.in:
cwd: /pscratch/sd/i/ikalash/spackAlbany2/spack-stage-albany-develop-464dkohafi6xoko6ukjqoxeae7pygw5c/spack-build-464dkoh/dummy
Traceback (most recent call last):
  File "/pscratch/sd/i/ikalash/spackAlbany2/spack-stage-albany-develop-464dkohafi6xoko6ukjqoxeae7pygw5c/spack-src/cmake/CreateExportAlbany", line 185, in <module>
    _main_func(__doc__)
  File "/pscratch/sd/i/ikalash/spackAlbany2/spack-stage-albany-develop-464dkohafi6xoko6ukjqoxeae7pygw5c/spack-src/cmake/CreateExportAlbany", line 178, in _main_func
    run (bin_dir,install_lib_dir,generator)
  File "/pscratch/sd/i/ikalash/spackAlbany2/spack-stage-albany-develop-464dkohafi6xoko6ukjqoxeae7pygw5c/spack-src/cmake/CreateExportAlbany", line 149, in run
    raise ValueError
ValueError
make[2]: *** [src/CMakeFiles/create_export_albany.dir/build.make:76: export_albany.in] Error 1
make[2]: Leaving directory '/pscratch/sd/i/ikalash/spackAlbany2/spack-stage-albany-develop-464dkohafi6xoko6ukjqoxeae7pygw5c/spack-build-464dkoh'
make[1]: *** [CMakeFiles/Makefile2:1316: src/CMakeFiles/create_export_albany.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 99%] Linking CXX executable dummy
@bartgol I am afraid this is a real issue that is showing up in the spack build that needs to be debugged, as I am seeing it again on a number of platforms. I will open an issue and assign it to you. I'm happy to provide instructions for reproducing the problem, if you tell me a machine of choice (e.g., blake, cee, etc.). I'd first verify that the error shows up on that machine before handing things off to you.
@xylar, can you please try building on perlmutter with my compilers.yaml and packages.yaml files and see how far things get for you? Hopefully everything will build (except Albany at the very end due to the aforementioned error). I am using the following line to build:
spack --insecure install --dirty --keep-stage albany%gcc@11.1.0+mpas
@ikalash, thank you! I will try that out and let you know how it goes.
@ikalash feel free to assign me. Unfortunately I have very few cycles in the short term, so I won't get to it this week (more likely later next week).
I have never used spack, so any instructions to reproduce would be greatly appreciated. Any OHPC machine is fine (blake, weaver, mappy). I don't know if I can log in to the cee ones. I think I tried not long ago and it bounced me.
FYI, I've checked in the files to the Albany repo, and they can be found here: https://github.com/sandialabs/Albany/tree/master/doc/LandIce/perlmutter/spack
Ok, thanks. I can hop on PM when I debug this, so no need to reproduce on another machine.
@bartgol I will reproduce this on blake and send you instructions. Looking at this next week is fine. It'll make more sense after the spack tutorial anyway. Please stay tuned. The perlmutter stuff was more for Xylar and for future reference.
@ikalash, I don't usually build with a compilers.yaml and a packages.yaml, so for the tutorial next week it might make sense for you to cover this approach. I'll still try to figure out how it's done in the meantime.
@xylar, that's the plan! For now you can see the wiki page I created / updated: https://github.com/sandialabs/Albany/wiki/Building-Albany-using-SPACK
Perfect, thanks!
I added a missing module (libfabric) and now get the same error as in #6, so I'm going to close this issue in favor of that more informative one.
@ikalash and @mperego,
Hitting you on GitHub as well as slack.
I'm finally building compass on Perlmutter! I ran into trouble building the spack trilinos-for-albany package and was hoping you could help me out. The error I get isn't very helpful (at least to me):
Full build and logs are here:
Let me know if I messed up the permissions.
On Slack, I attached the files I use to build (not allowed on GitHub). You should be able to run them yourselves if you point to appropriate directories (i.e. on your pscratch, not mine).
This isn't particularly urgent but would be very nice to solve before Cori disappears in January.