idaholab / moose

Multiphysics Object Oriented Simulation Environment
https://www.mooseframework.org
GNU Lesser General Public License v2.1

Installing Moose with external dependencies? #26041

Open mboisson opened 11 months ago

mboisson commented 11 months ago

Reason

We already have many versions of PETSc, libMesh, WASP and others installed. Moose should not have to download and install other versions.

Design

Let Moose's own "configure" script detect existing dependencies and use those instead of insisting on downloading, updating and building its own.

milljm commented 11 months ago

Are we talking about the update_and_rebuild_petsc|libmesh|wasp.sh scripts? If so, these scripts build/install those libraries from their respective submodules at the HASH specific to that time period in MOOSE's history. The versions have been vetted as stable/working (there is only one precise version of each).

Or is this Conda package related?

If this is not related to the above, I apologize. I am not understanding where the 'other versions' are coming from.

mboisson commented 11 months ago

This is about the "update_and_rebuild" scripts. This is not conda related.

This is for installing on an HPC cluster. We already have thousands of different modules for scientific software and libraries (https://docs.alliancecan.ca/wiki/Available_software), including the dependencies of Moose. We know very well how to properly build all of those libraries for our environment. We do not want Moose to re-download and reinstall those (and the build scripts do not work anyway).

mboisson commented 11 months ago

Another issue that a user found is that update_and_rebuild_libmesh looks for libraries in hard-coded locations, instead of using the build tools to find them.

For example, it is barfing on:

-- Done configuring core library features ---
---------------------------------------------
---------------------------------------------
----- Configuring for optional packages -----
---------------------------------------------
checking for built-in XDR support... no
checking for XDR support in /usr/include/tirpc... no
configure: error: *** XDR was not found, but --enable-xdr-required was specified.
Running make -j 6...
make: *** No targets specified and no makefile found.  Stop.

/usr/include/tirpc does not exist on our clusters. Anything under /usr or /lib* on our clusters should not be used. We provide all of those in a separate location, which is automatically found by our compilers and build tools.

milljm commented 11 months ago

I can't speak for how libMesh searches for dependencies but @roystgnr would know! I am sure there are some influential environment variables I am unaware of.

Long story short:

MOOSE should be good to go if you provide paths to the following three libraries:

PETSC_DIR=/some/path/to/petsc
LIBMESH_DIR=/some/path/to/libmesh
WASP_DIR=/some/path/to/wasp

If these are set, there is no need to run update_and_rebuild_petsc|libmesh|wasp.
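As a rough pre-flight check before building, the three variables above can be verified from the shell. This is only a sketch: the function name `check_moose_deps` is made up for illustration, and it only confirms that each variable is set and names an existing directory, not that the library there is a compatible version.

```shell
# Hypothetical helper (not part of MOOSE): report which of the three
# variables are unset or do not point at an existing directory.
# Returns 0 only if all three look usable.
check_moose_deps() {
    rc=0
    for name in PETSC_DIR LIBMESH_DIR WASP_DIR; do
        # Indirect lookup of the variable called $name (POSIX sh compatible).
        eval "dir=\${$name:-}"
        if [ -z "$dir" ]; then
            echo "$name is not set" >&2
            rc=1
        elif [ ! -d "$dir" ]; then
            echo "$name=$dir is not a directory" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example use, with placeholder paths:
#   export PETSC_DIR=/some/path/to/petsc
#   export LIBMESH_DIR=/some/path/to/libmesh
#   export WASP_DIR=/some/path/to/wasp
#   check_moose_deps && echo "dependencies look usable"
```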

More Details

PETSc

PETSC_DIR must be set and pointing to the PETSc installation directory. One constraint: PETSc must be built with HYPRE enabled. MOOSE will die a horrible death if not. 😄

Some aspects of MOOSE or MOOSE-based applications will make use of other contributions to PETSc, like SLEPc, MUMPS, strumpack, metis, parmetis, super-lu, etc. It's probably best to enable them all, which I am sure is already being done.
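One way to check the HYPRE constraint on an existing PETSc installation is to look for the `PETSC_HAVE_HYPRE` macro that PETSc's configure step writes into `petscconf.h`. The sketch below assumes the header sits either directly under `include/` (a prefix install) or under `$PETSC_ARCH/include/` (an in-place build); the function name `petsc_has_hypre` is hypothetical.

```shell
# Hypothetical helper: succeed if the PETSc installation at $1
# (optionally with PETSC_ARCH given as $2) was configured with HYPRE.
petsc_has_hypre() {
    petsc_dir=$1
    petsc_arch=${2:-}
    for conf in "$petsc_dir/include/petscconf.h" \
                "$petsc_dir/$petsc_arch/include/petscconf.h"; do
        # Require whitespace after the macro name so that longer macro
        # names sharing the same prefix do not match.
        if [ -f "$conf" ] && grep -Eq '^#define PETSC_HAVE_HYPRE[[:space:]]' "$conf"; then
            return 0
        fi
    done
    return 1
}

# Example use:
#   petsc_has_hypre "$PETSC_DIR" "$PETSC_ARCH" || echo "PETSc lacks HYPRE" >&2
```

The same pattern should work for other optional packages by swapping the macro name, provided the installed header defines one for them.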

libMesh

LIBMESH_DIR must be set and pointing to the libMesh installation directory.

MOOSE calls libmesh-config to figure out how to properly build on your system:

$LIBMESH_DIR/bin/libmesh-config --cppflags --cxxflags --include --cxx --cc --fc
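To see what a given libMesh installation will hand to MOOSE, the same tool can be queried by hand. The sketch below wraps the `--cxx` query from the command above in a small function that fails early if `libmesh-config` is missing; the function name `libmesh_compiler` is made up for illustration.

```shell
# Hypothetical helper: print the C++ compiler recorded by the libMesh
# installation rooted at $1, or fail if libmesh-config is not there.
libmesh_compiler() {
    config="$1/bin/libmesh-config"
    if [ ! -x "$config" ]; then
        echo "libmesh-config not found under $1/bin" >&2
        return 1
    fi
    "$config" --cxx
}

# Example use:
#   libmesh_compiler "$LIBMESH_DIR"
```

If the compiler printed here differs from the one used to build PETSc, that is a hint the two installations were not built in the same environment.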

WASP

WASP_DIR must be set and pointing to the WASP installation directory.

That is pretty much the gist of what MOOSE requires from these three libraries. However you provide these, MOOSE shouldn't complain about where they reside. Your mileage may vary when problem solving, depending on what version of these libraries you load. We do extensive testing on the versions of these libraries at their submodule HASH (millions of tests per week), which is what is used when you run update_and_rebuild_petsc|libmesh|wasp.

mboisson commented 11 months ago

Thanks, I will tell my user to try that. Once those are defined, is the way to build Moose still to go into the test folder and run make there? (That seems a little strange to me.)

permcody commented 11 months ago

I understand the frustration of dealing with tons of seemingly duplicate libraries and packages to build an application like MOOSE. One thing to keep in mind is that just because you have MPI or PETSc installed, it doesn't mean just pointing to all those installations will necessarily work with MOOSE. As you may or may not be aware, there's a very close relationship between the packages in the software stack that MOOSE relies upon. Each package in the stack relies on the packages below it, so everything must be built in a consistent environment. If you build PETSc with one MPI version on your system and libMesh with another, they will not work together. They may not even work if slightly different compilers or other dependencies changed between those builds. Everything must work together, which is why we generally recommend you build everything from the ground up. Also, we use a very large set of optional packages in both libMesh and PETSc that may not be configured and built into your system-supplied versions.

That all being said, it's frustrating when the build scripts don't work. Generally there are many environment variables that can be set to aid both PETSc and libMesh in finding the dependencies they need. You may need to export a few of those in your environment to succeed, as Jason was pointing out.

Finally, for your last question: you do not have to build MOOSE from the test directory, although people frequently find that a convenient way to do it. MOOSE is a library and you can build it from the framework directory. However, if you do that you won't have anything to run or test with, hence the recommendation to build in the test directory, which first builds the framework library and then a small test application that you can use to run the integration test suite. For system admins, we recommend going into the modules directory and building there, which builds the framework and all of the physics modules plus a driver application. That application can then be installed with make install into a system location where users can make use of the binary. Users can even supply command line flags to the binary to copy examples or the test suite into their working directory so they can play around with working input files.

Let us know if you have any other questions on getting up and going.

mboisson commented 11 months ago

I understand the frustration of dealing with tons of seemingly duplicate libraries and packages to build an application like MOOSE. One thing to keep in mind is that just because you have MPI or PETSc installed, it doesn't mean just pointing to all those installations will necessarily work with MOOSE. As you may or may not be aware, there's a very close relationship between the packages in the software stack that MOOSE relies upon. Each package in the stack relies on the packages below it, so everything must be built in a consistent environment. If you build PETSc with one MPI version on your system and libMesh with another, they will not work together. They may not even work if slightly different compilers or other dependencies changed between those builds. Everything must work together, which is why we generally recommend you build everything from the ground up. Also, we use a very large set of optional packages in both libMesh and PETSc that may not be configured and built into your system-supplied versions.

We are pretty used to dealing with these concerns. As HPC package managers and support staff, we deal with such stuff on a daily basis. With a typical well designed module system on an HPC cluster, mixing MPI/compiler/dependencies is pretty much impossible, so that is not a concern. We also build PETSc and libMesh with a large number of optional packages (and if some are missing, we can rebuild them). We would rather know what those required optional packages/options are, than modify hard-coded installation scripts that make assumptions about where and how things are installed.

That all being said, it's frustrating when the build scripts don't work. Generally there are many environment variables that can be set to aid both PETSc and libMesh in finding the dependencies they need. You may need to export a few of those in your environment to succeed, as Jason was pointing out.

Finally, for your last question: you do not have to build MOOSE from the test directory, although people frequently find that a convenient way to do it. MOOSE is a library and you can build it from the framework directory. However, if you do that you won't have anything to run or test with, hence the recommendation to build in the test directory, which first builds the framework library and then a small test application that you can use to run the integration test suite. For system admins, we recommend going into the modules directory and building there, which builds the framework and all of the physics modules plus a driver application. That application can then be installed with make install into a system location where users can make use of the binary. Users can even supply command line flags to the binary to copy examples or the test suite into their working directory so they can play around with working input files.

Thanks for this insight.

mboisson commented 11 months ago

OK, that route does not work because, apparently, Moose requires libMesh code that has never been released, i.e. development code (https://github.com/libMesh/libmesh/issues/3708). The build log with libMesh version 1.7.1 is here: https://gist.github.com/mboisson/386bc5d8b88020330d2c2651928cfa89

And the build script fails with:

/home/mboisson/tmp/moose-2023-11-08/scripts/configure_libmesh.sh: line 60: ../configure: No such file or directory
Running make -j 6...
make: *** No targets specified and no makefile found.  Stop.

so I am in a dead end.

This is with the latest tag https://github.com/idaholab/moose/tags (2023-11-08)

lindsayad commented 11 months ago

And the build script fails with:

What is the command that you're running?

mboisson commented 11 months ago

update_and_rebuild_libmesh

lindsayad commented 11 months ago

I'm surprised the script failed in that way. Did the libmesh submodule get checked out?

mboisson commented 11 months ago

I don't believe it did. I simply downloaded the tarball from the release page and ran the script. I think the release tarballs are not complete.

milljm commented 11 months ago

If the tag was downloaded as a tarball, and not checked out with git (git checkout <tag>), then I can very much see this happening, as tarballs of tags will not contain any submodules.
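A quick way to tell the two cases apart before running the build script is to test whether the libMesh submodule directory actually contains sources. This sketch assumes the submodule lives at `libmesh/` under the MOOSE root and ships a top-level `configure` script, which is inferred from the `../configure: No such file or directory` error above; the function name is hypothetical.

```shell
# Hypothetical helper: succeed if the libMesh submodule under the MOOSE
# source tree at $1 has been populated (i.e. its configure script exists).
libmesh_submodule_ready() {
    [ -f "$1/libmesh/configure" ]
}

# Example use, from the top of a MOOSE source tree:
#   libmesh_submodule_ready . || echo "libmesh submodule is empty" >&2
```

In a git clone, an empty submodule can normally be filled with `git submodule update --init --recursive libmesh`; in an unpacked tag tarball there is no `.git` data to do that with.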

mboisson commented 11 months ago

Ah, I see. Do you have actual tarball releases available? Git clones aren't checksummable (because the checksum changes when creating a tarball of the repository). We checksum every source tarball we build to be sure it has not changed if we ever need to reinstall it.
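For context, the kind of check being described can be sketched as below: a source tarball is accepted only if its SHA-256 digest matches a value recorded when it was first vetted. This assumes GNU coreutils `sha256sum` is available; the function name `verify_tarball` is made up, and the real tooling used on the cluster may differ.

```shell
# Hypothetical helper: succeed only if the SHA-256 digest of file $1
# equals the expected hex digest $2.
verify_tarball() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Example use (the digest is a placeholder to be recorded once):
#   verify_tarball moose-source.tar.gz "<recorded sha256>" || exit 1
```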

milljm commented 11 months ago

I am not sure there is going to be an easy way to do this without the use of a cloned repository. All of the submodule versioning data is stored in the .git meta directory.

Some of our dependencies also do not revolve around releases. I am scratching my head as to how we would represent these dependencies as 'releases' to be obtained as tarball links from within the MOOSE repository (like in that configure script that failed). I suppose we can try to introduce a URL instead, like so:

curl -L -O https://github.com/libMesh/libmesh/archive/e4812f5b9245831473dd269d4bfd5e8485f536b4.tar.gz

Which would represent the repository HASH at the time (and a means of obtaining that version).

The other tricky bit is that libMesh, for example, uses submodules itself, and so on recursively (MetaPhysicL, TIMPI, etc.), all of which will be required.

I don't think PETSc will be an issue. They have a versioning scheme already. WASP may be an issue, I haven't looked into that submodule yet.
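The URL idea above could be sketched as a small function that builds the GitHub archive link for a repository at a pinned commit, following the pattern of the curl example. The function name `github_archive_url` is hypothetical, and as noted, such an archive would not contain the contents of any submodules.

```shell
# Hypothetical helper: print the GitHub archive URL for repository $1
# ("owner/name") at commit hash $2.
github_archive_url() {
    printf 'https://github.com/%s/archive/%s.tar.gz\n' "$1" "$2"
}

# Example use:
#   curl -L -O "$(github_archive_url libMesh/libmesh e4812f5b9245831473dd269d4bfd5e8485f536b4)"
```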

mboisson commented 11 months ago

Indeed, I realized that libMesh itself is using submodules, and they have not created a release in 18 months. Typically, projects that use submodules and make releases will create tarballs separate from what GitHub creates automatically, which will include all submodules already checked out at the proper version.

PETSc is indeed not an issue as they already make versioned releases.

I am currently building libmesh with the update_and_rebuild script (after doing a git clone). It is rebuilding a lot of things which we already have, which we don't like either, but given that they don't yet have a proper release (hopefully soon), we don't have much of a choice.

For libmesh to build, I had to alter CPPPATH to add -I<actual/path/to/include/tirpc> because they hardcode /usr/include/tirpc (https://github.com/libMesh/libmesh/issues/3709) in their check.

I also specified BOOST_DIR, EIGEN_DIR, GLPK_DIR for libmesh to actually use some of the packages we already have.

I am not yet at WASP.

mboisson commented 11 months ago

libmesh built ok this time, but when running make in the test folder, it compiles for a while, but there is this error:

Compiling C++ (in opt mode) /home/mboisson/tmp/moose/framework/src/base/MooseApp.C...
/home/mboisson/tmp/moose/framework/src/base/MooseApp.C: In member function ‘virtual void MooseApp::setupOptions()’:
/home/mboisson/tmp/moose/framework/src/base/MooseApp.C:1033:17: error: cannot declare variable ‘moose_server’ to be of abstract type ‘MooseServer’
 1033 |     MooseServer moose_server(*this);
      |                 ^~~~~~~~~~~~
In file included from /home/mboisson/tmp/moose/framework/src/base/MooseApp.C:54:
/home/mboisson/tmp/moose/framework/build/header_symlinks/MooseServer.h:30:7: note:   because the following virtual functions are pure within ‘MooseServer’:
   30 | class MooseServer : public wasp::lsp::ServerImpl
      |       ^~~~~~~~~~~
In file included from /home/mboisson/tmp/moose/framework/build/header_symlinks/MooseServer.h:16,
                 from /home/mboisson/tmp/moose/framework/src/base/MooseApp.C:54:
/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/wasp/4.0.3/include/wasplsp/ServerImpl.h:219:18: note:     ‘virtual bool wasp::lsp::ServerImpl::gatherDocumentFormattingTextEdits(wasp::DataArray&, int, bool)’
  219 |     virtual bool gatherDocumentFormattingTextEdits(
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make: *** [/home/mboisson/tmp/moose/framework/build.mk:145: /home/mboisson/tmp/moose/framework/src/base/MooseApp.x86_64-pc-linux-gnu.opt.lo] Error 1

mboisson commented 11 months ago

Ah, the version of WASP which we have is too new (version 4); I will try with WASP 3.

Edit: successful build with wasp/3.1.4