fangohr / octopus-in-spack

Develop Octopus in spack (software packaging)
BSD 3-Clause "New" or "Revised" License
3 stars 4 forks source link

Separate serial and MPI dependencies #52

Closed iamashwin99 closed 2 years ago

iamashwin99 commented 2 years ago

Octopus can be compiled with and without MPI, so this PR removes MPI as a dependency, introduces it as a variant, and chooses the appropriate compilers based on the variant chosen. The PR also compiles fftw, libvdwxc with and without MPI.

fangohr commented 2 years ago

Very good, thank you.

As it is with this PR, we are not building the MPI variant of octopus in our tests. (See https://github.com/fangohr/octopus-in-spack/blob/main/Dockerfile#L64).

It would be good to build it, as - presumably - that is the more complicated build.

Options I can think of:

  1. first build serial version (https://github.com/fangohr/octopus-in-spack/blob/main/Dockerfile#L64), then build a second variant of octopus with the parallel version. Running tests may be tricky: we need to tell spack with version we want to run for the tests (I guess this could be done, I just haven't done it yet). On the positive side, this would re-use many of the 3rd party libraries we have built already.

  2. introduce a new workflow: cleaner, but also duplication, and duplication of carbon emitting builds (as we will build zlib twice: once for the ~mpi build and once for the +mpi build (in different containers).

  3. Only test the MPI version: if that works, presumably the serial version will also work.

Views?

iamashwin99 commented 2 years ago

Let me try the first option to see how it goes, if it takes too long with both versions compiled on the same container, we can split it into two workflows and keep it clean.

iamashwin99 commented 2 years ago

The updated dockerfile, take about the same time (about 5 mins more) to compile, while also testing both the variants

iamashwin99 commented 2 years ago

@fangohr can we re-run the spack-develop workflow, which failed recently ( looks like due to network issues)

iamashwin99 commented 2 years ago

I noticed for the MPI version, spack is not recompiling octopus but rather using the same instance as from the serial one, converting this PR into a draft until we can solve this.

fangohr commented 2 years ago

Well spotted. The problem is visible here: https://github.com/fangohr/octopus-in-spack/pull/52/checks#step:3:931

This means that when we request the 'serial' version, spack is already activating mpi. So the line

https://github.com/fangohr/octopus-in-spack/blob/861de44537709ecfb068b3cf0455836655f96518/Dockerfile#L71

(and also the one for the spack install) needs to be changed to include ~mpi.

Probably it is the default to use +mpi (see https://github.com/fangohr/octopus-in-spack/blob/861de44537709ecfb068b3cf0455836655f96518/spack/package.py#L41).

iamashwin99 commented 2 years ago

Thanks for your suggestion, it solves the problem. The latest CI logs show that both MPI and serial variants are built and tested without errors.

fangohr commented 2 years ago

Good progress; thank you.

Would it be consistent and a good idea to explicitly use the switches for all the variants that the package provides (https://github.com/fangohr/octopus-in-spack/blob/main/spack/package.py#L41) even if we just repeat the default (such as for ~debug). This way, the Dockerfile command line (for spack install) would also be explicit documentation about the variants that are available.

On the other hand, it would be a mixture between overriding some defaults (such as ~mpi) and using some of the defaults (at least for the variant switches we have not listed yet such as ~debug).

I am undecided. Any views?

iamashwin99 commented 2 years ago

Interesting, same error as the one with copy even though the name space is still correctly used as fs.copy. The only change that happened from the previous commit is moving the location of import llnl.util.filesystem as fs to the top of from spack.package import * instead of below it. This reformatting was done however by black

fangohr commented 2 years ago

Interesting, same error as the one with copy even though the name space is still correctly used as fs.copy. The only change that happened from the previous commit is moving the location of import llnl.util.filesystem as fs to the top of from spack.package import * instead of below it. This reformatting was done however by black

Probably related to the import * statement. They should be avoided as one can easily accidentally override an object that as imported already. (But are common in spack.)