environ-developers / Environ

A Fortran package and library for continuum embedding calculations in materials and molecules
http://www.quantum-environ.org
GNU General Public License v2.0

Containerization of Environ build #9

lattice737 closed 3 weeks ago

lattice737 commented 1 month ago

This is a Dockerfile build of QE with Environ. It aims to mitigate "works on my machine" problems by automating the source downloads, dependency installation, and compilation in an isolated, self-contained environment that behaves like an independent OS. I chose Ubuntu arbitrarily; best practice for a distributed library is to offer Dockerfiles for multiple OSes. I included the SSSP pseudopotential downloads and a Python installation purely as a convenience. I don't know how helpful they are for regular Environ use, but they could be useful for demos.

I've included instructions in the comments at the start of the file, which should be friendly for folks who are unfamiliar with Docker. I'll include them here for completeness:

Use containerized installation
0) Install Docker (https://docs.docker.com/engine/install/) or Docker Desktop (https://www.docker.com/products/docker-desktop/)
1) Clone repo:     git clone https://github.com/environ-developers/Environ.git && cd Environ
2) Build image:    docker build -t environ-sandbox . && docker image ls
3) Run container:  docker run -i -p 8000:80 environ-sandbox && docker ps
4) Access container:
   a) Docker Desktop
   b) CLI: docker exec -it <container ID> bash
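
One detail in step 3: `docker run -i` stays in the foreground, so the `docker ps` chained with `&&` only runs after the container exits. A minimal sketch of a detached variant follows, assuming the image's default command is a shell that stays alive with a TTY attached (not confirmed in this thread). The container name `environ-dev` and the `run` dry-run helper are hypothetical; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of steps 2-4 with a detached, named container.
# Prints the commands by default; set DRY_RUN=0 to actually execute them.
IMAGE=environ-sandbox   # image tag from the instructions above
NAME=environ-dev        # hypothetical container name

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

container_workflow() {
    run docker build -t "$IMAGE" .
    # -d detaches, so the next command runs while the container is still up.
    run docker run -d -i -t --name "$NAME" -p 8000:80 "$IMAGE"
    run docker ps --filter "name=$NAME"
    # Using the name avoids having to look up the container ID.
    run docker exec -it "$NAME" bash
}

container_workflow
```

Naming the container is a convenience choice; the `<container ID>` form in step 4b works just as well.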

I only attempted to install PW, and total image build time is usually about 5 min for me. After that, it's plug and play. I tried to run a few examples, but it took a long time, and I noticed a few exceptions and warnings I didn't understand:

# ./run_example.sh

/app/q-e/Environ/examples/qe/pw/sccs : starting

This example shows how to use pw.x to calculate the solvation energy 
and other solvent related quantites for a water molecule in water
using a fully self consistent dielectric defined on the electronic
density according to 

   O. Andreussi, I. Dabo and N. Marzari, J. Chem. Phys. 136, 064102 (2012) 

Two equivalent calculations are performed, using two different 
algorithms to obtain the self consistent dielectic as described in 

G. Fisicaro, L. Genovese, O. Andreussi, N. Marzari and S. Goedecker,
               J. Chem. Phys. 144, 014103 (2016)

  executables directory: /app/q-e/bin
  pseudo directory:      /app/q-e/pseudo
  temporary directory:   /app/q-e/tempdir
  checking that needed directories and files exist...
Downloading O.pbe-rrkjus.UPF to /app/q-e/pseudo...
Downloading H.pbe-rrkjus.UPF to /app/q-e/pseudo... done

  running pw.x as:   /app/q-e/bin/pw.x  -nk 1 -nd 1 -nb 1 -nt 1  --environ

  cleaning /app/q-e/tempdir... done
  running the relax calculation in vacuum with direct solver
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG
 done
  cleaning /app/q-e/tempdir... done
  running the relax calculation in water with fixed-point solver
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_OVERFLOW_FLAG
STOP 1
Error condition encountered during test: exit status = 1
Aborting

I'm not sure whether these are caused by the build or by some other issue. I also could not run the tests because of a syntax error in the driver:

# ./run-pw.sh
./run-pw.sh: line 35: [: =: unary operator expected
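
For context, `[: =: unary operator expected` is the message bash prints when an unquoted variable inside `[ ... ]` expands to nothing, so the test becomes `[ = value ]`. Line 35 of `run-pw.sh` is not shown in this thread, so the sketch below uses a hypothetical variable `mode` purely to reproduce the pattern. One plausible cause is that the script expects a variable set by the test harness and was invoked directly.

```shell
#!/bin/sh
# Hypothetical variable; the real one on line 35 of run-pw.sh is not shown here.
mode=""

# Unquoted: with mode empty the shell evaluates `[ = serial ]`, which is a
# syntax error for the test command (the message is silenced here).
check_unquoted() {
    [ $mode = "serial" ] 2>/dev/null
}

# Quoted: the empty string is still passed as an argument, so the
# comparison is well-formed and simply evaluates to false.
check_quoted() {
    [ "$mode" = "serial" ]
}

check_unquoted || echo "unquoted: exit status $? (test syntax error)"
check_quoted   || echo "quoted:   exit status $? (plain false)"
```

Quoting the expansion, or giving the variable a default with `${mode:-}`, is the usual fix when a variable may legitimately be empty.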

I'm not sure whether the test suite and examples are up to date, or whether they are incompatible with the OS or installation configuration I chose. I'm hoping you see something I don't, so we can tweak this one instance for everyone to use (or at least the Linux-friendly audience). With that in mind, feel free to edit the Dockerfile yourself if you would build it differently; this is just my best guess (haha).

Anyway, I noticed quite a few questions about installation in the support group, so I'm hoping this helps at least one person. Installation seems to be a barrier for some users, so I think this approach could bring a considerable benefit.

lattice737 commented 1 month ago

I was running the tests incorrectly before. Test results:

Using executable: /app/q-e/Environ/tests/..//tests/run-pw.sh.
Test id: 240625.
Benchmark: GIT.

pw_confine - pw.in (arg(s): confine-sccs.in): Passed.
pw_confine - pw.in (arg(s): confine-ss.in): Passed.
pw_confine - pw.in (arg(s): confine-system.in): Passed.
pw_dielectric - neutral.in (arg(s): dielectric-sccs-default.in): Passed.
pw_dielectric - neutral.in (arg(s): dielectric-sccs-iter-aux.in): Passed.
pw_dielectric - neutral.in (arg(s): dielectric-sccs-psd.in): Passed.
pw_dielectric - neutral.in (arg(s): dielectric-ss-default.in): **FAILED**.
f1
    ERROR: absolute error 1.03e-02 greater than 1.00e-03. (Test: 0.0018.  Benchmark: 0.0121.)

pw_dielectric - neutral.in (arg(s): dielectric-ss-iter-aux.in): **FAILED**.
f1
    ERROR: absolute error 1.05e-02 greater than 1.00e-03. (Test: 0.0017.  Benchmark: 0.0122.)

pw_dielectric - charged.in (arg(s): dielectric-sccs-default.in): Passed.
pw_dielectric - charged.in (arg(s): dielectric-sccs-iter-aux.in): Passed.
pw_dielectric - charged.in (arg(s): dielectric-sccs-psd.in): Passed.
pw_dielectric - charged.in (arg(s): dielectric-ss-default.in): **FAILED**.
f1
    ERROR: absolute error 6.40e-03 greater than 1.00e-03. (Test: 0.3552.  Benchmark: 0.3488.)

pw_dielectric - charged.in (arg(s): dielectric-ss-iter-aux.in): **FAILED**.
f1
    ERROR: absolute error 6.40e-03 greater than 1.00e-03. (Test: 0.355.  Benchmark: 0.3486.)

pw_dielectric - neutral.in (arg(s): dielectric-sys-default.in): Passed.
pw_dielectric - neutral.in (arg(s): dielectric-sys-iter-aux.in): Passed.
pw_dielectric - charged.in (arg(s): dielectric-sys-default.in): Passed.
pw_dielectric - charged.in (arg(s): dielectric-sys-iter-aux.in): Passed.
pw_electrolyte - pw.in (arg(s): electrolyte-lmpb-sccs-stern_full.in): Passed.
pw_electrolyte - pw.in (arg(s): electrolyte-lmpb-sccs-stern_ions.in): Passed.
pw_electrolyte - pw.in (arg(s): electrolyte-lmpb-ss-stern_full.in): Passed.
pw_electrolyte - pw.in (arg(s): electrolyte-lmpb-ss-stern_ions.in): Passed.
pw_electrolyte - pw.in (arg(s): electrolyte-lpb-sccs.in): Passed.
pw_electrolyte - pw.in (arg(s): electrolyte-lpb-ss.in): Passed.
pw_externals - isolated.in (arg(s): externals_0d_vacuum_default.in): Passed.
pw_externals - isolated.in (arg(s): externals_2d_vacuum_default.in): Passed.
pw_externals - isolated.in (arg(s): externals_0d_dielectric_default.in): Passed.
pw_externals - isolated.in (arg(s): externals_2d_dielectric_default.in): Passed.
pw_externals - slab.in (arg(s): externals_0d_vacuum_default.in): Passed.
pw_externals - slab.in (arg(s): externals_2d_vacuum_default.in): Passed.
pw_externals - slab.in (arg(s): externals_0d_dielectric_default.in): Passed.
pw_externals - slab.in (arg(s): externals_2d_dielectric_default.in): Passed.
pw_field-aware - pw.in (arg(s): fa-ss.in): Passed.
pw_gcs - pw.in (arg(s): gcs-solvent.in): Passed.
pw_gcs - pw.in (arg(s): gcs-vacuum.in): Passed.
pw_ms - pw.in (arg(s): ms-solvent.in): Passed.
pw_ms - pw.in (arg(s): ms-vacuum.in): Passed.
pw_mt - neutral.in (arg(s): dielectric-default.in): Passed.
pw_mt - charged.in (arg(s): dielectric-default.in): Passed.
pw_mt - isolated.in (arg(s): externals_0d_vacuum_default.in): Passed.
pw_mt - isolated.in (arg(s): externals_0d_dielectric_default.in): Passed.
pw_periodic - neutral.in (arg(s): periodic-vacuum-default.in): Passed.
pw_periodic - charged.in (arg(s): periodic-vacuum-default.in): Passed.
pw_periodic - neutral.in (arg(s): periodic-dielectric-default.in): Passed.
pw_periodic - charged.in (arg(s): periodic-dielectric-default.in): Passed.
pw_regions - pw.in (arg(s): insphere-default.in): Passed.
pw_regions - pw.in (arg(s): outslab-default.in): Passed.
pw_slab - neutral.in (arg(s): periodic-vacuum-default.in): Passed.
pw_slab - neutral.in (arg(s): periodic-dielectric-default.in): Passed.
pw_slab - charged.in (arg(s): periodic-vacuum-default.in): Passed.
pw_slab - charged.in (arg(s): periodic-dielectric-default.in): Passed.
pw_solvent-aware - pw.in (arg(s): local-sccs.in): Passed.
pw_solvent-aware - pw.in (arg(s): sa-sccs.in): Passed.
pw_solvent-aware - pw.in (arg(s): local-ss.in): **FAILED**.
f1
    ERROR: absolute error 6.40e-03 greater than 1.00e-03. (Test: 0.0439.  Benchmark: 0.0503.)

pw_solvent-aware - pw.in (arg(s): sa-ss.in): **FAILED**.
f1
    ERROR: absolute error 7.10e-03 greater than 1.00e-03. (Test: 0.0439.  Benchmark: 0.051.)

pw_spin - radical.in (arg(s): vacuum-pcc.in): Passed.
pw_spin - radical.in (arg(s): dielectric-pcc.in): Passed.
pw_surface - pw.in (arg(s): surface-sccs-default.in): Passed.
pw_surface - pw.in (arg(s): surface-sccs-fft.in): Passed.
pw_surface - pw.in (arg(s): surface-ss-default.in): **FAILED**.
f1
    ERROR: absolute error 3.30e-03 greater than 1.00e-03. (Test: 0.0305.  Benchmark: 0.0338.)

pw_surface - pw.in (arg(s): surface-ss-fft.in): **FAILED**.
f1
    ERROR: absolute error 3.10e-03 greater than 1.00e-03. (Test: 0.0307.  Benchmark: 0.0338.)

pw_surface - pw.in (arg(s): surface-sys-default.in): Passed.
pw_surface - pw.in (arg(s): surface-sys-fft.in): Passed.
pw_volume - pw.in (arg(s): volume-sccs-default.in): Passed.
pw_volume - pw.in (arg(s): volume-ss-default.in): **FAILED**.
f1
    ERROR: absolute error 2.64e-02 greater than 1.00e-03. (Test: 0.0305.  Benchmark: 0.0569.)

pw_volume - pw.in (arg(s): volume-sys-default.in): Passed.
pw_water - neutral.in (arg(s): water-neutral-sccs-default.in): Passed.
pw_water - neutral.in (arg(s): water-neutral-ss-default.in): **FAILED**.
f1
    ERROR: absolute error 3.00e-03 greater than 1.00e-03. (Test: 0.1444.  Benchmark: 0.1414.)

pw_water - cation.in (arg(s): water-cation-sccs-default.in): Passed.
pw_water - cation.in (arg(s): water-cation-ss-default.in): **FAILED**.
f1
    ERROR: absolute error 3.50e-03 greater than 1.00e-03. (Test: 0.3458.  Benchmark: 0.3423.)

pw_water - anion.in (arg(s): water-anion-sccs-default.in): Passed.
pw_water - anion.in (arg(s): water-anion-ss-default.in): **FAILED**.
f1
    ERROR: absolute error 1.10e-02 greater than 1.00e-03. (Test: 0.018.  Benchmark: 0.007.)

All done. ERROR: only 59 out of 71 tests passed.
Failed tests in:
    /app/q-e/Environ/tests/pw_dielectric/
    /app/q-e/Environ/tests/pw_solvent-aware/
    /app/q-e/Environ/tests/pw_surface/
    /app/q-e/Environ/tests/pw_volume/
    /app/q-e/Environ/tests/pw_water/
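
A quick way to double-check the tally in a long log like this is to count the result lines directly. The sketch below assumes the test output was saved to a file; the function name `summarize_tests` is hypothetical, and the patterns simply match the `Passed.` and `**FAILED**.` line endings shown above.

```shell
#!/bin/sh
# Tally result lines (in the format shown above) from a saved test log.
summarize_tests() {
    log=$1
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    passed=$(grep -c ': Passed\.' "$log" || true)
    failed=$(grep -c ': \*\*FAILED\*\*\.' "$log" || true)
    echo "passed=$passed failed=$failed total=$((passed + failed))"
}

# Assumed log location; adjust to wherever the test output was redirected.
if [ -f /app/tests.log ]; then
    summarize_tests /app/tests.log
fi
```

For the run above this should agree with the reported 59 of 71, i.e. 12 failures, all of them in `-ss-` (or `local-ss`/`sa-ss`) inputs.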

lattice737 commented 1 month ago

I added OpenMPI (and a few other tools) to the build and also ran the examples:

# grep -E ": starting|: done" /app/examples.log
/app/q-e/Environ/examples/qe/pw/sccs : starting
/app/q-e/Environ/examples/qe/pw/sccs : done
/app/q-e/Environ/examples/qe/pw/sscs : starting
/app/q-e/Environ/examples/qe/pw/sscs : done
/app/q-e/Environ/examples/qe/pw/pbc : starting
/app/q-e/Environ/examples/qe/pw/pbc : done
/app/q-e/Environ/examples/qe/pw/slab : starting
/app/q-e/Environ/examples/qe/pw/slab : done
/app/q-e/Environ/examples/qe/pw/helmholtz : starting
/app/q-e/Environ/examples/qe/pw/helmholtz : done
/app/q-e/Environ/examples/qe/pw/helmholtz_linpb : starting
/app/q-e/Environ/examples/qe/pw/helmholtz_linpb : done
/app/q-e/Environ/examples/qe/pw/helmholtz_linpb : done
/app/q-e/Environ/examples/qe/pw/helmholtz_linpb : done
/app/q-e/Environ/examples/qe/pw/helmholtz_linpb : done
/app/q-e/Environ/examples/qe/pw/helmholtz_linpb : done
/app/q-e/Environ/examples/qe/pw/mott_schottky : starting
/app/q-e/Environ/examples/qe/pw/mott_schottky : done
/app/q-e/Environ/examples/qe/pw/ms_gcs : starting
/app/q-e/Environ/examples/qe/pw/ms_gcs : done
/app/q-e/Environ/examples/qe/pw/solvent_aware : starting

However, I ran into an error while running the solvent-aware example:

/app/q-e/Environ/examples/qe/pw/solvent_aware : starting

This example shows how to use pw.x to simulate a 10xH20 water cluster 
immersed in continuum solvent. By setting the solvent_radius inside the 
BOUNDARY namelist, one activates the solvent-aware interface, see 

O. Andreussi, N.G. Hörmann, F. Nattino, G. Fisicaro, S. Goedecker, and
        N. Marzari, J. Chem. Theory Comput. 15, 1996 (2019).

This feature prevents dielectric from getting inside the cluster.

  executables directory: /app/q-e/bin
  pseudo directory:      /app/q-e/pseudo
  temporary directory:   /app/q-e/tempdir
  checking that needed directories and files exist... done

  running pw.x as:   /app/q-e/bin/pw.x  -nk 1 -nd 1 -nb 1 -nt 1  --environ

  cleaning /app/q-e/tempdir... done
Error condition encountered during test: exit status = 137
Aborting
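
An exit status above 128 conventionally means the process was terminated by a signal, with the signal number being the status minus 128. So 137 corresponds to signal 9 (SIGKILL), which in a container usually points to the process being killed from outside, for example by the kernel's out-of-memory killer or by the container being stopped; the log does not say which. A small sketch for decoding such a status (the function name `decode_exit_status` is hypothetical):

```shell
#!/bin/sh
# Decode a shell exit status: values above 128 mean
# "terminated by signal number (status - 128)".
decode_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix.
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited normally with status $status"
    fi
}

decode_exit_status 137   # prints: killed by signal 9 (SIGKILL)
```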

lattice737 commented 1 month ago

Examples

All 10 examples do run successfully with the containerized build. I think my earlier session, which ran all the examples in one go, simply timed out:

/app/q-e/Environ/examples/qe/pw/solvent_aware : starting
/app/q-e/Environ/examples/qe/pw/solvent_aware : done
/app/q-e/Environ/examples/qe/pw/field_aware : starting
/app/q-e/Environ/examples/qe/pw/field_aware : done

Continuous Integration

I also configured a CI action that builds Environ on commits to master and develop, and runs the tests on pull requests, merges into master, and pushes with v#.# tags. This will help identify commits that break the build.
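
The trigger rules described above can be sketched as a small shell check. This assumes the CI runs on GitHub Actions, where `GITHUB_EVENT_NAME` and `GITHUB_REF` are default environment variables; the function name `should_run_tests` and the exact matching rules are illustrative assumptions, not the actual workflow configuration.

```shell
#!/bin/sh
# Decide whether a CI run should execute the test suite in addition to the build.
# Rules follow the description above; names and patterns are assumptions.
should_run_tests() {
    event=$1
    ref=$2
    case "$event" in
        pull_request) return 0 ;;                 # any pull request update
    esac
    case "$ref" in
        refs/heads/master) return 0 ;;            # commits landing on master
        refs/tags/v[0-9]*.[0-9]*) return 0 ;;     # version tags such as v3.1
    esac
    return 1                                      # everything else: build only
}

if should_run_tests "${GITHUB_EVENT_NAME:-push}" "${GITHUB_REF:-refs/heads/develop}"; then
    echo "build + tests"
else
    echo "build only"
fi
```

In a real workflow the same conditions would more likely live in the workflow file's trigger and `if:` settings; the shell form is just a compact way to state them.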

CI Issues

  1. I can't get the tests to run in parallel in CI. Every test fails with an error about values missing relative to the benchmarks, which is usually connected to the Open MPI warning against running as root. Even with the override flag and environment variables, I can't get MPI to run as root, which is the default user inside Docker containers. For now, CI tests will run serially 🥲

    ...
    pw_water - anion.in (arg(s): water-anion-ss-default.in): **FAILED**.
    Different sets of data extracted from benchmark and test.
    Data only in benchmark: e1, f1, n1.

    All done. ERROR: only 0 out of 71 tests passed. ...

  2. I wanted to include a job that runs the examples, but they take forever! Are there some "essential" examples that we can run in this pipeline? Or is it unnecessary to automate the examples like this?
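
Regarding the root issue in item 1, one way to narrow it down is to check whether Open MPI itself launches as root, independently of the test driver. Open MPI accepts either the `--allow-run-as-root` flag on `mpirun` or the pair of environment variables below, both of which must be set to 1. This is only a smoke test under those assumptions; it does not show how the test driver passes its own MPI options.

```shell
#!/bin/sh
# Open MPI refuses to start as root unless BOTH variables are set to 1
# (the environment-variable equivalent of `mpirun --allow-run-as-root`).
export OMPI_ALLOW_RUN_AS_ROOT=1
export OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1

# Only attempt the smoke test when mpirun is actually installed.
if command -v mpirun >/dev/null 2>&1; then
    mpirun -np 1 hostname || echo "mpirun failed with status $?"
else
    echo "mpirun not found; skipping smoke test"
fi
```

If this succeeds while the parallel tests still fail, the variables are probably not reaching the process that launches `pw.x`. Creating a non-root user in the image is the other common way around the restriction.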

lattice737 commented 1 month ago

Feature Summary

  1. Containerized parallel build of QE PW with Environ using gfortran and OpenMPI on Ubuntu OS
    a. Option to download SSSP efficiency/precision pseudopotentials
    b. Option to run tests in parallel during build (logs to /app/tests.log)
    c. Option to run examples during build (logs to /app/examples.log)
    d. Python and pipx included
  2. Automatic CI on commit pushes (master and develop branches only)
    a. Compiles Environ on all commits
    b. Runs tests with build when:
      • Branch merged with master
      • Commit includes version tag (e.g. v3.1)
      • Pull request updated (this could be refined, but I figured it's better to start broad)

Performance

Examples: 10/10 completed
Tests: 59/71 passed

Future Directions

  1. cmake build
  2. All package installations (i.e. make all)
  3. Additional architecture support (i.e. QE-supported architectures)
  4. Additional compiler support (e.g. ifx)
  5. Specific configuration builds (by request)--potentially useful for specific use cases and replicability