Open jtkrogel opened 5 days ago
Ready to proceed.
could you suggest a reviewer?
Yes, already flagged @anbenali for this.
Oh thanks. I missed it.
c4q does not complete successfully for me in either example case. Additionally, Nexus doesn't notice the error and keeps checking status, in this case overnight. (I have seen this on other occasions, so my guess is that some changed handling, error messages, or signals are not being caught by the workstation/"wsNN" infrastructure, perhaps with openmpi runs.) These were run on nitrogen2 with the nightly test configuration for gcc "new"+openmpi, i.e. reasonably new versions of all software, including python (3.11.9), installed via spack. Note the broken scf.h5 link. PySCF is 2.5.0 in this case. Happy to poke further -- this could well be completely unrelated to Nexus and something to do with the converter or a PySCF version dependency etc.
nohup: ignoring input
_____________________________________________________
Nexus 2.1.0
(c) Copyright 2012- Nexus developers
Please cite:
J. T. Krogel Comput. Phys. Commun. 198 154 (2016)
https://doi.org/10.1016/j.cpc.2015.08.012
_____________________________________________________
Checking for Nexus dependencies on the current machine...
Nexus dependencies available on current machine:
python3 = 3.11.9 (required)
numpy = 1.26.4 (required)
scipy = 1.13.1 (optional)
h5py = 3.11.0 (optional)
matplotlib = (unknown) (optional)
pydot = 1.4.2 (optional)
spglib = 2.0.2 (optional)
seekpath = 2.0.1 (optional)
pycifrw = (unknown) (optional)
Nexus dependencies recommended for full functionality:
python3 = 3.6.0 (required)
numpy = 1.13.1 (required)
scipy = 0.19.1 (optional)
h5py = 2.7.1 (optional)
matplotlib = 2.0.2 (optional)
pydot = 1.2.3 (optional)
spglib = 1.9.9 (optional)
seekpath = 1.4.0 (optional)
pycifrw = 4.3.0 (optional)
cif2cell = 1.2.10 (optional)
All required Nexus dependencies are met.
Core workflow features should work.
Some optional features may not.
See below for more information.
Some optional dependencies are missing or merit an update.
These modules are not needed for core workflow operation.
Optional features related to outdated modules may still work.
Please install updated versions if problems are encountered.
Optional dependencies that are missing:
cif2cell is missing. Install 1.2.10 or greater.
Optional dependencies benefitting from user check or update:
matplotlib version is unknown. Check for 2.0.2 or greater.
pycifrw version is unknown. Check for 4.3.0 or greater.
Applying user settings
Pseudopotentials
reading pp: ../../pseudopotentials/C.BFD.upf
reading pp: ../../pseudopotentials/C.BFD.xml
reading pp: ../../pseudopotentials/H.BFD.upf
reading pp: ../../pseudopotentials/H.BFD.xml
reading pp: ../../pseudopotentials/O.BFD.upf
reading pp: ../../pseudopotentials/O.BFD.xml
Project starting
checking for file collisions
loading cascade images
cascade 0 checking in
checking cascade dependencies
all simulation dependencies satisfied
starting runs:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
elapsed time 0.0 s memory 117.18 MB
Entering ./runs/diamond_ta/scf 0
writing input files 0 scf
Entering ./runs/diamond_ta/scf 0
sending required files 0 scf
submitting job 0 scf
Entering ./runs/diamond_ta/scf 0
Executing:
export OMP_NUM_THREADS=16
python3 scf.py
elapsed time 3.0 s memory 319.44 MB
elapsed time 6.1 s memory 1451.97 MB
elapsed time 9.1 s memory 1606.60 MB
elapsed time 12.1 s memory 2024.05 MB
elapsed time 15.1 s memory 1619.60 MB
elapsed time 18.1 s memory 1745.74 MB
elapsed time 21.1 s memory 1402.56 MB
elapsed time 24.2 s memory 1141.36 MB
elapsed time 27.2 s memory 2076.82 MB
elapsed time 30.2 s memory 1839.52 MB
elapsed time 33.2 s memory 1645.86 MB
elapsed time 36.2 s memory 1993.04 MB
(many lines deleted)
elapsed time 1223.2 s memory 117.18 MB
Entering ./runs/diamond_ta/scf 0
copying results 0 scf
Entering ./runs/diamond_ta/scf 0
analyzing 0 scf
elapsed time 1226.3 s memory 117.18 MB
Entering ./runs/diamond_ta/scf 1
writing input files 1 c4q
Entering ./runs/diamond_ta/scf 1
sending required files 1 c4q
submitting job 1 c4q
Entering ./runs/diamond_ta/scf 1
Executing:
export OMP_NUM_THREADS=1
mpirun -np 1 convert4qmc -prefix c4q -orbitals scf.h5
elapsed time 1229.3 s memory 117.18 MB
Entering ./runs/diamond_ta/scf 1
copying results 1 c4q
Entering ./runs/diamond_ta/scf 1
analyzing 1 c4q
elapsed time 1232.3 s memory 117.18 MB
elapsed time 1235.4 s memory 117.18 MB
elapsed time 1238.4 s memory 117.18 MB
elapsed time 1241.4 s memory 117.18 MB
elapsed time 1244.4 s memory 117.18 MB
elapsed time 1247.4 s memory 117.18 MB
elapsed time 1250.4 s memory 117.18 MB
(many lines deleted)
elapsed time 60547.2 s memory 117.18 MB
elapsed time 60550.2 s memory 117.18 MB
elapsed time 60553.2 s memory 117.18 MB
elapsed time 60556.2 s memory 117.18 MB
elapsed time 60559.3 s memory 117.18 MB
$ pwd; ls -l
.. /qmcpack/nexus/examples/qmcpack/rsqmc_pyscf/02_diamond_hf_qmc/runs/diamond_ta/scf
total 240
-rw-r--r-- 1 pk7 users 1021 Jul 2 17:26 c4q.err
-rw-r--r-- 1 pk7 users 40 Jul 2 17:26 c4q.in
lrwxrwxrwx 1 pk7 users 6 Jul 2 17:26 c4q.orbs.h5 -> scf.h5
-rw-r--r-- 1 pk7 users 79 Jul 2 17:26 c4q.out
-rw-r--r-- 1 pk7 users 1252 Jul 2 17:25 scf.err
-rw-r--r-- 1 pk7 users 139630 Jul 2 17:25 scf.out
-rw-r--r-- 1 pk7 users 1885 Jul 2 17:05 scf.py
-rw-r--r-- 1 pk7 users 360 Jul 2 17:05 scf.struct.xsf
-rw-r--r-- 1 pk7 users 175 Jul 2 17:05 scf.struct.xyz
-rw-r--r-- 1 pk7 users 69792 Jul 2 17:25 scf.twistnum_000.h5
drwxr-xr-x 2 pk7 users 52 Jul 2 17:26 sim_c4q
drwxr-xr-x 2 pk7 users 52 Jul 2 17:26 sim_scf
c4q.err
Could not open H5 file
[nitrogen2:3440447] *** Process received signal ***
[nitrogen2:3440447] Signal: Aborted (6)
[nitrogen2:3440447] Signal code: (-6)
[nitrogen2:3440447] [ 0] /lib64/libc.so.6(+0x3e6f0)[0x7f162b23e6f0]
[nitrogen2:3440447] [ 1] /lib64/libc.so.6(+0x8b94c)[0x7f162b28b94c]
[nitrogen2:3440447] [ 2] /lib64/libc.so.6(raise+0x16)[0x7f162b23e646]
[nitrogen2:3440447] [ 3] /lib64/libc.so.6(abort+0xd3)[0x7f162b2287f3]
[nitrogen2:3440447] [ 4] convert4qmc[0x47a2f3]
[nitrogen2:3440447] [ 5] convert4qmc[0x41ab59]
[nitrogen2:3440447] [ 6] /lib64/libc.so.6(+0x29590)[0x7f162b229590]
[nitrogen2:3440447] [ 7] /lib64/libc.so.6(__libc_start_main+0x80)[0x7f162b229640]
[nitrogen2:3440447] [ 8] convert4qmc[0x422db5]
[nitrogen2:3440447] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 3440447 on node nitrogen2 exited on
signal 6 (Aborted).
--------------------------------------------------------------------------
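The broken `c4q.orbs.h5 -> scf.h5` link in the listing above can be confirmed with a quick check of the run directory. This is a minimal sketch using only the Python standard library; `find_dangling_symlinks` is a hypothetical helper for illustration and is not part of Nexus or QMCPACK.

```python
import os


def find_dangling_symlinks(directory):
    """Return sorted (link_name, target) pairs for symlinks whose target is missing.

    Hypothetical helper for illustration only; not part of Nexus.
    """
    dangling = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # os.path.exists follows symlinks, so it is False for a broken link,
        # while os.path.islink inspects the link itself.
        if os.path.islink(path) and not os.path.exists(path):
            dangling.append((name, os.readlink(path)))
    return sorted(dangling)


if __name__ == "__main__":
    # Run from inside the simulation directory, e.g. runs/diamond_ta/scf
    for link, target in find_dangling_symlinks("."):
        print(f"{link} -> {target} (target missing)")
```

In the directory listed above, this would be expected to flag `c4q.orbs.h5`, since only `scf.twistnum_000.h5` is present and no `scf.h5` exists.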
For the nexus test failure, please add two print statements after this line to investigate:
1452: File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/tests/unit/test_pyscf_input.py", line 577, in test_write
1452: assert(text_eq(text,ref_text))
# add these
print(ref_text)
print(text)
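If the two printed strings are long, a unified diff may be quicker to read than the raw output. This is a sketch using the standard-library `difflib`; `show_text_diff` is a hypothetical helper, not something in the Nexus test suite, and it compares line by line, which may be stricter than whatever `text_eq` does.

```python
import difflib


def show_text_diff(ref_text, text):
    """Return a unified diff between the reference and generated text.

    Hypothetical helper for illustration; an empty string means the
    two inputs have identical lines.
    """
    diff = difflib.unified_diff(
        ref_text.splitlines(),
        text.splitlines(),
        fromfile="ref_text",
        tofile="text",
        lineterm="",
    )
    return "\n".join(diff)
```

Calling `print(show_text_diff(ref_text, text))` just before the failing assert would show only the lines that differ.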
For the examples added with this PR, `diamond_pp_hf_twistavg_prim.py` (primitive cell twist averaging) should run cleanly (at least it did for me -- please post the converter output), while `diamond_pp_hf_twistavg.py` (supercell twist averaging) currently fails at the converter level due to changes needed to Anouar's `savetoqmcpack`.
These features have bug-fixes needed at the QMCPACK/QMCPACK-converter levels (I've made Anouar aware already). This PR implements the Nexus-side features needed to drive these workflows, but does not guarantee that QMCPACK and its converters function properly.
Thanks Jaron -- the situation is clear to me now. @anbenali How far off are the updates to savetoqmcpack? I put PySCF 2.6.2 in spack so can easily test the latest version. It would be nice to have a patch so that Nexus support can be tested, but perhaps there are puzzles to solve?
Proposed changes
This PR adds support for arbitrary supercell twists/twist grids in workflows involving PySCF and QMCPACK.
This PR is now ready for review.
What type(s) of changes does this code introduce?
Does this introduce a breaking change?
What systems has this change been tested on?
Laptop, Improv at ALCF
Checklist
Path out of WIP