QMCPACK / qmcpack

Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
http://www.qmcpack.org

Improve Error Handling of HDF5 Files (in the event of a missing or corrupted file). #172

Closed rcclay closed 7 years ago

rcclay commented 7 years ago

Currently, if an HDF5 wavefunction file doesn't exist, error "handling" is done by letting the HDF5 C routine throw a fit. Then QMCPACK continues until "Fatal Error. Aborting at ParticleSet::resetGroups() Failed. No species exisits". Instead, QMCPACK should check for file existence and gracefully terminate if not found.
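A minimal sketch of such a pre-flight check, using only the C++ standard library. The names `H5Precheck` and `precheck_h5_file` are hypothetical and not part of QMCPACK. The sketch assumes the file has no user block, so the 8-byte HDF5 format signature sits at offset 0; it catches a missing file and a file that is obviously not HDF5, but not deeper corruption.

```cpp
#include <array>
#include <cstddef>
#include <fstream>
#include <string>

// Outcome of the pre-flight check (hypothetical names, for illustration only).
enum class H5Precheck { Ok, Missing, NotHDF5 };

// Returns Ok only if `path` can be opened for reading and begins with the
// 8-byte HDF5 format signature. Assumption: no user block, signature at offset 0.
inline H5Precheck precheck_h5_file(const std::string& path)
{
  static const std::array<unsigned char, 8> signature = {0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'};

  std::ifstream in(path, std::ios::binary);
  if (!in)
    return H5Precheck::Missing; // file does not exist or cannot be read

  std::array<char, 8> head{};
  in.read(head.data(), static_cast<std::streamsize>(head.size()));
  if (in.gcount() != static_cast<std::streamsize>(head.size()))
    return H5Precheck::NotHDF5; // shorter than the signature, e.g. truncated

  for (std::size_t i = 0; i < signature.size(); ++i)
    if (static_cast<unsigned char>(head[i]) != signature[i])
      return H5Precheck::NotHDF5;
  return H5Precheck::Ok;
}

// Turns the result into a message suitable for a clean abort.
inline std::string describe(H5Precheck result, const std::string& path)
{
  switch (result)
  {
  case H5Precheck::Missing:
    return "HDF5 file '" + path + "' does not exist or cannot be read";
  case H5Precheck::NotHDF5:
    return "File '" + path + "' does not look like an HDF5 file";
  default:
    return "File '" + path + "' passed the pre-flight check";
  }
}
```

Calling `precheck_h5_file` on the `href` of the wavefunction before any HDF5 routine runs would let the code stop with a single readable message instead of a cascade of HDF5 diagnostics.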

Here's an example excerpt of output in all its horrifying glory:

DO NOT READ DENSITY
Offset for the random number seeds based on time 407>
Random number offset = 407 seeds = 2803-3083
2803 2819 2833 2837 2843 2851 2857 2861 2879 2887 2897 2903 2909 2917 2927 2939 2953 2957 2963 2969 2971 2999 3001 3011 3019 3023 3037 3041 3049 3061 3067 3079 3083 3089
Random seeds Node = 0: 2819 2833 2837 2843 2851 2857 2861 2879 2887 2897 2903 2909 2917 2927 2939 2953 2957 2963 2969 2971 2999 3001 3011 3019 3023 3037 3041 3049 3061 3067 3079 3083
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5F.c line 604 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  001: ../../src/H5Fint.c line 992 in H5F_open(): unable to open file: time = Thu Apr 6 10:15:19 2017
, name = 'doop.pwscf.h5', tent_flags = 0
    major: File accessibilty
    minor: Unable to open file
  002: ../../src/H5FD.c line 993 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  003: ../../src/H5FDsec2.c line 339 in H5FD_sec2_open(): unable to open file: name = 'doop.pwscf.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0
    major: File accessibilty
    minor: Unable to open file
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5Ddeprec.c line 244 in H5Dopen1(): not a location
    major: Invalid arguments to routine
    minor: Inappropriate type
  001: ../../src/H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5Dio.c line 140 in H5Dread(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5D.c line 415 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5Ddeprec.c line 244 in H5Dopen1(): not a location
    major: Invalid arguments to routine
    minor: Inappropriate type
  001: ../../src/H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5Dio.c line 140 in H5Dread(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5D.c line 415 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5Ddeprec.c line 244 in H5Dopen1(): not a location
    major: Invalid arguments to routine
    minor: Inappropriate type
  001: ../../src/H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5Dio.c line 140 in H5Dread(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  000: ../../src/H5D.c line 415 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
Fatal Error. Aborting at ParticleSet::resetGroups() Failed. No species exisits

prckent commented 7 years ago

Agree. All the h5opens need error trapping. #124 handled one case.
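One way to trap every open in the same manner is a small guard around the call. The sketch below is an illustration, not the approach taken in #124: `checked_open` is a hypothetical helper, and it relies only on the HDF5 C convention that open routines return a negative identifier on failure. It is written against a generic callable so that it compiles without HDF5; the comment shows how an actual `H5Fopen` call would be passed in.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of QMCPACK or HDF5). Runs `open_call`, which is
// assumed to follow the HDF5 C convention of returning a negative identifier on
// failure, and turns that failure into an exception naming the file.
//
// With HDF5 available, a call could look like:
//   auto fid = checked_open(name, [&] {
//     return H5Fopen(name.c_str(), H5F_ACC_RDONLY, H5P_DEFAULT);
//   });
template<typename OpenCall>
auto checked_open(const std::string& what, OpenCall open_call) -> decltype(open_call())
{
  const auto id = open_call();
  if (id < 0)
    throw std::runtime_error("Cannot open '" + what + "': file is missing or not a valid HDF5 file");
  return id;
}
```

The caller can catch the exception at the top level and abort with one message. The HDF5 diagnostic stack would still be printed unless automatic error printing is switched off separately, for example with `H5Eset_auto2(H5E_DEFAULT, NULL, NULL)`.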

Which HDF5 input was this?

rcclay commented 7 years ago

This is the wavefunction h5 file that is generated by pw2qmcpack.x. I expect error trapping will be needed in both the ParticleSet initialization and the plane wave coefficient reading sections of the code.

prckent commented 7 years ago

What is your input file?

EinsplineSetBuilder::ReadOrbitalInfo() in src/QMCWaveFunctions/EinsplineSetBuilderOld.cpp will abort, but clearly your input triggers a different read.

rcclay commented 7 years ago

Ah yes. I've attached the input. If your current test cases have a <particleset> tag, then I suspect my example is not getting trapped because, without that tag, particleset initialization is done through the classes in src/ParticleIO/ESHDFParticleParser.cpp, so the run fails before it ever reaches EinsplineSet initialization.

<?xml version="1.0"?>
<simulation xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.mcc.uiuc.edu/qmc/schema/molecu.xsd">
  <project id="Li2O-S1.tw0" series="1">
    <application name="qmcapp" role="molecu" class="serial" version="0.2"> DMC for Li2O-S1-tw0 </application>
  </project>

  <random seed="-1"/>
  <qmcsystem>
    <wavefunction name="psi0" target="e">
      <determinantset type="bspline" href="Li2O.pwscf.h5" twistnum="0" gpu="no" meshfactor="1.00" precision="double" source="i" target="e">
        <basisset/>
        <slaterdeterminant>
          <determinant id="updet" size="6" ref="updet">
            <occupation mode="ground" spindataset="0"> </occupation>
          </determinant>
          <determinant id="downdet" size="6" ref="downdet">
            <occupation mode="ground" spindataset="0"> </occupation>
          </determinant>
        </slaterdeterminant>
      </determinantset>
      <jastrow name="RPA" type="Two-Body" function="yukawa" print="yes"> </jastrow>
      <jastrow name="J2" type="Two-Body" function="Bspline" print="yes">
        <correlation speciesA="u" speciesB="u" size="8" cusp="-0.25">
          <coefficients id="uu" type="Array"> 0 0 0 0 0 0 0 0</coefficients>
        </correlation>
        <correlation speciesA="u" speciesB="d" size="8" cusp="-0.5">
          <coefficients id="ud" type="Array"> 0 0 0 0 0 0 0 0</coefficients>
        </correlation>
      </jastrow>
      <jastrow name="J1" type="One-Body" function="Bspline" print="yes" source="i">
        <correlation elementType="Li" cusp="0.0" size="8">
          <coefficients id="Li" type="Array"> 0.03469937486 -0.4525693399 -0.4228072048 -0.2157836594 -0.2258616984 -0.08141232398 -0.05159100189 -0.01639514784</coefficients>
        </correlation>
        <correlation elementType="O" cusp="0.0" size="8">
          <coefficients id="O" type="Array"> -0.8979689281 -0.8263737835 -0.7092162452 -0.5550774556 -0.3983654622 -0.2474658765 -0.1397257949 -0.04811176115</coefficients>
        </correlation>
      </jastrow>
      <jastrow name="J1S" type="One-Body" function="Bspline" print="yes" source="i">
        <correlation elementType="Li" cusp="3.000000" size="8" rcut="0.5">
          <coefficients id="Li-sr" type="Array"> -0.834681609 -0.6720791154 -0.5173579154 -0.3757252509 -0.253059255 -0.1525924541 -0.07546805044 -0.02713168583</coefficients>
        </correlation>
      </jastrow>
    </wavefunction>
  </qmcsystem>
  <hamiltonian name="h0" type="generic" target="e">
    <pairpot type="pseudo" name="PseudoPot" source="i" wavefunction="psi0" format="xml">
      <pseudo elementType="O" href="O.xml"/>
      <pseudo elementType="Li">
        <header symbol="Li" atomic-number="3" zval="3" />
        <local>
          <grid type="linear" ri="0.0" rf="4.0" npts="201" />
        </local>
      </pseudo>
    </pairpot>
    <constant name="IonIon" type="coulomb" source="i" target="i"/>
    <pairpot name="ElecElec" type="coulomb" source="e" target="e" physical="true"/>
  </hamiltonian>

  <qmc method='wftest'/>
</simulation>

rcclay commented 7 years ago

Pull request #176 should fix this.

prckent commented 7 years ago

Closed by #176