kjappelbaum / oximachinerunner

An easy API for using oximachine.
MIT License
7 stars 5 forks source link

cleaner exceptions when input structure is not suitable? #57

Closed ltalirz closed 3 years ago

ltalirz commented 3 years ago

Since we've integrated oximachine into aiida-lsmo it might be thrown different types of structures as input. In binding energy calculations, that might even be a cell with a molecule inside.

Currently, the oximachinerunner exits with a wide variety of different errors that the user then needs to understand and deal with (see below). If one wants to use it on a broad set of structures, this forces the user to use a try with a very general except (or know all possible exceptions in advance).

I would propose to

Here's what I find playing around with oximachinerunner and ASE

In [18]: from ase.build import molecule

In [19]: co2 = molecule('CO2')

In [20]: runner.run_oximachine(co2)
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-20-d6279871cd28> in <module>
----> 1 runner.run_oximachine(co2)

~/Personal/Postdoc-MARVEL/repos/oximachinerunner/oximachinerunner/__init__.py in run_oximachine(self, structure)
    243             return self._run_oximachine(structure)
    244         elif isinstance(structure, Atoms):
--> 245             pymatgen_structure = AseAtomsAdaptor.get_structure(structure)
    246             return self._run_oximachine(pymatgen_structure)
    247         elif isinstance(structure, str):

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/io/ase.py in get_structure(atoms, cls)
     76
     77         cls = Structure if cls is None else cls
---> 78         return cls(lattice, symbols, positions, coords_are_cartesian=True)
     79
     80     @staticmethod

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/core/structure.py in __init__(self, lattice, species, coords, charge, validate_proximity, to_unit_cell, coords_are_cartesian, site_properties)
   3298                 fractional_coords. Defaults to None for no properties.
   3299         """
-> 3300         super().__init__(
   3301             lattice,
   3302             species,

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/core/structure.py in __init__(self, lattice, species, coords, charge, validate_proximity, to_unit_cell, coords_are_cartesian, site_properties)
    703
    704             sites.append(
--> 705                 PeriodicSite(
    706                     sp,
    707                     coords[i],

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/core/sites.py in __init__(self, species, coords, lattice, to_unit_cell, coords_are_cartesian, properties, skip_checks)
    350
    351         if coords_are_cartesian:
--> 352             frac_coords = lattice.get_fractional_coords(coords)
    353         else:
    354             frac_coords = coords

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/core/lattice.py in get_fractional_coords(self, cart_coords)
    168             Fractional coordinates.
    169         """
--> 170         return dot(cart_coords, self.inv_matrix)
    171
    172     def get_vector_along_lattice_directions(self, cart_coords: Vector3Like):

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/core/lattice.py in inv_matrix(self)
    135         """
    136         if self._inv_matrix is None:
--> 137             self._inv_matrix = inv(self._matrix)
    138             self._inv_matrix.setflags(write=False)
    139         return self._inv_matrix

<__array_function__ internals> in inv(*args, **kwargs)

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/numpy/linalg/linalg.py in inv(a)
    544     signature = 'D->D' if isComplexType(t) else 'd->d'
    545     extobj = get_linalg_error_extobj(_raise_linalgerror_singular)
--> 546     ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
    547     return wrap(ainv.astype(result_t, copy=False))
    548

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
     86
     87 def _raise_linalgerror_singular(err, flag):
---> 88     raise LinAlgError("Singular matrix")
     89
     90 def _raise_linalgerror_nonposdef(err, flag):

LinAlgError: Singular matrix
In [24]: co2.set_cell([10,10,10])

In [25]: runner.run_oximachine(co2)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-d6279871cd28> in <module>
----> 1 runner.run_oximachine(co2)

~/Personal/Postdoc-MARVEL/repos/oximachinerunner/oximachinerunner/__init__.py in run_oximachine(self, structure)
    244         elif isinstance(structure, Atoms):
    245             pymatgen_structure = AseAtomsAdaptor.get_structure(structure)
--> 246             return self._run_oximachine(pymatgen_structure)
    247         elif isinstance(structure, str):
    248             pymatgen_structure = Structure.from_file(structure)

~/Personal/Postdoc-MARVEL/repos/oximachinerunner/oximachinerunner/__init__.py in _run_oximachine(self, structure)
    273                 metal_indices,
    274                 metal_symbols,
--> 275             ) = self._featurize_single(  # pylint:disable=protected-access
    276                 structure
    277             )

~/Personal/Postdoc-MARVEL/repos/oximachinerunner/oximachinerunner/__init__.py in _featurize_single(self, structure)
    219             Tuple[np.array, list, list]: Feature array, metal indices, metal symbols
    220         """
--> 221         feature_matrix, metal_indices, metals = featurize(structure, self.featureset)
    222         return feature_matrix, metal_indices, metals
    223

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/oximachine_featurizer/featurize.py in featurize(structure, featureset)
    347         X.append(feat_dict["feature"])
    348
--> 349     X = np.vstack(X)
    350     (
    351         X,

<__array_function__ internals> in vstack(*args, **kwargs)

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/numpy/core/shape_base.py in vstack(tup)
    281     if not isinstance(arrs, list):
    282         arrs = [arrs]
--> 283     return _nx.concatenate(arrs, 0)
    284
    285

<__array_function__ internals> in concatenate(*args, **kwargs)

ValueError: need at least one array to concatenate
In [26]: from ase.io import write

In [27]: write('co2.cif', co2)

In [28]: runner.run_oximachine('./co2.cif')
/Users/leopold/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/io/cif.py:1185: UserWarning: No structure parsed for 1 structure in CIF. Section of CIF file below.
  warnings.warn(
/Users/leopold/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/io/cif.py:1189: UserWarning: data_image0
_cell_length_a   10
_cell_length_b   10
_cell_length_c   10
_cell_angle_alpha   90
_cell_angle_beta   90
_cell_angle_gamma   90
_symmetry_space_group_name_H-M   'P 1'
_symmetry_int_tables_number   1
loop_
 _symmetry_equiv_pos_as_xyz
  'x, y, z'
loop_
 _atom_site_label
 _atom_site_occupancy
 _atom_site_Cartn_x
 _atom_site_Cartn_y
 _atom_site_Cartn_z
 _atom_site_thermal_displace_type
 _atom_site_B_iso_or_equiv
 _atom_site_type_symbol
  C1  1.0000  0.00000  0.00000  0.00000  Biso  1.000  C
  O1  1.0000  0.00000  0.00000  1.17866  Biso  1.000  O
  O2  1.0000  0.00000  0.00000  -1.17866  Biso  1.000  O
  warnings.warn(str(d))
/Users/leopold/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/io/cif.py:1190: UserWarning: Error is '_atom_site_fract_x'.
  warnings.warn("Error is %s." % str(exc))
/Users/leopold/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/io/cif.py:1193: UserWarning: Issues encountered while parsing CIF: '_atom_site_fract_x'
  warnings.warn(
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-dd76b68ebbe7> in <module>
----> 1 runner.run_oximachine('./co2.cif')

~/Personal/Postdoc-MARVEL/repos/oximachinerunner/oximachinerunner/__init__.py in run_oximachine(self, structure)
    246             return self._run_oximachine(pymatgen_structure)
    247         elif isinstance(structure, str):
--> 248             pymatgen_structure = Structure.from_file(structure)
    249             return self._run_oximachine(pymatgen_structure)
    250         elif isinstance(structure, os.PathLike):

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/core/structure.py in from_file(cls, filename, primitive, sort, merge_tol)
   2516             contents = f.read()
   2517         if fnmatch(fname.lower(), "*.cif*") or fnmatch(fname.lower(), "*.mcif*"):
-> 2518             return cls.from_str(
   2519                 contents, fmt="cif", primitive=primitive, sort=sort, merge_tol=merge_tol
   2520             )

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/core/structure.py in from_str(cls, input_string, fmt, primitive, sort, merge_tol)
   2451         if fmt == "cif":
   2452             parser = CifParser.from_string(input_string)
-> 2453             s = parser.get_structures(primitive=primitive)[0]
   2454         elif fmt == "poscar":
   2455             s = Poscar.from_string(input_string, False, read_velocities=False).structure

~/Applications/miniconda3/envs/aiida-lsmo/lib/python3.8/site-packages/pymatgen/io/cif.py in get_structures(self, primitive)
   1195             )
   1196         if len(structures) == 0:
-> 1197             raise ValueError("Invalid cif file with no structures!")
   1198         return structures
   1199

ValueError: Invalid cif file with no structures!
ltalirz commented 3 years ago

P.S. @mpougin reported the following (which however looks like an import error; she'll let you know whether she has the latest version installed)

...
  File "/home/miriam/aiida1/aiida-lsmo/aiida_lsmo/calcfunctions/oxidation_state.py", line 5, in <module>
    import oximachinerunner
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachinerunner/__init__.py", line 15, in
    from oximachine_featurizer import featurize
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachine_featurizer/__init__.py", line 9, in <module>
    from .featurize import FeatureCollector, GetFeatures, featurize
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachine_featurizer/featurize.py", line 16, in <module>
    from matminer.featurizers.site import CrystalNNFingerprint, GaussianSymmFunc
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/featurizers/site.py", line 1670, in <module>
    class LocalPropertyDifference(BaseFeaturizer):
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/featurizers/site.py", line 1695, in LocalPropertyDifference
    def __init__(self, data_source=MagpieData(), weight='area',
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/utils/data.py", line 215, in __init__
    prop_value = float(lines[atomic_no - 1])
IndexError: list index out of range
ltalirz commented 3 years ago

P.S. I see now that the ValueError is already document, which is great.

mpougin commented 3 years ago

P.S. @mpougin reported the following (which however looks like an import error; she'll let you know whether she has the latest version installed)

...
  File "/home/miriam/aiida1/aiida-lsmo/aiida_lsmo/calcfunctions/oxidation_state.py", line 5, in <module>
    import oximachinerunner
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachinerunner/__init__.py", line 15, in
    from oximachine_featurizer import featurize
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachine_featurizer/__init__.py", line 9, in <module>
    from .featurize import FeatureCollector, GetFeatures, featurize
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachine_featurizer/featurize.py", line 16, in <module>
    from matminer.featurizers.site import CrystalNNFingerprint, GaussianSymmFunc
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/featurizers/site.py", line 1670, in <module>
    class LocalPropertyDifference(BaseFeaturizer):
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/featurizers/site.py", line 1695, in LocalPropertyDifference
    def __init__(self, data_source=MagpieData(), weight='area',
  File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/utils/data.py", line 215, in __init__
    prop_value = float(lines[atomic_no - 1])
IndexError: list index out of range

Thanks for forwarding my problem @ltalirz . I have release 1.3.0 installed, that's the latest version I think. And for me, the CP2K binding energy calculations (gas molecule in MOF framework) fail with the above mentioned error message

kjappelbaum commented 3 years ago

Can you maybe send the CIF? I agree with Leo that we could throw better errors and document them but one would expect that it throws some kind of error for CO2.

I could make some NoMetal, InvalidCIF, and some other errors that you could catch.

On Feb 6, 2021 at 23:15, mpougin notifications@github.com<mailto:notifications@github.com> wrote:

P.S. @mpouginhttps://github.com/mpougin reported the following (which however looks like an import error; she'll let you know whether she has the latest version installed)

... File "/home/miriam/aiida1/aiida-lsmo/aiida_lsmo/calcfunctions/oxidation_state.py", line 5, in import oximachinerunner File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachinerunner/init.py", line 15, in from oximachine_featurizer import featurize File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachine_featurizer/init.py", line 9, in from .featurize import FeatureCollector, GetFeatures, featurize File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/oximachine_featurizer/featurize.py", line 16, in from matminer.featurizers.site import CrystalNNFingerprint, GaussianSymmFunc File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/featurizers/site.py", line 1670, in class LocalPropertyDifference(BaseFeaturizer): File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/featurizers/site.py", line 1695, in LocalPropertyDifference def init(self, data_source=MagpieData(), weight='area', File "/home/miriam/anaconda3/envs/aiida1/lib/python3.6/site-packages/matminer/utils/data.py", line 215, in init prop_value = float(lines[atomic_no - 1]) IndexError: list index out of range

Thanks for forwarding my problem @ltalirzhttps://github.com/ltalirz . I have release 1.3.0 installed, that's the latest version I think. And for me, the CP2K binding energy calculations (gas molecule in MOF framework) fail with the above mentioned error message

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/kjappelbaum/oximachinerunner/issues/57#issuecomment-774551509, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AH3I2QPSD6NSUEDDHBEICMDS5W5OXANCNFSM4XFHNREA.

mpougin commented 3 years ago

Here the CIFs for CO2 and the MOF. I also experienced errors when I tried ethane and ethylene CO2.txt MOF.txt

kjappelbaum commented 3 years ago

Yes, it will only work if there's a metal. I'll clarify this in the docs and throw a specific error.

On Feb 6, 2021 at 23:31, mpougin notifications@github.com<mailto:notifications@github.com> wrote:

Here the CIFs for CO2 and the MOF. I also experienced errors when I tried ethane and ethylene CO2.txthttps://github.com/kjappelbaum/oximachinerunner/files/5937792/CO2.txt MOF.txthttps://github.com/kjappelbaum/oximachinerunner/files/5937793/MOF.txt

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/kjappelbaum/oximachinerunner/issues/57#issuecomment-774553279, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AH3I2QLX7PPVWT5EGSBLQELS5W7M7ANCNFSM4XFHNREA.

kjappelbaum commented 3 years ago

https://github.com/kjappelbaum/oximachinerunner/pull/59 would introduce some custom exceptions. @ltalirz note that there are indeed different reasons why your examples fail:

The last and the first case would be caught now using ParsingError and a possible fallback by the user would be to provide a valid pymatgen structure. The second case would raise a NoMetalError for which there is no fallback. The current version of oximachine is simply not built to work on non-Metals or assign all oxidation states in a self-consistent way (though I plan/hope to do this....)

kjappelbaum commented 3 years ago

And from the CI i just saw that the behavior of the last case will depend on the pymatgen version ...

mpougin commented 3 years ago

Thank you @kjappelbaum. So to run the cp2k_multistage "standard" protocol I have to switch the initial_magnetization to element, as this won't run the oxidation state calculations. Is this correct @ltalirz? And how do I make this changes? Just locally on my cloned rep?

kjappelbaum commented 3 years ago

I think aiida-lsmo should only call oximachine if there is a metal. I can make a PR for that, but not today.

ltalirz commented 3 years ago

Thanks for the fix @kjappelbaum !

So to run the cp2k_multistage "standard" protocol I have to switch the initial_magnetization to element, as this won't run the oxidation state calculations. Is this correct @ltalirz? And how do I make this changes? Just locally on my cloned rep?

Yes, for the moment just do that, just make the change locally.

I think aiida-lsmo should only call oximachine if there is a metal.

Is this the only test that should be applied? (will it work correctly for an isolated ion?) Or would it perhaps be simpler for aiida-lsmo to just do a try/except and return empty prediction if any of the known exceptions are raised?

kjappelbaum commented 3 years ago

Is this the only test that should be applied? (will it work correctly for an isolated ion?)

if it is a valid pymatgen Structure or converted to one, yes. If there is not cell pymatgen will complain and we will raise a ParsingError. And it would actually be interesting what happens for an isolated metal as this is a strange situation for the model that it has never seen before ... I would guess that this will raise a warning as soon as i implemented https://github.com/kjappelbaum/oximachinerunner/issues/12

kjappelbaum commented 3 years ago

is it fine that i closed this or is there something i still should do?

ltalirz commented 3 years ago

all good, thanks @kjappelbaum !