Open mattwthompson opened 3 years ago
Might be useful to add something like ConformerGenerationError
to catch #822. A patch would start with something like
56d053cfe105ca42623757751e40004541e56e7c
Another one-off "could use a new exception here" would fix the confusing error (UnassignedValenceParameterException
) thrown when no vdW parameter matches were found. There's already a hierarchy, but this one could probably be re-named to something more strictly appropriate.
In:
from openff.toolkit.typing.engines.smirnoff import ForceField
from openff.toolkit.topology import Molecule, Topology
top = Molecule.from_smiles("C").to_topology()
parsley = ForceField("test_forcefields/test_forcefield.offxml")
parsley['vdW']._parameters = []
parsley.create_openmm_system(top)
Out:
Traceback (most recent call last):
File "vdw_repro.py", line 9, in <module>
parsley.create_openmm_system(top)
File "/Users/mwt/software/openforcefield/openff/toolkit/typing/engines/smirnoff/forcefield.py", line 1325, in create_openmm_system
parameter_handler.create_force(system, topology, **kwargs)
File "/Users/mwt/software/openforcefield/openff/toolkit/typing/engines/smirnoff/parameters.py", line 3656, in create_force
assigned_terms=atom_matches, valence_terms=list(topology.topology_atoms)
File "/Users/mwt/software/openforcefield/openff/toolkit/typing/engines/smirnoff/parameters.py", line 2466, in _check_all_valence_terms_assigned
raise exception
openff.toolkit.typing.engines.smirnoff.parameters.UnassignedValenceParameterException: vdWHandler was not able to find parameters for the following valence terms:
- Topology indices (0,): names and elements ( C),
- Topology indices (1,): names and elements ( H),
- Topology indices (2,): names and elements ( H),
- Topology indices (3,): names and elements ( H),
- Topology indices (4,): names and elements ( H),
There are also some cases in which an custom exception is re-used in ways that could be reviewed. For example, trying to use "Coulomb" electrostatics as specified in the SMIRNOFF spec raises a IncompatibleParameterError
when the issue is not incompatibility but that it's not implemented yet
Also may be some new functionality on the horizon: https://www.python.org/dev/peps/pep-0654/#except
Another detail to consider: ThingWentWrongError
vs. ThingWentWrongException
. I don't think I am consistent at all. I would like to unify around one option and avoid the other.
Should these exceptions be grouped into a single file (maybe openforcefield/exceptions.py) or, as it stands now, scattered across files closer to where they'd be raised, or something in between?
If you go down this path, exceptions.py
sounds like a great idea -- otherwise anyone importing your library and trying to deal with errors downstream has to go grep hunting.
I'm not sure why using in-built exceptions is not ideal, though. Could you explain? My Python experience is patchy.
Please let me know if this should be a separate issue. On the topic of custom exceptions -- why is MessageException?
Last week during a session with @j-wags we wondered why the MessageException
class exists. What does it do that a base Exception class does not?
from openff.toolkit.utils.utils import MessageException
def message():
raise MessageException("my message")
def base():
raise Exception("my message")
>>> message()
MessageException Traceback (most recent call last)
<ipython-input-6-c7233cf6135f> in <module>
----> 1 message()
<ipython-input-5-50db2ce2786b> in message()
1 def message():
----> 2 raise MessageException("my message")
3
4 def base():
5 raise Exception("my message")
MessageException: my message
>>> base()
Exception Traceback (most recent call last)
<ipython-input-7-cbd7a77bf541> in <module>
----> 1 base()
<ipython-input-5-50db2ce2786b> in base()
3
4 def base():
----> 5 raise Exception("my message")
Exception: my message
Also, @mattwthompson has raised some interesting considerations in the intro. No one else has commented so throwing in my opinions here.
How much we can break the API doing this? A function raising a different exception may not be an issue to our downstream users, or it may be a big problem.
It is debatable about the errors you have which are just Exceptions, but if you subclass the general error type it should not break anything.
Would inheriting custom exceptions from multiple built-in classes is worth it?
I don't see why not, so long as the superclasses make sense?
Should these exceptions be grouped into a single file (maybe openforcefield/exceptions.py) or, as it stands now, scattered across files closer to where they'd be raised, or something in between?
Absolutely.
Should we have an exception hierarchy in which some of our custom exceptions inherit from each other? How deep should such a tree go?
Absolutely.
You could subclass all your custom exceptions from Message/Exception
, but IMO that makes it unnecessarily difficult for downstream users to catch them with try/except
because they have to work out which file to import the custom error from, import it, explicitly list the caught exceptions, etc. As custom exceptions multiply this can be be a real chore that mostly gets done by waiting to see what exception you get so you can catch it, as many user-facing methods do not document the possible errors that get raised. For example, Molecule.assign_partial_charges
documents that it may raise an InvalidToolkitRegistryError
, but neglects the custom errors raised by the various toolkits e.g. ChargeMethodUnavailableError, IncorrectNumConformersError, ChargeCalculationError, AntechamberNotFoundError
(this last one is not documented anywhere, actually).
Subclassing custom exceptions from existing errors and creating exception trees gets around a lot of this. For example, these existing exceptions:
class MissingPackageError(MessageException):
"""This function requires a package that is not installed."""
class ToolkitUnavailableException(MessageException):
"""The requested toolkit is unavailable."""
# TODO: Allow toolkit to be specified and used in formatting/printing exception.
class LicenseError(ToolkitUnavailableException):
"""This function requires a license that cannot be found."""
class AntechamberNotFoundError(MessageException):
"""The antechamber executable was not found"""
could be
class MissingPackageError(ModuleNotFoundError): # This is actually the point of ModuleNotFoundError
"""This function requires a package that is not installed."""
class ToolkitUnavailableException(MissingPackageError):
"""The requested toolkit is unavailable."""
# TODO: Allow toolkit to be specified and used in formatting/printing exception.
class LicenseError(ToolkitUnavailableException):
"""This function requires a license that cannot be found."""
class AntechamberNotFoundError(ModuleNotFoundError):
"""The antechamber executable was not found"""
and users who don't care about minutiae can catch them all with except ModuleNotFoundError
, or even ImportError
.
Edit: and I'm still not sure about this reasoning for so many different custom exceptions.
We should have our own exceptions that are fairly specific and provide detailed feedback to the user as to what went wrong and why, and possibly hints at how to resolve the error. For example, this behavior (currently in the code) checks all the boxes:
That is surely what the message raised by the error is for. So long as the ModuleNotFoundError
explains which module is missing and possible reasons for it, I'm not sure I care which of MissingPackageError, ToolkitUnavailableException, AntechamberNotFoundError
I get.
Thanks for the feedback!
why does
MessageException
exist?
This has been in the codebase for ages (I think back when the PIs were the only developers) and I have always been confused by it as well. Things were different back in ~2017 ... maybe there's some Python 2 compatibility that I have never come across? Maybe it was part of a more elaborate idea that never got fully implemented. I don't know what it does that the built-in exceptions don't do, and therefore I'd like to ditch it.
Your last code snippet (with class ToolkitUnavailableException(MissingPackageError):
) is exactly the design I'm hoping can be implemented here.
I agree that each docstring, on average, does a poor job of communicating which exceptions can be raised. I'd like to say that we could go through everything and make a good effort to list out everything that could be raised by any function/method call, but that's probably a significant amount of work to wrangle up and a decent amount of work to keep things in sync over time. I'm even skeptical that it can be done well manually - I wonder if there are automated solutions for this?
I'm still not sure about this reasoning for so many different custom exceptions. ... So long as the
ModuleNotFoundError
explains which module is missing and possible reasons for it, I'm not sure I care which ofMissingPackageError
,ToolkitUnavailableException
,AntechamberNotFoundError
I get.
I think the motivation here is similar to why we'd want to do custom exceptions to begin with; most users will reach for a somewhat broad exception (i.e. one that broadly captures molecule parsing errors) like you describe but some specific cases might require dealing with one or more of the more detailed exceptions. In cases like those, it's more useful to have a sub(-sub-sub...)classed exception that handles one extremely specific case, even if the contents of the message are similar, just like it's better to have a ToolkitUnavailableException
in addition to ModuleNotFoundError
. I would assume that really specific exceptions (maybe one for RDKit's SMILES parsing failing for reason xyz) are mostly used by developers/powerusers and not of much interest to most users.
Per @mattwthompson and my discussion in today's developers coffee: As a next step, let's
exceptions.py
OpenForceFieldToolkitException
class, which all exceptions inherit fromMessageException
raise an error telling people to use OpenForceFieldToolkitException
insteadAs a smaller task, I think it would be nice to remove some of the assert
lines in the non-testing parts of the toolkit. For example,
>>> from openff.toolkit.topology import Molecule
>>> top = Molecule.from_smiles("CCO").to_topology()
>>> top.atom(top.n_topology_atoms + 1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mwt/miniconda3/envs/openff-system/lib/python3.8/site-packages/openff/toolkit/topology/topology.py", line 2934, in atom
assert 0 <= atom_topology_index < self.n_topology_atoms
AssertionError
this isn't very useful, and could be replaced by a InvalidAtomIndexError
with a descriptive message instead of displaying the internal check and forcing the user to do into what went wrong. (In my case I was just sloppy with atom indexing vs GROMACS, but the point remains)
After #1021,
openff/toolkit/utils/exceptions.py
OpenForceFieldToolkitException
class which all exceptions inherit fromMessageException
now raise an deprecation warningMessageException
will be removed in January 2022This will all presumably go into release v0.10.1, which does not currently have a timeline.
The remaining work is more or less going through the above table and moving built-in exceptions into more specific ones. I'm ambivalent as to whether this should be done in one big swing or slowly over time.
Is #986 on the radar of this PR or should I open a PR? It currently raises vanilla Key- and IndexErrors.
... on the radar of this PR
I probably missed it when I made this survey however many months ago
or should I open a PR?
I think so, yes
After #1021,
* All exceptions are consolidating into a single file `openff/toolkit/utils/exceptions.py` * There is now a base `OpenForceFieldToolkitException` class which all exceptions inherit from
If there's going to be custom errors, it probably makes more sense for these to (also) inherit from the Python core library errors. E.g. MissingDependencyError
in #1157 sounds a lot like an ImportError
If there's going to be custom errors, it probably makes more sense for these to (also) inherit from the Python core library errors
Based on my limited experience, I'm kinda against this. When I write try:; ...; except XYZError:
, I'm almost always writing with one particular erroring line of code in mind, and I want the except ...
block to contain really specific, relevant info/workarounds for that problem. If we start having widespread use of union types, I'd think that downstream coders will get surprised by their except
s catching lines that they didn't expect, and subsequently needing to dig into our exception hierarchy to figure out how to disentangle just the specific exception that they want to catch.
I'd be interested in other thoughts on this, though. ImportError
seems to be the strongest case for union types, given the nature of optional dependencies. But I had a fiasco a few months ago where it took me several hours to learn that the ValueError
I was catching deep in the stack was coming from RDKit and not the line I expected in the OpenFF toolkit.
I'm almost always writing with one particular erroring line of code in mind
You can still do this with XYZError, but for people who want to except
a general class of errors that they know could be raised (e.g. ImportErrors because the toolkit comes with optional dependencies), this requires them to not only dig through the code to locate and correctly import the potential error, but to branch through all possibilities and import all potentially raised errors, which then requires experience and high familiarity with the layout and organization of the code. I would classify anyone who can do this as an OpenFF Toolkit expert who probably wrote some of the code. What I would expect developers to actually do (and what I often do) is run through some test cases in a notebook and just make a long tuple of all exceptions that are eventually raised (except (MissingDependencyError, MissingPackageError, ToolkitUnavailableException, LicenseError)
). This also requires a non-trivial amount of work and familiarity. What I expect at least some users to do is a blank try/except Exception
, especially if they're just doing something quick and dirty, or a little prototype, and don't have the time to mess around with the toolkit code, or remember what an exception is called.
But I had a fiasco a few months ago where it took me several hours to learn that the ValueError I was catching deep in the stack was coming from RDKit and not the line I expected in the OpenFF toolkit.
ValueError
is the most generic of the errors -- IndexError
, KeyError
, TypeError
, AttributeError
all have fairly standard meanings, and often the main Python library depends on them in ways that break if you don't subclass these main errors (e.g. https://github.com/openforcefield/openff-toolkit/issues/986). One intermediate solution could be to avoid subclassing ValueError
s where possible.
I'd think that downstream coders will get surprised by their excepts catching lines that they didn't expect, and subsequently needing to dig into our exception hierarchy to figure out how to disentangle just the specific exception that they want to catch.
I personally would look at this from a user perspective. Keeping Toolkit-specific errors but subclassing generic ones still allows for approaches 1 and 2 above -- they can still just import whichever specific error they want to catch by commenting out the try/except
parts and letting the traceback tell them. However, it might minimise approach 3 (blank except) for the quick scripter by allowing them access to a generic class of "expected-things-going-wrong", especially if they vaguely know which exception is going to be raised but can't remember the very long name of the specific exception (e.g. FractionalBondOrderInterpolationMethodUnsupportedError
)
Thanks for the feedback!
I would classify anyone who can do this as an OpenFF Toolkit expert who probably wrote some of the code.
Ah... That is an important limitation to my perspective.
What I would expect developers to actually do (and what I often do) is run through some test cases in a notebook and just make a long tuple of all exceptions that are eventually raised (except (MissingDependencyError, MissingPackageError, ToolkitUnavailableException, LicenseError)). This also requires a non-trivial amount of work and familiarity. What I expect at least some users to do is a blank try/except Exception, especially if they're just doing something quick and dirty, or a little prototype, and don't have the time to mess around with the toolkit code, or remember what an exception is called.
Very helpful to know that this is the external developer workflow! I would look at something like this and be like "great! Users have so much control over the different things that go wrong", but what I'm getting from this is that they instead go "wow, this is a hassle, I'll just use except Exception
"
So, I get the sense that multiple-inherited exceptions are something that Actual Users will enjoy (and I see @richardjgowers, who is a very important Actual User, is also in that camp).
One intermediate solution could be to avoid subclassing ValueErrors where possible.
I like that. I hadn't thought to classify built-in exceptions between "generic" and "specific", but you've got a good point - ImportErrors
have a fairly well-bounded meaning, whereas ValueErrors
are indeed the wild west. So some rule like "YES inherit from ImportError, NO don't inherit from ValueError, SOMETIMES inherit from other errors" would be a good policy.
I personally would look at this from a user perspective. Keeping Toolkit-specific errors but subclassing generic ones still allows for approaches 1 and 2 above -- they can still just import whichever specific error they want to catch by commenting out the try/except parts and letting the traceback tell them. However, it might minimise approach 3 (blank except) for the quick scripter by allowing them access to a generic class of "expected-things-going-wrong", especially if they vaguely know which exception is going to be raised but can't remember the very long name of the specific exception (e.g. FractionalBondOrderInterpolationMethodUnsupportedError)
👍
Is your feature request related to a problem? Please describe. The toolkit, and our software in general, should raise descriptive, custom exceptions instead of re-using built-in exceptions.
Describe the solution you'd like We should have our own exceptions that are fairly specific and provide detailed feedback to the user as to what went wrong and why, and possibly hints at how to resolve the error. For example, this behavior (currently in the code) checks all the boxes:
Today, I went through most of the codebase and recorded when we raise built-in exceptions to get a better picture of how much would be changed by implementing this (nearly) everywhere in the codebase. Some of these (maybe a little less than a half?) probably fit well with existing exceptions, but many will require new exceptions. I have yet to fill out the last column.
NotImplementedError
topology/topology.py
:ValueError
ValueError
ValueError
chemical_environment_matches
got an argument that can't be converted to SMARTSValueError
ValueError
Exception
get_bond_between
Exception
Exception
topology/molecule.py
:Exception
molecule_atom_index
ormolecule_particle_index
when it does not belong to any moleculesValueError
Exception
Exception
molecule_bond_index
when it does not belong to any moleculesValueError
FrozenMolecule.__init__()
ValueError
Exception
FrozenMolecule._add_bond
Exception
Exception
Exception
chemical_environment_matches
got an argument that can't be converted to SMARTSValueError
FrozenMolecule.{to\|from}_iupac
Exception
Exception
from_file
needsfile_format
argument specifedException
to_file
couldn't find a toolkit to do its writing for itValueError
ImportError
from_qcschema
got something that's not JSON-encodableAttributeError
from_qcschema
was passed something without explicit hydrogen mapped SMILES or client otherwise failed to convert input to OFFMolKeyError
FrozenMolecule.remap
was given mapping with a different number of hydrogensValueError
get_bond_between
TypeError
ValueError
typing/engines/smirnoff/io.py
:ValueError
typing/engines/smirnoff/parameters.py
:TypeError
TypeError
TypeError
AttributeError
ParameterList.extend
with something not another instance of itTypeError
.add_parameter
TypeError
/ValueError
.add_parameter
ValueError
check_partial_bond_orders_from_molecules_duplicates
are isomorphicValueError
assign_partial_bond_orders_from_molecules
was told to use user bond orders, but not given anyValueError
k
ValueError
ElectrostaticsHandler
orToolkitAM1BCCHandler
found a particle that's not aTopologyAtom
orTopologyVirtualSite
ValueError
VirtualSiteHandler.add_parameter
ValueError
typing/engines/smirnoff/forcefield.py
:Exception
Exception
KeyError
IOError
RuntimeError
create_openmm_system
ValueError
KeyError
utils/utils.py
:get_data_file_path
failed to get anythingValueError
{a\|de}tach_units
ValueError
ValueError
utils/toolkits.py
(many are copied code across toolkit wrappers):ValueError
ValueError
Exception
from_iupac
)ValueError
RuntimeError
Exception
assign_fractional_bond_orders
was given an OFFMol without conformersException
ValueError
Exception
ValueError
Exception
hydrogens_are_explicit
, but detect implicit hydrogensValueError
ValueError
ValueError
RuntimeError
ValueError
Step 2 here could be to do a similar survey on which exceptions are actually used, possibly considering how often and/or how similar any are to others, to inform what sort of inheritance structure we want.
Describe alternatives you've considered Continuing with built-in exceptions is not ideal, long-term. We've already been slowly moving in this direction - most of our PRs the past few months are fairly aligned with the idea here - but slowly picking away at it won't provide the benefits of a more unified exception structure.
Additional context This idea has been thrown around in a few places (https://github.com/openforcefield/openforcefield/issues/514#issuecomment-585253391 started it) and a few different contexts but I don't think there's a stub issue.
Some other things to consider
openforcefield/exceptions.py
) or, as it stands now, scattered across files closer to where they'd be raised, or something in between?