Open tlfobe opened 1 year ago
Thanks for the great writeup, @tlfobe! I'd thought about doing something like this in #471 but got scared away by how complex the ToolkitWrapper
and ToolkitRegistry
interface already ended up being. I think I'm still generally against this change because of 1) the implementation cost, complexity, and maintenance burden it would add, 2) the importance of committing to a stable plugin interface for downstream devs, and 3) the likelihood that this would introduce silent errors.
As a recap, the call stack for something like offmol.assign_partial_charges
is:
Molecule.assign_partial_charges
, toToolkitRegistry.call
, to ToolkitWrapper.assign_partial_charges
, toThe first three calls are in our public API, so we must ensure that they have the right behavior both when called directly by the user and also when called by by higher methods in the stack.
Also, as ToolkitRegistry.call
iterates through each ToolkitWrapper
s it contains...
ToolkitWrappers
until it either succeeds or returns a mega-error containing output from each ToolkitWrapper
it tried. (This is a bonkers system that I regret implementing, but it's in production now and it gets the job done, and it's frankly hard to think of a simple alternative. I keep meaning to get around to documenting this but I haven't had time to work on the developer docs in a few years :-( )
So to add support for arbitrary kwargs would require rethinking/modifying the function signature of several things in Molecule
/ToolkitRegistry
/ToolkitWrapper
(or adding burdensome kwarg registries for all TKW
methods), determining which exception types may be raised by TKW
s for unrecognized kwargs (or by new failure modes triggered by the value of a kwarg), and how Molecule
methods should react to each type of error (that is, which raise_exception_types
value it should pass in).
So... it would be a lot of work for us to implement this correctly.
It wouldn't just take our time - All external folks who maintain ToolkitWrappers
will also have the obligation put on them to implement a kwarg registry and raise the correct sorts of exceptions/have the right behaviors that Molecule.assign_partial_charges
expects when it sets raise_exception_types
.
Strategically, downstream developers are super important for OpenFF. We have very limited personal bandwidth to engage with other devs, so our long-term strategy needs to rely on passive incorporation of our tools. Our "contract" with downstream devs is the plugin interface, so to change that and put work on their shoulders is a big ask. Having a plugin interface that only has positional and named kwargs is a commitment on our part to not slip changes through without an easily-discoverable commit trail.
Method signatures ending in **kwargs
have a bad smell to me - I've had several experiences where this leads to methods silently ignoring something important and giving users the wrong results. I think a correctly-implemented kwarg handler wouldn't have this problem, but bugs where a kwarg handler lets in something it shouldn't could land in the worst category of errors ("silent but deadly"). Matplotlib/pyplot come to mind for this (but at least those outputs are immediately visualized so users can spot errors!)
So, given that there are available workarounds[1] and that this would be a lot of work for us and other to implement and maintain, I probably wouldn't accept a PR on this. I do appreciate that you opened this issue to discuss it, though! I'll leave this open for discussion, and while I could imagine that we could find a design that resolves my concerns with items 2 and 3, the limited benefit doesn't seem like it'd be worth the work of overcoming item 1.
[1] instead of changing any source code, I think you could make a new ToolkitWrapper that has your desired changes to charge assignment, import it, run TLFobesToolkitWrapper.assign_partial_charges(offmol)
, and then do ff.create_openmm_system(top, charge_from_molecules=[offmol])
Is your feature request related to a problem? Please describe. I'm using the OpenFF-toolkit to parameterize some larger systems (300+ atoms), and I currently have a bootleg work around, where I hardcode the OpenEye
assign_partial_charges
method to have the keyword argumentmaxAtoms = 500
, letting me parameterize larger systems. I can then install this local version to my conda environment, and things work. (yes these take forever to parameterize, usually I can get something to finish parameterizing in ~12-24 hours if I'm using the OpenEye implementation of AM1-BCC).I was thinking it might be useful to other users to be able to pass keyword arguments to the underlying toolkits, so that in cases like this there is a way to access some of the options in those toolkits.
I could see where this might be problematic since not all the toolkits have the same keyword arguments, but maybe a keyword argument handler could be used here to elegantly pass the right keyword parameters to the underlying toolkit functions and throw a warning if keywords are not used.
Describe the solution you'd like Adding some instances where keyword arguments to underlying toolkit are passeable to some of the top-level funcitons would make the toolkit a little more flexible for non-standard uses. In my case the
assign_partial_charges
, but I think there are several instances where something like this might be desireable.Initial Example
I have a crude working example here with changes to the
Molecule.assign_partial_charges
method:Where I just pass a **kwargs to the
assign_partial_charges
method. I also changed the openeye_wrapper's version of theassign_partial_charges
method to accept kwargs as well.I'd be happy to work on a PR for this if it's of interest to the OpenFF community!