Open avirshup opened 8 years ago
This may be as simple as needing to store:
And having a convention about implicit hydrogens (my preference: no implicit hydrogens allowed; missing hydrogens are fine)
Strong suggestion: use explicit hydrogens or explicit atom types which define explicit hydrogen counts. For bond orders, make a clear choice, e.g. only integer (formal) bond orders. Also, don't expect any tool to know what you mean with "aromatic" (they don't and you will loose interoperability when you start using this term).
@egonw - I basically 100% agree with all of this.
To do due diligence, though: besides aromaticity (which we can deal with via Kekulization), has anyone run across any other use cases for fractional bond orders?
ParmEd, for instance, uses 1.5 for aromatic and 1.25 for amide. Not sure how this works in actual use cases though:
In [5]: parmed.Bond?
Init signature: parmed.Bond(self, atom1, atom2, type=None, order=1.0)
Docstring:
[...]
order : float, optional
The bond order of this bond. Bonds are classified as follows:
1.0 -- single bond
2.0 -- double bond
3.0 -- triple bond
1.5 -- aromatic bond
1.25 -- amide bond
Default is 1.0
I'll get back to the bigger threads here soon, but the bond order thing piqued my interest and connects with something we've been working on, @avirshup . In our new Open Forcefield initiative SMIRFF forcefield format, we just implement support for partial bond orders such as Wiberg bond orders. These are floats; our interest in them is that they allow the potential for interpolation between single/aromatic/double depending on conjugation, etc. (It's worth noting that the AMBER parameters for, say, a carbon-carbon aromatic bond in benzene fall linearly in between the parameters for a single and a double bond so interpolating based on a partial bond order of 1.5 give you essentially exactly the "right" parameters for that specific case.
There's a lot of basic science to be done on this, but the idea of allowing for this is that it allows the surrounding chemistry (which impacts the partial bond order) to potentially impact the bond or torsional parameters that are assigned in a sensible way without having explicitly specify parameters for every bond order which might be of interest -- you just hit the "normal" ones and let the fractional bond order do the rest of the work via interpolation.
So, um, I'd advocate for supporting fractional bond orders, honestly -- if for no other reason than that we have major plans involving fractional bond orders. :)
@davidlmobley - that sounds really cool!
Could this all be addressed with explicitly alchemical data structures? These would be objects that break all sorts of normal chemical rules - allowing for non-integer atomic numbers, non-integer bond orders, atom types that are linear combinations of other atom types, etc.
EDIT: sorry, got hung up on the "interpolation" thing, when this is actually much more interesting than that - see below
@avirshup -- this is actually a separate thing from free energy calculations entirely. The point is that, in "real chemistry" due to conjugation, etc., bond orders (i.e. Wiberg bond orders) often end up being non-integer due to electrons getting pulled here and there. That is, bonds don't end up being strictly single or double or whatever.
This is related to the issue where nitrogen with three connections can be tetrahedral or planar depending on what's connected to it, or even in between tetrahedral and planar with the right environment. Existing forcefields basically just have to bin everything into "these atom types are tetrahedral" and "these are planar". The though we're operating on here is that building a bit more chemistry -- i.e. via things like partial bond order -- can allow us to catch the "in between cases" just via interpolation.
Relating to alchemical calculations -- I agree there could be interest in defining atom types that are explicitly alchemical (in between other types, etc.), though that's really more an issue of force field parameters than of anything else.
However, that's not what I was talking about here -- here I was talking about tracking information that we will need for molecules in order to be able to assign our force field parameters to the molecules themselves. The one we're thinking about the most at the moment is the partial bond order... :)
I have to run off, but in case I still didn't explain clearly enough, you might just check out this comment as it illustrates a specific example for the case of carbon-carbon bonds like I mentioned above:
https://github.com/open-forcefield-group/smarty/pull/134#issuecomment-244495965
I can also point you to an IPython notebook to play with this if you get interested.
@davidlmobley: Gotcha. When you've got time, yeah, I'd love to see a notebook! And, scientifically, it sounds seems like it could be an interesting intersection of chemical graph theory, force fields, and quantum chemistry - if you have a recommended reference, would definitely be interested in reading up on it.
@avirshup -- here's a notebook, though you'd need an OpenEye license (and the October betas of OpenEye tools) installed to be able to actually fiddle with it: https://github.com/open-forcefield-group/smarty/blob/master/examples/partial_bondorder/test_partialbondorder.ipynb
I don't have a reference yet. We're writing this up starting around now, but it's basically something we came up with Christopher Bayly over the summer that I think has been stuck in his head for a decade or two and is finally coming out... :)
This may be as simple as needing to store:
- bond orders
and connectivity! (Though it's not clear to me how you would denote bond orders if you didn't have this)
- atomic numbers
- formal charges
- And having a convention about implicit hydrogens (my preference: no implicit hydrogens allowed; missing hydrogens are fine)
I think I agree with this, @avirshup , except that extensions probably are presently needed and it also needs to be easily extendable (via something like SD tags used in SDF files) to include any other type of info someone would want to carry along. Obvious candidates for right now would be partial charges and partial bond orders; perhaps partial charges should be provided always but be empty if no charges are available (as in .mol2 files)
You probably also want to allow people to carry around atom types and atom names if they want to...
Lightly edited discussion, migrated from slack:
@avirshup:
@davidlmobley:
@jchodera:
@avirshup:
@jchodera:
@avirshup