Closed BobHanson closed 6 years ago
We have rested upon the types already presented in the CIF format. But of course any other reasonable types are welcome. So you could just propose a list of additional types and their symbols.
@BobHanson, can you give a simple list of additional bond types and definitions as a reply to this issue? Don't worry about taking the time to be exhaustive, just include what you would like to see (hex, penta, aromatic, partial, partialdouble - is that the lot?)
Here is Jmol's repertoire. Many, I am sure, are irrelevant. I can't remember the difference between Partial23 and Partial32.
private enum EnumBondOrder {
  SINGLE(BOND_COVALENT_SINGLE,"1","single"),
  DOUBLE(BOND_COVALENT_DOUBLE,"2","double"),
  TRIPLE(BOND_COVALENT_TRIPLE,"3","triple"),
  QUADRUPLE(BOND_COVALENT_QUADRUPLE,"4","quadruple"),
  QUINTUPLE(BOND_COVALENT_QUINTUPLE,"5","quintuple"),
  sextuple(BOND_COVALENT_sextuple,"6","sextuple"),
  AROMATIC(BOND_AROMATIC,"1.5","aromatic"),
  STRUT(BOND_STRUT,"1","struts"),
  H_REGULAR(BOND_H_REGULAR,"1","hbond"),
  PARTIAL01(BOND_PARTIAL01,"0.5","partial"),
  PARTIAL12(BOND_PARTIAL12,"1.5","partialDouble"),
  PARTIAL23(BOND_PARTIAL23,"2.5","partialTriple"),
  PARTIAL32(BOND_PARTIAL32,"2.5","partialTriple2"),
  AROMATIC_SINGLE(BOND_AROMATIC_SINGLE,"1","aromaticSingle"),
  AROMATIC_DOUBLE(BOND_AROMATIC_DOUBLE,"2","aromaticDouble"),
  ATROPISOMER(TYPE_ATROPISOMER, "1", "atropisomer"),
  UNSPECIFIED(BOND_ORDER_UNSPECIFIED,"1","unspecified");
On Wed, Mar 21, 2018 at 11:58 PM, jamesrhester notifications@github.com wrote:
@BobHanson https://github.com/bobhanson , can you give a simple list of additional bond types and definitions as a reply to this issue? Don't worry about taking the time to be exhaustive, just include what you would like to see (hex,penta, aromatic,partial, partialdouble - is that the lot?)
-- Robert M. Hanson Professor of Chemistry St. Olaf College Northfield, MN http://www.stolaf.edu/people/hansonr
If nature does not answer first what we want, it is better to take what answer we get.
-- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900
Great - can you give a one sentence description of the ones that aren't obvious (I have no idea about 'strut' or any of the partials or atropisomer). Also, I forgot to say that it is desirable that these bond types shouldn't overlap so that we can make it easy on the data miners. For example, when does a partial bond become a full bond? Is there a clear demarcation? If not, we may want to simply say 'no bond' instead of partial and write in the definition that 'no bond' includes partial bonds.
If we want to adjust the format for any bond type, we could admit numbers (1, 2, ...) in the _topol_bond.type item and introduce two additional items: _topol_bond.type_id, which must be equal to the number specified in _topol_bond.type, and _topol_bond.type_description, with any description of this type of bond.
You can ignore the non-obvious ones. I was just showing the (roughly) full range of what Jmol can do. Actually, Jmol can draw any partial bond (solid and dashed both) using a binary number description -- for example, 5 (binary 101) means "solid,dashed,solid" -- but I would not foist that on anyone. I would say the most useful would be (with possible rendering interpretations):
single through hex (bond orders up to six, just for completeness), partial-0.5 (a dashed line, probably, when rendered), partial-1.5 (a solid and a dashed line), partial-2.5 (two lines and a dash), aromatic (probably like partial-1.5, but possibly a circle in an n-gon), aromatic-single (single, but tagged as aromatic), aromatic-double (double, but tagged as aromatic)
The problem with all of these is that they are subjective interpretations. But, I guess, so is all topology (?)
Bob
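For readers who want to see the binary description in action, here is a minimal Python sketch. It is not Jmol's code. The function name `bond_pattern` is made up for illustration, and it assumes that binary digits are read left to right with 1 meaning solid and 0 meaning dashed. The only example given above (5, binary 101) is symmetric, so the reading direction is a guess.

```python
def bond_pattern(code):
    """Decode an integer into a list of line styles, one per binary digit.

    Assumption: digits are read left to right, 1 = solid, 0 = dashed.
    """
    if code < 1:
        raise ValueError("code must be a positive integer")
    return ["solid" if bit == "1" else "dashed" for bit in format(code, "b")]


print(bond_pattern(5))  # ['solid', 'dashed', 'solid']
```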
We can either go in the direction suggested by @Blatov, and introduce a separate table of bond types that can be expanded as needed, and referred to in _topol_bond:
_topol_bond_type.id
_topol_bond_type.description
sh 'A single hex bond'
p25 'A 2.5 partial bond'
or we can just add some extras to the current list. Are there any preferences? Part of this depends on the purpose of this data name - e.g. from @BobHanson 's point of view, it provides information on how links could be displayed but is not used further (I assume). Is there another use for this information that might be stricter (e.g. characterisation of lattice edges as strong or weak that can't otherwise be carried out based on distance etc.)?
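To make the table option concrete, the following Python sketch reads such a two-column table into a dictionary. It is only an illustration: the `loop_` keyword is assumed (the example above does not show it), `read_bond_types` is a made-up name, and `shlex` merely approximates CIF quoting, so a real CIF parser should be used for actual files.

```python
import shlex

CIF_TEXT = """\
loop_
_topol_bond_type.id
_topol_bond_type.description
sh 'A single hex bond'
p25 'A 2.5 partial bond'
"""


def read_bond_types(text):
    """Return {id: description} from a two-column _topol_bond_type loop."""
    names, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_":
            continue
        if line.startswith("_"):
            names.append(line)             # data name in the loop header
        else:
            rows.append(shlex.split(line))  # one row: id and quoted description
    if names != ["_topol_bond_type.id", "_topol_bond_type.description"]:
        raise ValueError("unexpected loop header")
    return {bond_id: description for bond_id, description in rows}


print(read_bond_types(CIF_TEXT))
```

Note that the descriptions come back as free text, which is exactly the part a program cannot act on without further agreement.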
A predefined set of the most prominent bond types plus the means to extend it would be the best solution, IMO. I think that machine-parsing the natural language definitions from _topol_bond_type.description is the last thing one would want to do. However, for really exotic cases there is not much else to do. In any case it would be great if novel bond types could be added to the enumeration in the dictionary.
So I think we should accept the _topol_bond_type.id and _topol_bond_type.description items.
I agree with Blatov, we should accept the _topol_bond_type.id and _topol_bond_type.description items.
excellent. Thank you.
On Fri, Apr 6, 2018 at 3:43 AM, Davide M Proserpio <notifications@github.com> wrote:
I agree with Blatov, we should accept the _topol_bond_type.id and _topol_bond_type.description items.
Only question I have is this:
Does this mean that, say, I could define "p25" to mean one thing in one CIF, and you could define it to be another in a different CIF? Is that a problem?
Bob
On Fri, Apr 6, 2018 at 8:05 AM, Robert Hanson hansonr@stolaf.edu wrote:
excellent. Thank you.
sorry -- hit send too quickly.
Also,
"single hex" is an oxymoron. A bond can't be both a single bond and a hex bond. Wouldn't "sh" just be "6"?
Bob
On Fri, Apr 6, 2018 at 8:07 AM, Robert Hanson hansonr@stolaf.edu wrote:
Only question I have is this:
Does this mean that, say, I could define "p25" to mean one thing in one CIF, and you could define it to be another in a different CIF? Is that a problem?
Bob
If the bond is user-defined, the name and definition are up to the user. So, yes, the same _topol_bond_type.id could be assigned to different kinds of bonds in different CIFs. But I do not see any real problem here: for example, the same atom type can also mean different things to different authors.
And indeed we should correct 'A single hex bond' to just 'A hex bond'. We can use 6 or hx, for example, for _topol_bond_type.id in this case.
I didn't think through the full implications of @Blatov 's proposal, but fortunately @BobHanson is alert. It is essentially unworkable to have a custom table for bond types in every CIF. The reason it is unworkable is that the CIF dictionary is supposed to be defining data names that allow automated processing of files. Any value that depends on free-form text is useless for machine processing, so, for example, @BobHanson couldn't display any bonds by bond type in his software because his software would have to understand arbitrary text strings first.
A more philosophical objection is that the standard represents an agreement on meaning between two parties that are not otherwise in contact. If a data file can contain arbitrary bond classifications, then there is essentially no agreement on bond types and the information is not suitable for the standard. That is why any list of bond types should be in the TopoCif dictionary. We can add to this list in the future (but never subtract).
So I am in favour of the original scheme, with the subset of bonds that we do agree on and understand explained in the dictionary (not the data file).
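The p25 concern can be shown with a small Python sketch. It assumes each file's custom table has already been read into a dictionary. The function name `conflicting_ids` and the second description are invented for the example.

```python
def conflicting_ids(table_a, table_b):
    """Return ids that both tables define, but with different descriptions."""
    shared = table_a.keys() & table_b.keys()
    return sorted(i for i in shared if table_a[i] != table_b[i])


# Hypothetical per-file tables; only the first p25 text comes from this thread.
file_1 = {"p25": "A 2.5 partial bond", "hx": "A hex bond"}
file_2 = {"p25": "Partial bond, 25% occupancy", "hx": "A hex bond"}

print(conflicting_ids(file_1, file_2))  # ['p25']
```

A program can detect that the two definitions differ, but it cannot decide what either one means. That is the argument for keeping the list in the dictionary.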
OK, I agree that we should not give the user too much freedom here. But at the same time it would be good if the list had some flexibility. We could keep the _topol_bond_type.id and _topol_bond_type.description items but allow only predefined bond types for _topol_bond_type.id. This could be important if the user wants to provide some additional information on the bonding. And for flexibility we could predefine one more type, ud, meaning 'user defined'; this type would designate some special bond, and the program could output its description if required.
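A minimal Python sketch of how a program might handle this proposal, under stated assumptions: the set of predefined ids below is only a guess built from codes mentioned in this thread (sg, db, tr, qd, plus ud), and `bond_label` is a made-up helper, not part of any dictionary or library.

```python
# Codes taken from this thread; the final dictionary enumeration may differ.
PREDEFINED_IDS = {"sg", "db", "tr", "qd", "ud"}


def bond_label(bond_id, description=None):
    """Return the text a program might display for one bond type row."""
    if bond_id not in PREDEFINED_IDS:
        raise ValueError(f"unknown _topol_bond_type.id: {bond_id!r}")
    if bond_id == "ud":
        # A user-defined bond is only meaningful together with its description.
        if not description:
            raise ValueError("type 'ud' needs a _topol_bond_type.description")
        return f"ud ({description})"
    return bond_id


print(bond_label("db"))                        # db
print(bond_label("ud", "A 2.5 partial bond"))  # ud (A 2.5 partial bond)
```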
I strongly agree with @jamesrhester. We need to have machine-readable topology descriptions.
My suggestion for a starting point is:
bonds: single through hex
partial bonds: 0.5, 1.5, 2.5, 3.5, 4.5, 5.5 (just for completeness)
hydrogen bonds
That would certainly cover anything of general interest in Jmol.
These could be coded any way you want. Anything wrong with actually using numbers there for all those 1-6 and 0.5, 1.5, ...? Or does everything have to be a code like s,d,t,..?
Then the only special one would be hydrogen bonds, which might be special anyway, because they can have energies attached -- though perhaps that is outside the scope of this CIF format.
Bob
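If plain numbers were used, validating a value would be simple. The Python sketch below is one possible reading of the suggestion, not an agreed encoding: it accepts whole and half-integer orders from 0.5 to 6 and treats hydrogen bonds as a separate code. The code `hb` and the name `parse_bond_type` are assumptions made for illustration.

```python
from fractions import Fraction

# Whole and half-integer orders from 0.5 to 6, as in the list above.
ALLOWED_ORDERS = {Fraction(n, 2) for n in range(1, 13)}
HBOND_CODE = "hb"  # assumed code for hydrogen bonds


def parse_bond_type(value):
    """Return 'hb' for a hydrogen bond, else the bond order as a Fraction."""
    if value == HBOND_CODE:
        return HBOND_CODE
    order = Fraction(value)  # raises ValueError for non-numeric text
    if order not in ALLOWED_ORDERS:
        raise ValueError(f"unsupported bond order: {value}")
    return order


print(parse_bond_type("1.5"))  # 3/2
print(parse_bond_type("6"))    # 6
```

`Fraction` is used instead of `float` so that values such as 2.5 compare exactly.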
On Mon, Apr 9, 2018 at 12:47 PM, Andrius Merkys notifications@github.com wrote:
I strongly agree with @jamesrhester https://github.com/jamesrhester. We need to have machine-readable topology descriptions.
I think instead of introducing many types for different bond orders, it would be better to introduce just one more item, _topol_link.order, which would be a real number [0..infinity]. And we could keep special types only for single, double, triple and quadruple bonds, so keep the list of types as it is now in the current version of the dictionary. We can forbid user types for bonds, but in any case I think we need a field _topol_link.description for an arbitrary description of the bond features. There is a similar item in the ATOM category: _atom_site_description.
I also noticed that there is one mention of _topol_bond instead of _topol_link in the dictionary, where we explain the TOPOL_REPRES subcategory. This should be fixed.
I don't understand how _topol_link.order would interact with _topol_link.type. If I have a double bond, should I include both _topol_link.type of db and _topol_link.order of 2? If I have an aromatic bond, or a Van der Waals link, what would _topol_link.order of 2 mean? I'm not necessarily against the idea, but perhaps @Blatov could write a definition for _topol_link.order explaining the precise usage. Is the intention that this is purely for display purposes or is there chemical or topological significance?
Meanwhile, I agree that we should add _topol_link.special_details for any user-specific information.
_topol_link.order should be understood in its common chemical meaning as the bond order (number of electron pairs per bond). But it should not be a mandatory field. It could extend the bond description and supplement the _topol_link.type information. If we want to describe the bond order we should use the general type of valence bond (v). These two constructions
_topol_link.type v
_topol_link.order 2
and
_topol_link.type db
are equivalent.
The construction
_topol_link.type sg
_topol_link.order 2
is, strictly speaking, conflicting, but the program can ignore the _topol_link.order field if the order is predetermined by _topol_link.type (sg, db, tr, qd).
For van der Waals interaction the following data could be feasible:
_topol_link.type vw
_topol_link.order 0.01
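The resolution rule described in this comment could be sketched in Python as follows. This is an illustration under assumptions, not dictionary text: the mapping of sg, db, tr and qd to orders 1 to 4 is inferred from the single, double, triple and quadruple types mentioned earlier, and `effective_order` is a hypothetical helper.

```python
# Types whose order is predetermined (values inferred from the thread).
FIXED_ORDER = {"sg": 1.0, "db": 2.0, "tr": 3.0, "qd": 4.0}


def effective_order(link_type, order=None):
    """Return the bond order implied by a (type, order) pair, or None."""
    if link_type in FIXED_ORDER:
        # The type already fixes the order, so a supplied order is ignored.
        return FIXED_ORDER[link_type]
    return order


print(effective_order("v", 2) == effective_order("db"))  # True
print(effective_order("sg", 2))                          # 1.0
print(effective_order("vw", 0.01))                       # 0.01
```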
The scheme suggested by @Blatov is a bit messy in that it allows contradictory information to be easily presented. I suggest that we stick with bond types that are qualitatively different (v, vw, pi, hb and ar?) and then _topol_link.order can be used to distinguish the different number of electron pairs involved. If a bond has no concept of strength, _topol_link.order can be ignored. How does that sound?
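With qualitatively different types and an optional order, a contradictory pair such as sg with order 2 can no longer be written. A minimal Python sketch of that check, assuming the five type codes listed above and the non-negative range proposed earlier for the order; `make_link` is a made-up name:

```python
LINK_TYPES = {"v", "vw", "pi", "hb", "ar"}


def make_link(link_type, order=None):
    """Validate one link and return it as a small record (a dict)."""
    if link_type not in LINK_TYPES:
        raise ValueError(f"unknown _topol_link.type: {link_type!r}")
    if order is not None and order < 0:
        raise ValueError("_topol_link.order must not be negative")
    return {"type": link_type, "order": order}


double_bond = make_link("v", 2.0)  # valence bond with two electron pairs
hydrogen_bond = make_link("hb")    # no order given, so none is recorded
print(double_bond, hydrogen_bond)
```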
This is exactly what I meant; sorry if I was not clear enough. _topol_link.order should be optional and used only if we want to specify the bond strength.
See update e14c81b and let me know if this is satisfactory. Feel free to make your own edits to the dictionary.
I think we should leave specific bond in the list of bond types. This type could be used for any bonding intermediate between valence and van der Waals, like halogen, chalcogen or other recently proposed types of weak bonding.
How should specific bond be used? Is it just the value sb for _topol_link.type and the details given in special_details? Or is there somewhere else that halogen, chalcogen etc. should be specified?
Yes, this is just an additional value for _topol_link.type. We used s in the previous dictionary edition, but sb is probably better. The point is that we need a special type for contacts that are neither valence nor van der Waals. There can be a lot of subtypes for them, as for valence contacts, but the details can be described in special_details. Actually we already use one subtype for valence bonds (ar) and one subtype for specific bonds (hb), but we also need values for the general types (valence and specific). Halogen, chalcogen etc. are subtypes of specific bonds, but there is no need to use special values for them at the moment.
I've added sb and _topol_link.special_details to the list of bond types in 6ad75db.
Closing issue as there seem to be no further suggestions or objections.
There are actually more bond types. For example, https://www.sciencedirect.com/science/article/pii/S0009261410004008 describes the possibility of a hex bond. Jmol can represent this; I would like to see 5- and 6-bond options here.
In addition, there are all sorts of partial bonding possibilities. In Jmol we can represent aromatic, partial, or partialDouble among many other possibilities.