Closed vaitkus closed 2 years ago
@vaitkus I would like to point out that ferrocene-like compounds may still be assigned a valid standard InChI/InChIKey if one permits a description as a salt, i.e. the metal giving all its valence electrons to the Cp. Could this be an option for you?
In DW's GUI, Database -> retrieve Wikipedia compounds gives access to (at least a large part of) the chemicals of the English edition of the encyclopedia (compare with the interface here, corresponding open access publication) and lets one assign InChI via Chemistry -> From Chemical Structure -> Add Standard InChI. This approach handles ferrocene and, if Fe(2+) is replaced by "any atom", it works on titanocene, vanadocene and magnesocene, too (some may consider the charge-separated description more ionic for some of these than for others).
At present, this approach covers 22 entries.
@nbehrnd, thank you for the suggestion, but that does not directly address my issue. If I manually remove the zero-order bonds from the example entry 2015345, I can then successfully export it as an InChI that represents the ferrocene-like complex as a salt. However, as you can imagine, such manual removals are neither desirable nor feasible for large datasets, so I would much rather have this simple conversion (removal of coordination bonds) performed automatically by DataWarrior, as seemingly already happens when converting to SMILES.
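To illustrate the requested preprocessing, here is a minimal Python sketch. It is not DataWarrior's code: the function name `split_on_zero_order_bonds`, the tuple-based bond list and the toy ferrocene-like graph (heavy atoms only, ring bond orders simplified to 1) are assumptions made for this example. It drops every bond of order 0 and reports the disjoint fragments that remain, which is what an InChI generator would then receive.

```python
from collections import defaultdict


def split_on_zero_order_bonds(n_atoms, bonds):
    """Drop bonds of order 0 and return the remaining connected fragments.

    bonds: iterable of (atom_i, atom_j, order) tuples with 0-based indices.
    Returns a list of sorted atom-index lists, one list per fragment.
    """
    adjacency = defaultdict(set)
    for i, j, order in bonds:
        if order == 0:  # zero-order (coordination) bond: skip it
            continue
        adjacency[i].add(j)
        adjacency[j].add(i)

    seen = set()
    fragments = []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first traversal over the bonds that were kept
        stack = [start]
        seen.add(start)
        fragment = []
        while stack:
            atom = stack.pop()
            fragment.append(atom)
            for neighbour in adjacency[atom]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        fragments.append(sorted(fragment))
    return fragments


# Hypothetical toy input: atom 0 is the metal, atoms 1-5 and 6-10 are two C5 rings.
ring_a = [(1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1), (5, 1, 1)]
ring_b = [(6, 7, 1), (7, 8, 1), (8, 9, 1), (9, 10, 1), (10, 6, 1)]
metal_bonds = [(0, carbon, 0) for carbon in range(1, 11)]

fragments = split_on_zero_order_bonds(11, ring_a + ring_b + metal_bonds)
print(fragments)  # [[0], [1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
```

The metal and the two rings come out as three separate entities, i.e. the salt-like description discussed above. Charges are not assigned here; a real implementation would have to handle them as well.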
@vaitkus Assuming you used cod-tools for the transfer .cif -> .sdf, Kekulization seems to be a problem:
$ codcif2sdf 2015345.cif > cod_2015345.sdf
==============================
*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is 2015345)
1 molecule converted
This appears to propagate when Jmol is asked to display the structure (without automatic bond computation by Jmol), i.e. a double bond is missing in one of the Cp rings.
Greg Landrum's Cookbook for RDKit has an entry on how to display dative bonds differently from single bonds. Perhaps such a filter could likewise be adapted to trim off organometallic bonds, though it might then fail for not being performant enough for the COD.
Zero-order bonds are no longer passed to the InChI builder.
@thsa, thank you for the quick fix. It now works as expected.
@nbehrnd, the codcif2sdf script was our initial attempt at deriving bond orders, charges, etc. from crystallographic structures, and it relied on OpenBabel. However, since the results were not satisfactory for several classes of compounds (as your example points out), we developed our own tool for this purpose (not yet published). The codcif2sdf and molcif2sdf scripts are not really used anymore and are mainly maintained for compatibility purposes.
Our current approach to describing metal complexes might be somewhat unconventional, but we will be happy to adapt it once (if) there is a clear consensus on the notation in the wider community.
If you have any further questions or comments on this topic, feel free to drop me a personal email (I think we have corresponded before).
Issue was tested with the latest available commit (c2e1776).
The attached example file InChI-example.txt is a DWAR file that contains two entries. One of the entries (COD ID 2236289) contains only regular bonds and can thus be saved as SMILES, InChI and InChIKey. The other entry (COD ID 2015345) contains zero-order bonds and thus cannot be exported as InChI or InChIKey, although it can still be exported as SMILES.
Steps to reproduce the issue:
Result:
This is fully understandable, since InChI seemingly does not currently support bond types other than single, double, triple and aromatic [1]. However, maybe the zero-order bonds could be automatically removed by DataWarrior before attempting this conversion? The user would end up with a set of disjoint molecular entities instead of a single molecule (e.g. a metal complex), but that would still be better than the current result. In fact, DataWarrior already seems to do this when converting to the SMILES representation: the entry with COD ID 2015345 is successfully exported as SMILES.
[1] InChI Trust Technical FAQ, Section 16.3: https://www.inchi-trust.org/technical-faq-2/#16.3