ComPlat / chemotion_ELN

Electronic Lab Notebook
https://www.chemotion.net
GNU Affero General Public License v3.0

Display of (organo)metallic structures in the molecule editor and sample/reaction lists #1551

Open schatzsc opened 11 months ago

schatzsc commented 11 months ago

The display of organometallic and coordination compounds is still a big mess, even with the latest version of Ketcher and the ChemDrawJS plug-in; see the attached pictures for examples.

In many cases, implicit hydrogen counts have to be set to zero and non-standard valences defined in order to force Chemotion to calculate a correct sum formula. These overlap with the actual structure and make it nearly unreadable, in particular in the little preview picture in list view and when displayed on a small tablet or laptop screen.

An option to switch display of these extra attributes off would be the minimum work-around but actually the messed up structures take away one major bonus of the use of the ELN - to automatically create a report. If in the generated report, all schemes and figures afterwards have to be replaced again by nicely drawn structures from a separate ChemDraw sketch, what's the use of drawing them in Chemotion beforehand?

organometallics organometallics2

schatzsc commented 11 months ago

Another example here: (top) nicely drawn structure in ChemDraw and (bottom) what Chemotion did to it after import as a molfile into Ketcher:

chemdraw organometallics3

schatzsc commented 11 months ago

Still want to mark this issue as very urgent: today we spent over an hour modifying the structure drawing of an NHC (N-heterocyclic carbene) complex so that it showed the correct H atom count. Ketcher and/or Chemotion added "implicit hydrogens" in a way that is very hard to figure out, and had major problems handling the charges.

For example, if only one of two molecular fragments is charged and the other is not, then Ketcher/Chemotion adds charges to the other (neutral) fragment to arrive at an overall charge-balanced structure.

For a more detailed discussion of the issues of hydrogen representation in molfiles, see DepthFirst blog here

schatzsc commented 10 months ago

Just another example of the mess that is created when you try to obtain a correct sum formula on a totally standard organometallic NHC complex:

chemotion organometallics

schatzsc commented 10 months ago

An even more serious problem with metal complexes prevents us from obtaining a correct sum formula, although it seems to lie more with Chemotion itself and the way it calculates the sum formula than with Ketcher.

This is the molecule as drawn in Ketcher:

chemotion sum formula problem 2

Ketcher calculates the correct sum formula if one uses the blue "info" icon:

chemotion sum formula problem 1

After the editor is closed, Chemotion displays an incorrect sum formula which is one H too high, H30 instead of correct H29:

chemotion sum formula problem 3

However, the molfile as displayed by both Ketcher and inside Chemotion is correct, at the end it includes the proper charges as

M CHG 4 35 1 37 1 38 -1 39 -1

and the valences of the carbene C atoms

-2.7212 0.2853 0.0000 C 0 0 0 0 0 3 0 0 0 0 0 0
0.7997 0.2853 0.0000 C 0 0 0 0 0 3 0 0 0 0 0 0
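To make the quoted molfile fragments easier to check, here is a minimal Python sketch that reads the charge line and the valence field of an atom line. It assumes V2000 syntax and splits on whitespace instead of using the fixed column widths of the format, so it only suits lines like the ones quoted in this comment. The function names `parse_chg_line` and `parse_atom_line` are made up for illustration and are not part of Chemotion, Ketcher or OpenBabel.

```python
def parse_chg_line(line):
    """Parse a V2000 'M  CHG' property line into {atom_index: charge}."""
    tokens = line.split()
    if tokens[:2] != ["M", "CHG"]:
        raise ValueError("not an M CHG line")
    count = int(tokens[2])                    # number of (atom, charge) pairs
    values = [int(t) for t in tokens[3:]]
    if len(values) != 2 * count:
        raise ValueError("pair count does not match the declared count")
    return dict(zip(values[0::2], values[1::2]))


def parse_atom_line(line):
    """Return (element symbol, valence field) of a V2000 atom line.

    Assumption: fields are whitespace-separated; token 3 is the symbol and
    token 9 is the valence field (0 means no explicit valence is set).
    """
    tokens = line.split()
    return tokens[3], int(tokens[9])


charges = parse_chg_line("M CHG 4 35 1 37 1 38 -1 39 -1")
print(charges, "net charge:", sum(charges.values()))
print(parse_atom_line("-2.7212 0.2853 0.0000 C 0 0 0 0 0 3 0 0 0 0 0 0"))
```

For the lines quoted here, the four charges cancel to a net charge of zero and both carbene carbons carry an explicit valence of 3.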

pt_complex.mol.txt

This seems to be due to a problem with the routine that Chemotion uses to calculate the sum formula. It is a critical bug for organometallic compounds, as we have been unable to adjust charges and valences in a way that yields the proper sum formula.

nicolejung commented 10 months ago

I used the file that you provided as mol.txt and got the attached result with 1.8.0:

Screenshot (1082)

schatzsc commented 10 months ago

Yes, that's correct - 29H from three methyl groups, two mesityl aromatic protons and two NHC backbone CH in each of the "wings" and H3 from the central pyridine ring.
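The count can be written out as a short Python sketch. It assumes the complex has two identical "wings" (consistent with the two carbene C atoms quoted earlier) and that each methyl group contributes 3 H; the variable names are only illustrative.

```python
H_PER_METHYL = 3

# Hydrogens in one "wing" of the ligand
methyl_h = 3 * H_PER_METHYL   # three methyl groups on the mesityl ring
mesityl_aromatic_h = 2        # two aromatic protons on the mesityl ring
nhc_backbone_h = 2            # two backbone CH of the NHC ring
wing_h = methyl_h + mesityl_aromatic_h + nhc_backbone_h

WINGS = 2                     # assumption: two identical wings
pyridine_h = 3                # central pyridine ring

total_h = WINGS * wing_h + pyridine_h
print(total_h)
```

Each wing contributes 13 H, so two wings plus the pyridine ring give 2 x 13 + 3 = 29.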

So some change from 1.7.3 to 1.8.0 made the difference ...

Our colleague from the IT center still has to develop an update schedule - generally we plan to have new Chemotion versions installed a week or two following major OS updates and will give each 1.x.0 version an additional little time in case minor follow-up updates become necessary.

Let's wait until we have 1.8.0 in use for a while before closing this issue, as we might dig up further interesting test cases.

We also need the NMR file import issues fixed before the next update - then we might move this ahead in the schedule

nicolejung commented 10 months ago

I added "urgent" as we started discussions about matching at least the most important requirements for inorganic drawings. Still it will need more resources and we need to clarify some general issues before. We are aware of the topics and will try to solve them - it needs time. In the meantime, we need to collect examples to avoid problems for those cases where we already have solutions. Clarification/Collection of structures with @JanCBrammer would be great

schatzsc commented 10 months ago

I'm in touch with Jan but as he will go on an extended holiday soon (or has already left for Aotearoa) we can only discuss this in mid-December 2023. I also think that illustrative examples should be discussed interactively - GitHub isn't really the best forum for that and describing in text is tedious - better have one person describe verbally the issues and show in a structure editor and another person taking notes, best somebody close to the coding. Let's give it another 6-8 weeks.

schatzsc commented 9 months ago

Tried the attached molfile again with the freshly installed version 1.8.0 and it still does not work - would need 13H but still get an incorrect hydrogen count of 14, even with the carbene C valence set to 2, which already is counterintuitive since it has 3 bonds.

Interestingly, the same file imported into ChemDraw Professional v19.0.1.28 gets me the correct sum formula of C13H13BrF6N5PPt
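A small Python sketch for comparing two sum formulas element by element can make such discrepancies explicit. It handles only flat formulas without brackets or charges. The "observed" formula below is an assumption built from the ChemDraw formula with the reported hydrogen count of 14; the helper names are hypothetical.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count the atoms in a flat sum formula such as 'C13H13BrF6N5PPt'."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return dict(counts)


def formula_diff(expected, observed):
    """Return {element: (expected_count, observed_count)} where they differ."""
    a, b = parse_formula(expected), parse_formula(observed)
    return {
        el: (a.get(el, 0), b.get(el, 0))
        for el in sorted(set(a) | set(b))
        if a.get(el, 0) != b.get(el, 0)
    }


expected = "C13H13BrF6N5PPt"   # formula reported by ChemDraw
observed = "C13H14BrF6N5PPt"   # assumed: same formula with the reported 14 H
print(formula_diff(expected, observed))
```

This only shows which element count differs; it cannot tell which heavy atom the extra hydrogen was attached to.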

The main problem is that you don't even see from which heavy atom the "surplus H" is coming?!?

The editor inside Chemotion was the standard ketcher-rails, but as a regular user I cannot see which version it is.

Furthermore, the ketcher-internal info (? icon) sum formula is correctly given as C13H13...

Screenshot 2023-11-23 at 09-00-24 Chemotion carbene_complex.mol.txt Screenshot 2023-11-23 at 09-07-38 Chemotion

schatzsc commented 9 months ago

The problem with the incorrect sum formula quite clearly seems to trace back to OpenBabel called by Chemotion, as becomes evident from analyzing a combination of the "inchi" and "molreport" outputs of OpenBabel, where the culprit is identified as a broken Pt-Br bond, which the routine tries to "heal" by representing the now "lonely" Br as HBr.

I have opened a corresponding issue on the OpenBabel GitHub, but I think we need to follow up on it here and not simply wait for OB to fix it:

https://github.com/openbabel/openbabel/issues/2656

nbehrnd commented 9 months ago

I'm not a user of the ELN here, but because of the issue addressed by @schatzsc in the forum of openbabel (here) I would like to equally share my speculation here: perhaps the underlying cause is using single bonds between the pincer complex, and the ion to be chelated. (And by consequence, sticking strictly to the octet rule, the sketcher marks atoms in red for going beyond the usual valence rules.)

Besides the bond orders single, double, and triple, the .sdf/.mol syntax (I refer to the v3000 dialect!) also offers the dative bond of type 9 (cf link to a pdf on archive here, page 11). Can you please check whether the program you use also supports this type?
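One way to check whether an exported file actually contains such bonds is to scan its V3000 bond block. The Python sketch below does this under simplifying assumptions: bond lines have the form `M  V30 index type atom1 atom2 ...`, and continuation lines are ignored. The sample block is a made-up fragment, not taken from any file in this thread, and `type9_bonds` is a hypothetical helper name.

```python
def type9_bonds(molblock):
    """Return (bond_index, atom1, atom2) for every V3000 bond of type 9."""
    found = []
    in_bond_block = False
    for line in molblock.splitlines():
        tokens = line.split()
        if tokens[:2] != ["M", "V30"]:
            continue
        body = tokens[2:]
        if body[:2] == ["BEGIN", "BOND"]:
            in_bond_block = True
        elif body[:2] == ["END", "BOND"]:
            in_bond_block = False
        elif in_bond_block and len(body) >= 4:
            index, bond_type, atom1, atom2 = (int(t) for t in body[:4])
            if bond_type == 9:
                found.append((index, atom1, atom2))
    return found


# Hypothetical bond block: bond 2 joins atoms 3 and 4 with type 9
SAMPLE = """\
M  V30 BEGIN BOND
M  V30 1 1 1 2
M  V30 2 9 3 4
M  V30 END BOND
"""
print(type9_bonds(SAMPLE))
```

An empty result for a file exported from the editor would suggest that the editor wrote its metal-ligand bonds with another bond type.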

It was an issue in a recent contribution to Avogadro addressed in a PR here which started with the IMes ligand and now depicts for instance the empty terpy as shown below -- in your case, the place holder disk would have to be substituted by Zn, Pt, or your ion of choice.

terpy

nbehrnd commented 9 months ago

An example with Marvin JS (test page):

example_marvin

to yield the mol file attached below. It is processed well e.g. in openbabel by

$ obabel marvinjs_untitled_file.mol -O test.png -xu
1 molecule converted

to yield

test

The -xu option was used only to disable the element-specific colors; otherwise Pt could be considered a bit too pale against a white background.

marvinjs_untitled_file.mol.txt

schatzsc commented 9 months ago

As has been extensively discussed in the IUPAC InChI Organometallics working group, metal-ligand bonds MUST NOT be broken because this will lead to ambiguity for example with ambidentate ligands with two binding sites A and B, in which you can have two constitutional isomers LM-A-B or A-B-ML (with L = ancillary ligands) which you cannot distinguish anymore when the M-L bonds are removed, and also since this leads to complete and irrecoverable loss of stereochemical information, for example cis- and transplatin.

The corresponding PubChem entries for those compounds are particularly pathological cases of this and - to speak with the words of Wolfgang Pauli - so bad they are not even wrong:

https://pubchem.ncbi.nlm.nih.gov/compound/cisplatin

Furthermore, although the "dative" bonds are beloved by some chemists in particular in main group organometallic chemistry, they actually have no relevancy in the context of inorganic and organometallic chemistry, since all metal-ligand bonds are more or less polar covalent bonds, unless you are in the limit of very large differences in electronegativity, like sodium chloride, which you can treat as ionic ([Na]+ and [Cl]-).

schatzsc commented 9 months ago

Your example with the terpyridine-platinum unit will also not work, since terpy is not necessarily tridentate - there are also examples where it acts only as a bidentate or even monodentate ligand. My own group has published a crystal structure of the latter:

https://dx.doi.org/10.1039/C9CC04113C