IUPAC-InChI / InChI

Main InChI repository
MIT License

Atomic isotopes need to be fixed #51

Closed stuchalk closed 1 week ago

stuchalk commented 2 weeks ago

This issue is posted based on a request to the submitter (Stuart Chalk) from the BIPM. The BIPM's use case is to unambiguously identify isotopes of atoms that are used in the definition of standards for time and frequency (see https://www.bipm.org/en/publications/mises-en-pratique/standard-frequencies).

The current (and historical) InChI creation of single atoms of common elements is to convert them to the fully protonated form, for example:

whereas for isotopes of other elements InChI gives what would be expected for a single atom.

Please consider this a request to update the InChI code to improve the consistency of InChI generation.

gblanke02 commented 1 week ago

Carbane14, taken from PubChem, is 14CH4, resulting in InChI=1S/CH4/h1H4/i1+2 (InChI Web Demo), or as a molfile:

Ketcher 9 42417212D 1 1.00000 0.00000 0

  1  0  0  0  0  0  0  0  0  0999 V2000
    2.4500   -2.4000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  ISO  1   1  14
M  END

If you want to identify carbon-14, you have to use elemental carbon atoms, as in the following molfile, where the valence of the C atom is set to 0:

ACCLDraw09042417242D

  1  0  0  0  0  0  0  0  0  0999 V2000
    4.8438   -5.7188    0.0000 C   2  0  3  0  0 15  0  0  0  0  0  0
M  ISO  1   1  14
M  END

The InChI Web Demo delivers InChI=1S/C/i1+2 as requested.
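For anyone who wants to generate such an "elemental" molfile without a drawing tool, here is a minimal Python sketch. It assumes the usual V2000 fixed-column layout, in which the valence field (vvv) of the atom line holds 15 to mean "zero valence". The function name `single_atom_molfile` is hypothetical. The three header lines are left blank, the coordinates are set to zero, and the isotope is given only through the `M  ISO` line, so the output is not byte-identical to the Biovia/Draw export above.

```python
def single_atom_molfile(symbol: str, mass_number: int, zero_valence: bool = True) -> str:
    """Build a minimal V2000 molfile for one isolated atom (illustrative sketch)."""
    # vvv field of the atom line: 15 means "zero valence", 0 means "no marking".
    valence = 15 if zero_valence else 0
    # Fields after the mass-difference field: ccc sss hhh bbb vvv HHH rrr iii mmm nnn eee
    fields = [0, 0, 0, 0, valence, 0, 0, 0, 0, 0, 0]
    atom_line = (
        f"{0.0:10.4f}{0.0:10.4f}{0.0:10.4f} {symbol:<3}{0:2d}"
        + "".join(f"{value:3d}" for value in fields)
    )
    lines = [
        "",  # title line (left blank)
        "",  # program/timestamp line (left blank)
        "",  # comment line (left blank)
        f"{1:3d}{0:3d}" + "  0" * 8 + "999 V2000",  # counts line: 1 atom, 0 bonds
        atom_line,
        f"M  ISO{1:3d}{1:4d}{mass_number:4d}",  # one entry: atom 1, absolute mass
        "M  END",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(single_atom_molfile("C", 14))
```

Passing `zero_valence=False` gives the unmarked form, which corresponds to the case that InChI reads as 14CH4.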

The same applies to methane and carbene.

About germanium: as with the C compounds discussed above, you have to differentiate between GeH4 and elemental Ge, which give InChI=1S/GeH4/h1H4/i1+12 for 85GeH4 and InChI=1S/Ge/i1+12 for 85Ge. In PubChem you find the second representation type (unlike the first version, which is used for the C atoms).
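The +2 and +12 in the isotopic layers above can be reproduced with a small Python sketch. It assumes the convention, as I understand it and consistent with both examples, that the shift is the mass number minus the rounded average atomic mass of the element. The `AVERAGE_MASS` table and the function name `isotopic_layer` are hypothetical, and the masses are approximate values entered by hand.

```python
# Approximate average atomic masses; assumed values for illustration only.
AVERAGE_MASS = {"C": 12.011, "Ge": 72.630}


def isotopic_layer(symbol: str, mass_number: int, atom_number: int = 1) -> str:
    """Return an InChI-style isotopic layer fragment such as 'i1+2'."""
    # Shift relative to the rounded average atomic mass (assumed convention).
    shift = mass_number - round(AVERAGE_MASS[symbol])
    return f"i{atom_number}{shift:+d}"


if __name__ == "__main__":
    print(isotopic_layer("C", 14))   # carbon-14
    print(isotopic_layer("Ge", 85))  # germanium-85
```

The fragment is the same for 85GeH4 and for elemental 85Ge. Only the formula and hydrogen layers differ, which is why the valence setting in the molfile decides which InChI comes out.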

According to these results, it looks like InChI does its job if the correct molfile is entered.

Tools used: InChI-Web-Demo: https://iupac-inchi.github.io/InChI-Web-Demo/ with InChI 1.07.1 on 4-September-2024, and Biovia/Draw (2017) to build the elemental representations for C and Ge. (Ketcher, the chemical editor used in the InChI-Web-Demo, delivers an incomplete molfile for the "elemental case". Therefore Biovia/Draw was used to build the molfile, which was then pasted into the molfile calculation section of the InChI-Web-Demo.)

stuchalk commented 1 week ago

Thanks for the quick response to this issue. This makes complete sense to me that it is a molfile issue. Let me circle back to PubChem with this information...

gblanke02 commented 1 week ago

Because this is an issue with the creation of the molfile that InChI digests, it is under the control of the molfile creator. In the examples above, it depends on PubChem's rules. We also recognized that the Ketcher editor used in the InChI Web Demo is not able to create the appropriate molfiles either. Instead, we used Biovia/Draw 2017 and copied the molfile into the "molfile section" of the InChI Web Demo.

flange-ipb commented 1 week ago

We also recognized that the Ketcher editor used in the InChI Web Demo is not able to create the appropriate molfiles either.

Unfortunately, the Ketcher structure editor we use in the InChI Web Demo has a bug in its Molfile export function where the valence 0 of atoms is not taken into account. I already submitted a bug report.
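Until the export is fixed, a quick check can tell whether a molfile actually carries the zero-valence marking before it is passed to InChI. The following Python sketch assumes the V2000 fixed-column layout (valence field in columns 49 to 51 of the atom line, with 15 meaning "zero valence"). The function name `atoms_missing_zero_valence` is hypothetical, and the two sample molfiles are retyped from this thread with shortened program lines.

```python
def atoms_missing_zero_valence(molfile: str) -> list:
    """List (atom number, symbol) for atoms whose valence field is not 15."""
    lines = molfile.splitlines()
    n_atoms = int(lines[3][0:3])  # first field of the counts line
    flagged = []
    for number, line in enumerate(lines[4:4 + n_atoms], start=1):
        symbol = line[31:34].strip()
        valence_field = line[48:51].strip()
        if valence_field != "15":
            flagged.append((number, symbol))
    return flagged


# Sample data retyped from the thread (program lines shortened).
KETCHER_EXPORT = "\n".join([
    "",
    "  Ketcher",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    2.4500   -2.4000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  ISO  1   1  14",
    "M  END",
])

BIOVIA_EXPORT = "\n".join([
    "",
    "  ACCLDraw",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    4.8438   -5.7188    0.0000 C   2  0  3  0  0 15  0  0  0  0  0  0",
    "M  ISO  1   1  14",
    "M  END",
])

if __name__ == "__main__":
    print(atoms_missing_zero_valence(KETCHER_EXPORT))
    print(atoms_missing_zero_valence(BIOVIA_EXPORT))
```

This check only makes sense for molfiles that are meant to describe bare atoms. In an ordinary molecule most atoms legitimately have no valence marking.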