Closed egonw closed 1 year ago
@egonw This question could equally be of interest to the public InChI mailing list (inchi-discuss@lists.sourceforge.net), since winchi-1.06 (the Windows GUI on top of the current InChI 1.06, as of [2020-12-19 Sun]) can display this:
while running ./inchi-1
on the InChI Trust's reference executable for Linux does not reveal an obvious toggle/flag to produce such an illustration in either .svg or .png format.
Source for winchi: the download section of the InChI Trust, entry INCHI-1-BIN.zip – Software binaries. Perhaps INCHI-1-SRC.zip – InChI Software source codes makes it easier to build a similar GUI for other platforms.
Ah... you know I don't like InChI, Egon :-)
Maybe as an optional build; currently it is quite lightweight. InChI makes things much larger, and I don't think InChI numbers are useful.
Maybe some additional context on how InChI numbering could be useful will help:
When you have a structure with an undefined stereocenter, but you are tired and cannot spot which one it is, converting it to InChI gives you the atom number (according to InChI) so then coming back to the depiction with the same numbering could be useful?
Or is there already something allowing this in the CDK?
How are you inputting the structure? Normally the sketcher will show you this.
Not sure I entirely understand what you mean.
Starting from a SMILES, let it be
CCC(C)C(=O)OC[C@@]12[C@@H](O)[C@H](C[C@@H](C)[C@]11OC(C)(C)[C@@H]([C@H]1OC(C)=O)[C@H](OC(=O)C1=CN(C)C(=O)C=C1)C2=O)OC(=O)C1=CN(C)C(=O)C=C1
how would you highlight the undefined stereocenter for a half-sleeping chemist?
(not sure I wanna pollute the original issue of Egon with this question, tell me if worth opening another one)
I'll see if I can work out some Java/Groovy code first to resemble the above functionality.
How would a sleepy chemist pair the InChI numbers back to the SMILES string from a depiction?
It would be easy for us to mark constitutional stereocenters in depict with a (?), and that does not need InChI. The hard part is interdependent stereo, but since that's not possible in general we can support the simple case. Please open another issue, "mark missing stereo", for that.
Taking my original
CCC(C)C(=O)OC[C@@]12[C@@H](O)[C@H](C[C@@H](C)[C@]11OC(C)(C)[C@@H]([C@H]1OC(C)=O)[C@H](OC(=O)C1=CN(C)C(=O)C=C1)C2=O)OC(=O)C1=CN(C)C(=O)C=C1
converting it to InChI leads to
InChI=1S/C36H44N2O13/c1-9-18(2)31(44)47-17-35-28(42)23(49-32(45)21-10-12-24(40)37(7)15-21)14-19(3)36(35)30(48-20(4)39)26(34(5,6)51-36)27(29(35)43)50-33(46)22-11-13-25(41)38(8)16-22/h10-13,15-16,18-19,23,26-28,30,42H,9,14,17H2,1-8H3/t18?,19-,23+,26-,27+,28+,30-,35+,36-/m1/s1
where we can see 18?, so we know it is atom 18 according to InChI. We could then trace it back to the depiction if atoms were InChI-numbered? Anyway, opened #54
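Reading the "18?" out of the string by eye can also be done mechanically. The following is only a rough sketch in plain Java (no CDK, no InChI library): it scans the /t layer for atom numbers followed by "?". The class and method names are made up for illustration, and it ignores complications such as a repeated /t layer after /f in non-standard InChIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: lists atom numbers flagged '?' in the /t layer of an InChI. */
class UndefinedStereoFinder {

    // An atom number immediately followed by '?', e.g. "18?"
    private static final Pattern UNDEFINED = Pattern.compile("(\\d+)\\?");

    static List<Integer> undefinedCenters(String inchi) {
        List<Integer> result = new ArrayList<>();
        for (String layer : inchi.split("/")) {
            // The tetrahedral stereo layer is the one starting with 't'
            if (!layer.startsWith("t")) {
                continue;
            }
            Matcher m = UNDEFINED.matcher(layer);
            while (m.find()) {
                result.add(Integer.parseInt(m.group(1)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String inchi = "InChI=1S/C36H44N2O13/c1-9-18(2)31(44)47-17-35-28(42)23(49-32(45)21-10-12-24(40)37(7)15-21)"
                + "14-19(3)36(35)30(48-20(4)39)26(34(5,6)51-36)27(29(35)43)50-33(46)22-11-13-25(41)38(8)16-22"
                + "/h10-13,15-16,18-19,23,26-28,30,42H,9,14,17H2,1-8H3"
                + "/t18?,19-,23+,26-,27+,28+,30-,35+,36-/m1/s1";
        System.out.println(undefinedCenters(inchi)); // prints [18]
    }
}
```

This only yields the InChI atom number; pairing it with an atom in the drawing still needs the InChI numbering shown on the depiction, which is what this issue asks for.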
@Adafede, John and I discussed the recognition of missing stereo, and the conclusion was to use the CDK for that.
OK so now which Atom in your SMILES string is 18? I feel this is a case of "the XY problem" :-)
John, if you number the atoms in the depiction with the InChI atom numbers, that would be clear, wouldn't it?
I was about to answer something similar. I do not want to see which atom in my SMILES string is undefined, but which carbon it is in my depiction. With #54 this gets even more straightforward, and I would indeed not need InChI. Probably still relevant for other applications?
Fixed
Sorry not fixed... issue was conflated.
Closing - won't fix.
My reservations are:
Here is the trivial code to do it via the CDK:
// 'molecule' is the IAtomContainer to depict; numbers[i] is the InChI number of atom i
long[] numbers = InChINumbersTools.getNumbers(molecule);
for (IAtom atom : molecule.atoms()) {
    atom.setProperty(StandardGenerator.ANNOTATION_LABEL,
                     Long.toString(numbers[atom.getIndex()]));
}
long[] numbers = InChINumbersTools.getNumbers(mol);
for (IAtom atom : mol.atoms()) {
    // atom.setProperty(CDKConstants.ATOM_ATOM_MAPPING, (int) numbers[atom.getIndex()]);
    atom.setProperty(CDKConstants.COMMENT, (int) numbers[atom.getIndex()]);
}
// Flavors are bit flags, so combine them with '|'; CxAtomValue writes the COMMENT
// property as the |$_AV:...$| atom values seen in the output
SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Default | SmiFlavor.AtomAtomMap | SmiFlavor.CxAtomValue);
System.out.println(smigen.create(mol));
CN1C=NC2=C1C(=O)N(C(=O)N2C)C |$_AV:1;10;4;9;6;5;7;13;12;8;14;11;2;3$|
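To answer "which atom in the SMILES is number N" from output like the above, the atom values can be inverted. A minimal sketch in plain Java, assuming the values in the _AV block are listed in the order the atoms appear in the SMILES string; the class and method names are invented for this example and are not part of the CDK.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: maps InChI number -> 1-based atom position in the SMILES. */
class AtomValueReader {

    static Map<Integer, Integer> inchiToSmilesPosition(String cxsmiles) {
        String marker = "$_AV:";
        int start = cxsmiles.indexOf(marker);
        if (start < 0) {
            throw new IllegalArgumentException("no _AV block found");
        }
        int end = cxsmiles.indexOf('$', start + marker.length());
        if (end < 0) {
            throw new IllegalArgumentException("unterminated _AV block");
        }
        // Keep empty trailing fields (limit -1) so positions stay aligned
        String[] values = cxsmiles.substring(start + marker.length(), end).split(";", -1);
        Map<Integer, Integer> positions = new TreeMap<>();
        for (int i = 0; i < values.length; i++) {
            if (!values[i].isEmpty()) {
                positions.put(Integer.parseInt(values[i]), i + 1);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String cxsmiles = "CN1C=NC2=C1C(=O)N(C(=O)N2C)C |$_AV:1;10;4;9;6;5;7;13;12;8;14;11;2;3$|";
        Map<Integer, Integer> positions = inchiToSmilesPosition(cxsmiles);
        System.out.println(positions.get(2)); // prints 13: InChI atom 2 is the 13th atom in the SMILES
    }
}
```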
Perhaps a compromise is that we add a method to the CDK library depiction generator? The cdk-inchi module would then be an optional dependency?
new DepictionGenerator().withAtomNumbers().depict(mol); // current
new DepictionGenerator().withAtomValues().depict(mol); // current
new DepictionGenerator().withInChINumbers().depict(mol); // addition (requires cdk-inchi)
I would welcome an option to number the atoms according to the numbering in the InChI. Can I request that, please?