Open nbehrnd opened 4 years ago
The tautomer rank is an "energy based" rank (in eV) which tries to estimate the relative energy difference between tautomers (without heavy quantum mechanics calculations). The lower the energy rank, the "better". Negative rank values are allowed, and these are actually the most stable tautomers according to our energy estimation rules.
You are correct; my recall of section 4.6, Tautomer Ranking, in Mol. Inf. 2013, 32, 481-504 was incomplete when I posted the question about the negative values. The publication clearly states «The tautomer with the lowest rank is expected to be the most stable one. [...] The more stable state has always score 0.0 eV and the alternative one is with a higher energy. Additionally every atom which is part of an aromatic system gets an aromatic correction coefficient C_arom = - 0.1 eV.» (pp. 491-492, loc. cit.), right next to figure 13 with an example of rank 0, 0.037 and then -0.432 for tautomers of methimazole.
So this is only a comment suggesting the consistent use of three to five decimals for the rank, in lieu of sometimes three and sometimes a dozen. The latter, an almost «Fortranesque» accuracy of 12 decimals, is probably rarely needed here. Thank you.
I'm starting to use ambit-tautomer (ambit-tautomers-2.0.0-SNAPSHOT.jar, downloaded today). With
O=C(c1c2cccc1)N([C@H](CCC(N1)=O)C1=O)C2=O
provided in a .smi
file, I run java -jar ambit-tautomers-2.0.0-SNAPSHOT.jar -f thalomid.smi -o thalo.sdf
and read the results in DataWarrior. While browsing across the table, I noticed an inconsistent formatting of the rank order:
and
While I'm surprised about the occurrence of negative rank orders, my first intent is to suggest a consistent formatting of this type of result; if it is a floating-point number, perhaps consistently to five decimals and, if necessary, with the addition of padding zeroes.
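To illustrate the kind of output meant here, a minimal sketch in Python (not ambit's own Java code) follows. The helper name format_rank and the sample values are hypothetical and chosen only for illustration; they are not taken from the attached files.

```python
def format_rank(rank: float, decimals: int = 5) -> str:
    """Return the rank as fixed-point text with a constant number of decimals."""
    text = f"{rank:.{decimals}f}"
    # Tiny negative values would otherwise print as "-0.00000"; normalize them.
    if float(text) == 0.0:
        text = f"{0.0:.{decimals}f}"
    return text


# Illustrative values only: short, negative, and a 12-decimal rank.
for value in (0.0, 0.037, -0.432, 0.123456789012):
    print(format_rank(value))
```

Every rank is then written with the same number of decimals, trailing zeroes included, so the column reads uniformly in a table view.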
To ease replication of the observation, I add both the start file and the resulting file; both carry an appended .txt extension to pass GitHub's settings. thalomid.smi.txt thalo.sdf.txt