Closed: baoilleach closed this issue 9 years ago
It seems that the SMILES does not represent the aromaticity correctly and cannot be parsed, which is why the molecule does not appear. It should be in the list of errors available from http://www.cheminfo.org/wikipedia. We don't know why so many SMILES use lowercase atom symbols to describe aromaticity, which always causes trouble. To solve the problem, you just need to supply SMILES with localized double bonds. The update is done nightly.
Gotcha - the Trp indole nitrogen is written as n instead of [nH] in each of these cases. I'll fix them as I find them.
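For anyone else hunting these down, here is a rough sketch of how the candidates could be flagged before submitting a fix. It is plain Python with no cheminformatics toolkit, so it does not validate anything: it only counts unbracketed aromatic `n` atoms, which are the ones that may be missing their hydrogen. A bare `n` is perfectly valid in pyridine-like rings, so a hit means "check this by hand", not "this is wrong". The function names and the indole test strings are my own illustration, not anything from the cheminfo code.

```python
import re

# Bracket atoms such as [nH], [Na+] or [13cH].
BRACKET_ATOM = re.compile(r"\[[^\]]*\]")
# A bracket atom whose element symbol is lowercase, i.e. aromatic.
AROMATIC_BRACKET = re.compile(r"\[\d*(?:se|as|[bcnops])")


def strip_brackets(smiles: str) -> str:
    """Remove every bracket atom, leaving only organic-subset atoms and bonds."""
    return BRACKET_ATOM.sub("", smiles)


def count_bare_aromatic_n(smiles: str) -> int:
    """Count aromatic nitrogens written as a bare 'n' (no explicit hydrogen).

    Outside brackets, a lowercase 'n' can only be an aromatic nitrogen,
    so a simple character count is enough for this heuristic.
    """
    return strip_brackets(smiles).count("n")


def uses_aromatic_notation(smiles: str) -> bool:
    """Return True if the SMILES uses lowercase (aromatic) atom symbols."""
    if AROMATIC_BRACKET.search(smiles):
        return True
    # 'l' in Cl and 'r' in Br are not in this set, so halogens are not flagged.
    return any(ch in "bcnops" for ch in strip_brackets(smiles))


if __name__ == "__main__":
    examples = {
        "indole, n without H (the problem case)": "c1ccc2c(c1)ccn2",
        "indole, aromatic with [nH]": "c1ccc2c(c1)cc[nH]2",
        "indole, localized double bonds": "C1=CC=C2C(=C1)C=CN2",
    }
    for label, smi in examples.items():
        print(label, smi, count_bare_aromatic_n(smi), uses_aromatic_notation(smi))
```

The third form, with localized double bonds, avoids the question entirely because the hydrogen on the nitrogen is implied by ordinary valence rules.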
Great! Thanks!
As mentioned in our article in J. Cheminformatics, the "pyrrole nitrogen" problem was clearly the most common error in Wikipedia SMILES, occurring more than 350 times. Many of these errors have been fixed by the project team, but many still remain. Best, Peter
Yes - I should have read the paper properly. I've since discovered the source of those SMILES - maybe we can discuss offline.
I've been cross-checking some data I've extracted from Wikipedia versus the dump file in your github repo and noticed some discrepancies.
For example, SMILES data from the drugbox on https://en.wikipedia.org/wiki/Lanreotide or the chembox on https://en.wikipedia.org/wiki/Hemorphin-4 is not included in the dump file.
Is there some reason for this or is it a bug?