mSorok / NaturalProductsOnline

Website code for COCONUT
https://coconut.naturalproducts.net/

Flavonoid structures #114

Open araskind opened 1 year ago

araskind commented 1 year ago

Hi,

I am compiling a database of biological compounds from different sources, and I came across an issue in COCONUT with a number of flavonoid structures. One example is CNP0438752. In COCONUT its SMILES string is OC1=CC=C(C=C1O)C2[OH+]C=3C(O)=C(O)C=CC3C=C2O and its formula is [C15H13O6]+. The COCONUT record includes a reference to KNApSAcK (http://www.knapsackfamily.com/knapsack_core/information.php?mode=r&word=C00020650), where the same compound is represented by the SMILES c1(ccc2c(c1O)[o+]c(c(c2)O)c1cc(c(cc1)O)O)O and the formula C15H11O6. The PubChem record for melacacinidin (https://pubchem.ncbi.nlm.nih.gov/compound/85930794) also has the formula [C15H11O6+] and the SMILES C1=CC(=C(C=C1C2=C(C=C3C=CC(=C(C3=[O+]2)O)O)O)O)O. I think the problem is in the representation of the flavonoid core structure in COCONUT, which contains -[OH+]- in the ring instead of -[O+]= (as in the other databases); this leads to two extra hydrogen atoms.

This is a single example, but the same issue is present in dozens if not hundreds of flavonoid entries.
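
For anyone who wants to reproduce the comparison above, here is a minimal sketch using only the Python standard library. It diffs two molecular formulas and flags SMILES strings that contain the `[OH+]` ring atom described in the report. The helper names (`parse_formula`, `formula_difference`, `has_protonated_oxonium`) are hypothetical and are not part of COCONUT or any cheminformatics toolkit. The substring check is a crude assumption: it does not parse SMILES, so a proper toolkit would be more robust for a real cleanup.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a flat formula such as '[C15H13O6]+'.

    Brackets and charge signs are ignored; nested groups such as
    'Ca(OH)2' are not supported by this sketch.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def formula_difference(a: str, b: str) -> dict:
    """Return the per-element surplus of formula a over formula b.

    Elements with identical counts are omitted from the result.
    """
    ca, cb = parse_formula(a), parse_formula(b)
    return {
        el: ca[el] - cb[el]
        for el in sorted(set(ca) | set(cb))
        if ca[el] != cb[el]
    }


def has_protonated_oxonium(smiles: str) -> bool:
    """Flag SMILES text containing a protonated, charged oxygen '[OH+]'.

    This is a plain substring test (an assumption of this sketch),
    not a structural match.
    """
    return "[OH+]" in smiles


# Values copied from the report above.
coconut_smiles = "OC1=CC=C(C=C1O)C2[OH+]C=3C(O)=C(O)C=CC3C=C2O"
pubchem_smiles = "C1=CC(=C(C=C1C2=C(C=C3C=CC(=C(C3=[O+]2)O)O)O)O)O"

print(formula_difference("[C15H13O6]+", "C15H11O6"))  # {'H': 2}
print(has_protonated_oxonium(coconut_smiles))         # True
print(has_protonated_oxonium(pubchem_smiles))         # False
```

Running the same two checks over all flavonoid entries would give a rough list of candidate records to review by hand.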

steinbeck commented 1 year ago

Thanks for the helpful report! We have already carried out a cleanup of the COCONUT data and will release it soon, together with a brand-new user interface. We will now check whether your issue was resolved by the data cleanup; if not, we will make sure it gets fixed. Thanks again! Reports like this make COCONUT a lot better and are very much appreciated.