Sage-Bionetworks / synapseAnnotations

Sage Bionetworks derived standards for annotating content in Synapse.
MIT License

What are recommended sources for *all drug phases #235

Closed: teslajoy closed 6 years ago

teslajoy commented 6 years ago

This topic originated from #231.

Example: https://pubchem.ncbi.nlm.nih.gov/compound/clozapine

kdaily commented 6 years ago

Come up with no more than 3 potential sources to try in order; if none of them works, stop and put source=Sage.

kdaily commented 6 years ago

Check potential sources in EBI OLS first, then go from there.

allaway commented 6 years ago

Following up on @kdaily:

3 chemical databases I'd recommend:

NIH PubChem: https://pubchem.ncbi.nlm.nih.gov/compound/2818
RSC ChemSpider: http://www.chemspider.com/Chemical-Structure.10442628.html?rid=a247ce03-cc7b-4e99-a46c-7fa932730468
NIST WebBook: http://webbook.nist.gov/cgi/cbook.cgi?Name=clozapine&Units=SI

teslajoy commented 6 years ago

@allaway is the king of drug definitions. 👍 on my behalf.

kdaily commented 6 years ago

Who else uses drugs and wants to comment on that?

kdaily commented 6 years ago

If no one disagrees on the sources or their ordering, this should be documented in the CONTRIBUTING.md file (yet to be created - this could be the first entry!)

allaway commented 6 years ago

One more good resource that is more specifically drug-focused and less chemical-focused is DrugBank: https://www.drugbank.ca/drugs/DB00363

I've always perceived DrugBank to be more closely curated, but as a result it has fewer molecules.

teslajoy commented 6 years ago

For population variant/drug associations: https://www.pharmgkb.org/
Example: https://www.pharmgkb.org/search?connections&gaSearch=clozapine&query=clozapine

Full list of NIH-funded databases: https://epi.grants.cancer.gov/pharm/gen-resources.html

allaway commented 6 years ago

@teslajoy @kdaily circling back on this: I am pulling some drug name descriptions for annotations and finding that PubChem and ChemSpider, while great sources of chemical data, aren't good sources of human-readable descriptions of molecules. My strategy now is to use OLS first (as before), and then use MeSH as a fallback, e.g. https://www.ncbi.nlm.nih.gov/mesh/67585785

I'm finding that OLS works better than expected - about 70-80% of drug queries have an OLS hit that is satisfactory, typically from NCIT.

For molecules that have just been discovered or have been minimally investigated, the only good source is typically the original publication.

kkdang commented 6 years ago

@allaway Please summarize this in the new contributing doc that @kdaily is creating.

allaway commented 6 years ago

Here is the summary. Ready to drop it into the doc template whenever it's ready (@kdaily):

The preferred first-pass strategy for chemical name annotation is to search the EMBL-EBI Ontology Lookup Service (OLS) to find names, descriptions, and sources. Typically, the NCI Thesaurus will provide a suitable description for drugs and other biologically active molecules. In situations where the query molecule is not found in EMBL-EBI OLS, MeSH (https://meshb.nlm.nih.gov/) is a helpful secondary location to find chemical descriptions.

Example:

{
        "value": "DEFACTINIB",
        "description": "An orally bioavailable, small-molecule focal adhesion kinase (FAK) inhibitor with potential antiangiogenic and antineoplastic activities.",
        "source": "http://purl.obolibrary.org/obo/NCIT_C79809"
},
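
The lookup order described above (OLS first, then MeSH) could be sketched roughly as follows. This is only an illustration, not part of the repository: `annotate`, `search_ols`, and `search_mesh` are hypothetical names, and the two search functions are hard-coded stand-ins that do not query the real services.

```python
from typing import Callable, Dict, List, Optional

# An annotation entry shaped like the examples in this thread.
Annotation = Dict[str, str]

# A lookup takes a chemical name and returns an entry, or None on a miss.
Lookup = Callable[[str], Optional[Annotation]]


def annotate(name: str, lookups: List[Lookup]) -> Optional[Annotation]:
    """Try each lookup in priority order and return the first hit."""
    for lookup in lookups:
        entry = lookup(name)
        if entry is not None:
            return entry
    # No hit anywhere: the entry has to be sourced by hand.
    return None


def search_ols(name: str) -> Optional[Annotation]:
    """Hypothetical stand-in for an OLS query (hard-coded data only)."""
    known = {
        "DEFACTINIB": {
            "value": "DEFACTINIB",
            "description": (
                "An orally bioavailable, small-molecule focal adhesion "
                "kinase (FAK) inhibitor with potential antiangiogenic "
                "and antineoplastic activities."
            ),
            "source": "http://purl.obolibrary.org/obo/NCIT_C79809",
        }
    }
    return known.get(name.upper())


def search_mesh(name: str) -> Optional[Annotation]:
    """Hypothetical stand-in for a MeSH query; always a miss in this sketch."""
    return None


# OLS is tried first, MeSH second.
result = annotate("defactinib", [search_ols, search_mesh])
```

Passing the lookups as an ordered list keeps the priority explicit, so adding or reordering sources does not require changing `annotate` itself. A `None` result signals that neither source had a hit and the description needs to come from somewhere else.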

In situations where novel molecules (such as newly-synthesized research compounds or proprietary pharmaceutical molecules) require annotation, the only suitable description and source might be the paper describing the synthesis or discovery, or information from the pharmaceutical company that created the identifier.

Example:

{
        "value": "IPC-12345",
        "description": "A small-molecule target of importance 4 (TOI4) inhibitor with potential antineoplastic activities.",
        "source": "Important Pharma Company"
},
{
        "value": "BestChemist-00913",
        "description": "An investigational small molecule discovered by Best Chemist et al.",
        "source": "PubMed Link Goes Here"
},

allaway commented 6 years ago

It is unclear to me how this should be integrated into the contributing doc. Would like to discuss today @kdaily @sgosline

allaway commented 6 years ago

@kdaily @teslajoy This was finalized and is in master, so I think we probably don't need to discuss tomorrow. 💥