NCATS-Tangerine / translator-knowledge-graph

Prototype of Translator-wide shared central knowledge graph. (Obsolete as of 2019)
MIT License

ingest FAERS for hackathon for KS #4

Open realmarcin opened 6 years ago

mbrush commented 6 years ago

Did we want to ingest the raw, "observation-level" FAERS data, or the analyzed associations derived from this data in the AEOLUS dataset? See https://github.com/NCATS-Tangerine/ncats-ingest/issues/12.

AEOLUS Publication: http://www.nature.com/articles/sdata201626
Dryad data download: http://datadryad.org/resource/doi:10.5061/dryad.8q0s4

andrewsu commented 6 years ago

Also just a note that AEOLUS is available in mychem.info. For example:

AEOLUS data for tetrazepam: http://mychem.info/v1/chem/IQWYAQCHYZHJOS-UHFFFAOYSA-N?fields=aeolus
All compounds with AEOLUS data: http://mychem.info/v1/query?q=_exists_:aeolus
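For anyone scripting against these endpoints, here is a minimal sketch of how the two URLs above could be built programmatically. The helper names (`aeolus_record_url`, `aeolus_all_url`, `fetch_json`) are made up for illustration; only the URL patterns come from the links in this comment, and nothing is assumed about the shape of the JSON that comes back.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Base URL taken from the example links in this thread.
BASE = "http://mychem.info/v1"


def aeolus_record_url(inchikey: str) -> str:
    """URL for the AEOLUS field of a single compound, keyed by InChIKey."""
    return f"{BASE}/chem/{quote(inchikey)}?{urlencode({'fields': 'aeolus'})}"


def aeolus_all_url() -> str:
    """URL for the query listing all compounds that carry AEOLUS data."""
    # safe=":" keeps the colon literal, matching the URL as written above.
    return f"{BASE}/query?{urlencode({'q': '_exists_:aeolus'}, safe=':')}"


def fetch_json(url: str):
    """Fetch a URL and parse the body as JSON (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URLs; call fetch_json(...) on them to retrieve data.
    print(aeolus_record_url("IQWYAQCHYZHJOS-UHFFFAOYSA-N"))
    print(aeolus_all_url())
```

Building the URLs separately from fetching keeps the string handling checkable offline.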

cmungall commented 6 years ago

+1 to mychem

Should we normalize MedDRA to HP and MONDO ahead of time, or after ingest into the graph, using the SciGraph clique merge?

Have we decided on primary IDs for chemicals in the TKG? I suggest following mychem: treat the InChIKey as the primary key and merge everything to that.
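To make the InChIKey suggestion concrete, here is a rough sketch of what "merge everything to that" could look like. It is only an illustration, not how mychem or the TKG actually does it: the record layout (`inchikey`, `xrefs`, `source`) and the `EX:` identifiers are hypothetical placeholders. The one real value is the tetrazepam InChIKey from the link earlier in the thread.

```python
from collections import defaultdict


def merge_by_inchikey(records):
    """Group chemical records under their InChIKey.

    Returns (merged, unkeyed): `merged` maps each InChIKey to the union of
    xrefs and sources seen for it; `unkeyed` holds records that have no
    InChIKey, so they can be reviewed rather than silently dropped.
    """
    merged = defaultdict(lambda: {"xrefs": set(), "sources": set()})
    unkeyed = []
    for rec in records:
        key = rec.get("inchikey")
        if not key:
            unkeyed.append(rec)
            continue
        merged[key]["xrefs"].update(rec.get("xrefs", []))
        merged[key]["sources"].add(rec["source"])
    return dict(merged), unkeyed


# Hypothetical input: two sources describing the same compound, plus one
# record that lacks an InChIKey. The EX: identifiers are placeholders.
records = [
    {"inchikey": "IQWYAQCHYZHJOS-UHFFFAOYSA-N", "xrefs": ["EX:0001"], "source": "source_a"},
    {"inchikey": "IQWYAQCHYZHJOS-UHFFFAOYSA-N", "xrefs": ["EX:0002"], "source": "source_b"},
    {"inchikey": None, "xrefs": ["EX:0003"], "source": "source_b"},
]

merged, unkeyed = merge_by_inchikey(records)
```

Records without an InChIKey are kept aside on purpose, since deciding what to do with them is a separate question.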

cmungall commented 6 years ago

@andrewsu @stuppie looks like the source is CC-0 https://datadryad.org//resource/doi:10.5061/dryad.8q0s4 - is this going into WD? That's an easier target since we'll have a generic WD import (https://github.com/NCATS-Tangerine/kgx/issues/6) and WD will do a lot of the normalization for us

realmarcin commented 6 years ago

I really can speak only to the original intent for the ticket, but the idea was to have access to FDA 'adverse' event data. There was an intuition that this includes both positive and negative events, and positive events could be used to mine for potential novel or repurposed drug effects. Just looking at the AEOLUS paper abstract it seems that data is more directly useful, and it sounds like the raw FAERS data would require some NLP/text mining etc.

stuppie commented 6 years ago

The issue with this going into WD is that it is very messy and would need work. See here for a description of the messiness: https://github.com/biothings/mychem.info/issues/6. Wikidatans will probably complain if we assert that "Depression" is an indication for imatinib, etc.