Open dhimmel opened 5 years ago
For background reading:
Heterogeneous Network Edge Prediction: A Data Integration Approach to Prioritize Disease-Associated Genes
Daniel S. Himmelstein, Sergio E. Baranzini
PLOS Computational Biology (2015-07-09) https://doi.org/98q
DOI: 10.1371/journal.pcbi.1004259 · PMID: 26158728 · PMCID: PMC4497619
Systematic integration of biomedical knowledge prioritizes drugs for repurposing
Daniel Scott Himmelstein, Antoine Lizee, Christine Hessler, Leo Brueggeman, Sabrina L Chen, Dexter Hadley, Ari Green, Pouya Khankhanian, Sergio E Baranzini
eLife (2017-09-22) https://doi.org/cdfk
DOI: 10.7554/elife.26726 · PMID: 28936969 · PMCID: PMC5640425
@ben-heil and I are meeting now to discuss potential projects for his rotation in the @greenelab. In the search engine we're building, users select a node pair and we identify paths that occur more frequently than expected by chance. For that search engine, it will be important to identify not just metapaths, but also the specific paths and intermediate nodes that are relevant.
For example, the hetmech-backend database, which is still populated, returns the following most-significant metapaths between the gene FTO and disease obesity:
The challenge is to further decompose these DWPCs into individual path scores. Once that is accomplished and path scores can be compared across metapaths, we can also aggregate scores by intermediate node, as we briefly explored previously in "Decomposing the DWPC to assess intermediate node or edge contributions".
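To make the decomposition concrete, here is a minimal sketch. It relies on the definition from the background papers, where the DWPC for a metapath is the sum over paths of each path's degree product (every degree along the path raised to the negative damping exponent). The function names, the toy nodes `S`, `a`, `b`, `T`, their degrees, and the damping value of 0.5 are all assumptions for illustration and are not taken from hetmech-backend. The sketch works on raw DWPCs for a single metapath and ignores any later scaling or transformation, so it does not address comparing scores across metapaths.

```python
from collections import defaultdict
from math import prod


def path_degree_product(degrees, damping=0.5):
    """Score one path: product of each degree raised to -damping."""
    return prod(degree ** -damping for degree in degrees)


def decompose_dwpc(paths, damping=0.5):
    """Split a DWPC into per-path scores and per-intermediate-node totals.

    paths: list of (nodes, degrees) tuples, where nodes is the ordered node
    sequence of a path and degrees holds the two metaedge-specific degrees
    (source side, target side) for each edge along the path.
    """
    path_scores = [
        (nodes, path_degree_product(degrees, damping)) for nodes, degrees in paths
    ]
    dwpc = sum(score for _, score in path_scores)
    node_scores = defaultdict(float)
    for nodes, score in path_scores:
        # Credit each distinct intermediate node (excluding source and target).
        for node in set(nodes[1:-1]):
            node_scores[node] += score
    return dwpc, path_scores, dict(node_scores)


# Hypothetical toy data: two paths from S to T along one length-2 metapath.
paths = [
    (("S", "a", "T"), (2, 1, 4, 2)),
    (("S", "b", "T"), (2, 1, 1, 2)),
]
dwpc, path_scores, node_scores = decompose_dwpc(paths)
for nodes, score in path_scores:
    print("-".join(nodes), round(score / dwpc, 3))  # fraction of the DWPC
```

Dividing each path score by the DWPC gives that path's share of the total, and the same per-path scores can be summed by intermediate node to rank nodes within a metapath.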
So the main tasks here would seem to be: