monarch-initiative / monarch-legacy

Monarch web application and API
BSD 3-Clause "New" or "Revised" License

Visualizing uncertainty, ranking, provenance #1377

Open jmcmurry opened 7 years ago

jmcmurry commented 7 years ago

Spoiler alert: This. Is. Hard. But it is also incredibly important, and will make a real difference to our eventually being operationalized in the clinic.

Collectively, in bioinformatics we need to do a much better job of being transparent about uncertainty. This is especially difficult for us in Monarch because we are rolling up from data sources and data points whose uncertainty is not characterized at all. This is just a high-level ticket to get the conversation rolling. The most obvious places where we in Monarch most need to do uncertainty modeling are any results backed by:

1) owlsim/phenodigm/exomiser
2) text mining
3) disease groupings (?)

There's also the whole other world of factoring in the reproducibility of any given result we rely on to train our models (e.g., the difference between a gene-to-phenotype association that was published 5 times and one that was published once, 15 years ago, in a paper that no one has referenced since).
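
To make the reproducibility point concrete, here is a toy sketch of how such a weighting could look. Everything in it is an assumption made up for illustration: the `Publication` fields, the 10-year half-life, and the rule that a still-cited paper does not decay. It is not how owlsim, phenodigm, exomiser, or Monarch score anything.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """Hypothetical record of one paper supporting an association."""
    year: int
    recent_citations: int  # assumed field: citations in the last few years


def publication_weight(pub, current_year, half_life_years=10.0):
    """Weight of a single supporting paper.

    Assumption: a paper that is still being cited keeps full weight;
    otherwise its weight halves every `half_life_years`.
    """
    if pub.recent_citations > 0:
        return 1.0
    age = max(0, current_year - pub.year)
    return 0.5 ** (age / half_life_years)


def evidence_score(pubs, current_year, half_life_years=10.0):
    """Sum of per-paper weights: more (and fresher) support scores higher."""
    return sum(publication_weight(p, current_year, half_life_years) for p in pubs)


# An association published 5 times, all still cited ...
well_replicated = [Publication(year=y, recent_citations=3) for y in range(2012, 2017)]
# ... versus one published once, 15 years ago, and never referenced since.
single_old = [Publication(year=2001, recent_citations=0)]

print(evidence_score(well_replicated, 2016))       # 5.0
print(round(evidence_score(single_old, 2016), 3))  # 0.354
```

A score like this is only a ranking heuristic, not a probability; its one virtue is that the inputs (which papers, how old, still cited or not) can be shown to the user alongside the number.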

No manifesto here, just wanted to get it written down before I forget. This post brought to you by election season where the visuals are nice, even if the debates aren't.

[Image: screen shot 2016-10-20 at 1 31 55 pm]

harryhoch commented 7 years ago

+1. Uncertainty is a hard problem everywhere. Provenance and transparency of methods are definitely key factors in this question. I can provide pointers on visualization of uncertainty if you're interested, and would be happy to discuss...

but yes, it's hard.