monarch-initiative / monarch-legacy

Monarch web application and API
BSD 3-Clause "New" or "Revised" License

use annotation sufficiency scoring to improve HPO annotations to human diseases #301

Closed mellybelly closed 8 years ago

mellybelly commented 10 years ago

It would be great if we can generally improve our disease annotations where we know they don't have the best annotation sufficiency scores. @nlwashington can we generate a report?

pnrobinson commented 10 years ago

Hi everybody,

I think it would be great to think a little harder about how we can improve the annotation process. Especially as we (want to) move into common diseases as well, I think we will need to be a lot more sophisticated about what we can extract from PubMed and similar sources. I would envisage software that would basically let you drop in a bunch of texts (minimum: a set of PubMed abstracts, maximum: PDF articles), and would mark up the texts, do some statistics, and let a human being postprocess the results. I have been speaking with Mike Brudno about this topic, and there is some, probably small, chance we might get a bit of funding. But I think that this is actually turning into a really interesting research/engineering-research topic, perhaps even something I might apply for a German grant for (I would much prefer to do this with the Monarch group, since most people in this crowd know more than I do about this sort of thing... :-0). There is a group in Spain that is doing really cool stuff; I have known the PI for 5 years and would also ask her about this.
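The workflow described above (mark up the texts, do some statistics, hand the results to a curator) could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the lexicon is a hand-picked toy subset of HPO labels, the function names `mark_up` and `term_counts` are hypothetical, and plain string matching stands in for whatever text-mining method would actually be used.

```python
import re
from collections import Counter

# Toy lexicon (assumption): a real run would load all labels and synonyms
# from an HPO release instead of this hand-picked subset.
LEXICON = {
    "seizure": "HP:0001250",
    "seizures": "HP:0001250",
    "cataract": "HP:0000518",
    "arachnodactyly": "HP:0001166",
}

# Longest labels first so that longer phrases win over shorter ones.
_labels = sorted(LEXICON, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(label) for label in _labels) + r")\b",
    re.IGNORECASE,
)


def mark_up(text):
    """Wrap each matched label as [label|HPO id] so a curator can review it."""
    return PATTERN.sub(
        lambda m: f"[{m.group(0)}|{LEXICON[m.group(0).lower()]}]", text
    )


def term_counts(texts):
    """Count, per HPO id, how many texts mention it at least once."""
    counts = Counter()
    for text in texts:
        found = {LEXICON[m.group(0).lower()] for m in PATTERN.finditer(text)}
        counts.update(found)
    return counts


abstracts = [
    "Patients presented with seizures and cataract.",
    "Arachnodactyly was noted; no seizure was reported.",
]
for abstract in abstracts:
    print(mark_up(abstract))
print(term_counts(abstracts).most_common())
```

Note that the second abstract negates the seizure mention, yet the naive matcher still counts it, which is exactly the kind of case the human postprocessing step would have to catch.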

To start off, it might be good to try a few examples by hand, i.e., how much better can we get the score if we spend an entire afternoon doing biocuration? That would presumably be the upper limit on the improvement we could get computationally.

-Peter


jmcmurry commented 8 years ago

I'm not sure what the status of the application was a year and a half ago; however, we have now incorporated annotation sufficiency, so I believe this issue can be closed. Please confirm.

nlwashington commented 8 years ago

While we have not really used the scoring to assist with curation, we could. But this belongs more in the longer-term planning for Monarch than in our app.