phenopackets / phenopacket-format


Suggestion for output format #21

Open pnrobinson opened 8 years ago

pnrobinson commented 8 years ago

Hi Tudor, as we are discussing right now, here are some suggestions:

  1. Allow users either to upload a Word document or to paste text into a window.
  2. After the initial text mining, a paper may describe two or more patients, so it would be useful to find a way for users to assign mined HPO terms to individual patients. One simple option is to let a user mark the part of the text that pertains to a single patient. Alternatively, let users enter the names of all patients described in an article and have the GUI present a table like this:

     | Term | patient ID 1 | patient ID 2 | patient ID 3 |
     |------|--------------|--------------|--------------|
     | HP1  | x            |              | x            |
     | HP2  | x            | x            | x            |
     | HP3  |              |              | x            |

etc

Each HP row shows the ID and preferred name of one of the mined terms. It should be possible to delete an entire row if the HPO term was a false positive.
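To make the suggestion concrete, here is a minimal Python sketch of the data model behind such a grid. It is only an illustration, not an existing implementation: the class `TermPatientGrid` and its methods are hypothetical names, and `HP1` to `HP3` are the placeholder labels from the table above rather than real HPO identifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TermPatientGrid:
    """Hypothetical checkbox grid: rows are mined terms, columns are patients."""

    patients: List[str]
    # Maps a term label (ID plus preferred name) to the patients it is ticked for.
    rows: Dict[str, Set[str]] = field(default_factory=dict)

    def add_term(self, term: str) -> None:
        """Add an empty row for a mined term (no-op if the row already exists)."""
        self.rows.setdefault(term, set())

    def toggle(self, term: str, patient: str) -> None:
        """Tick or untick the checkbox for one term/patient cell."""
        if patient not in self.patients:
            raise KeyError(patient)
        self.rows[term] ^= {patient}

    def delete_term(self, term: str) -> None:
        """Delete an entire row, e.g. when the mined term was a false positive."""
        del self.rows[term]

    def terms_for(self, patient: str) -> List[str]:
        """Return the terms currently assigned to one patient, in row order."""
        return [term for term, assigned in self.rows.items() if patient in assigned]

    def render(self) -> str:
        """Render the grid as plain text, one line per row."""
        lines = [" | ".join(["term"] + self.patients)]
        for term, assigned in self.rows.items():
            cells = ["x" if p in assigned else " " for p in self.patients]
            lines.append(" | ".join([term] + cells))
        return "\n".join(lines)


# Rebuild the example table from the suggestion above.
grid = TermPatientGrid(patients=["patient ID 1", "patient ID 2", "patient ID 3"])
example = {
    "HP1": ["patient ID 1", "patient ID 3"],
    "HP2": ["patient ID 1", "patient ID 2", "patient ID 3"],
    "HP3": ["patient ID 3"],
}
for term, assigned_patients in example.items():
    grid.add_term(term)
    for patient in assigned_patients:
        grid.toggle(term, patient)

print(grid.render())
```

Calling `grid.delete_term("HP2")` would then drop that row for every patient at once, which is the false-positive case mentioned above. Storing a set of patients per term keeps row deletion trivial; a per-patient view is still available through `terms_for`.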

cmungall commented 8 years ago

We should have a ticket dedicated to text mining activities. Should I move this to the monarch-phenote tracker for now?

The grid+checkbox approach is a good idea. We would like to have this more generally, e.g. genes rather than patients, and any kind of functional or phenotypic assignment as columns. Do we plan to do something like this for either tpc or seeding initialization, @kltm @hdietze?

cmungall commented 8 years ago

Spoke to @pnrobinson. The phenopacket requirement here is the ability to represent this faithfully:

http://www.cell.com/ajhg/fulltext/S0002-9297%2810%2900519-7