kids-first / kf-api-dataservice

:file_cabinet: Primary API for interacting with the Kids First data
http://kf-api-dataservice.kidsfirstdrc.org
Apache License 2.0

Create MVP ERD #43

Closed baileyckelly closed 6 years ago

baileyckelly commented 6 years ago

Note: This is due EOD Thursday 1/25/18

We need to create an ERD of the MVP entities for OICR.

allisonheath commented 6 years ago

Initial pass: DataModelMVP-20180125.pdf DataModelMVP-20180125.xml.zip

For now, I left off types, but in general they should be straightforward. The idea is to first load the data we currently have in hand as-is (no enums, for example), which will give a baseline of where to focus harmonization efforts.
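As a rough illustration of how as-is data can give that baseline, here is a minimal sketch that counts the distinct raw values of one field. The `profile_field` helper and the sample records are hypothetical and not part of the dataservice; the assumption is simply that fields are stored as free-text strings with no enums.

```python
from collections import Counter


def profile_field(records, field):
    """Count each distinct raw value of `field`, most common first."""
    values = (r.get(field) for r in records)
    # Skip missing values so they do not show up as a category.
    return Counter(v for v in values if v is not None).most_common()


# Hypothetical as-is records: values are kept exactly as received.
records = [
    {"phenotype": "WEBBED_NECK"},
    {"phenotype": "webbed neck"},
    {"phenotype": "WEBBED_NECK"},
    {"phenotype": None},
]

baseline = profile_field(records, "phenotype")
for value, count in baseline:
    print(value, count)
```

Fields with many near-duplicate spellings of the same value would be natural candidates for early harmonization work.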

For sure we'll be iterating on this, and ultimately #40 will have all of the details, but this is to give an initial goal to work towards. Feedback welcome! 😄

allisonheath commented 6 years ago

Also to further expand on the harmonization and the large arrows to the various ontologies and where we're generally headed at the moment:

For example, for phenotype, right now we're just going to put the phenotypes we're getting directly from the investigators into the phenotype field as a string (e.g. WEBBED_NECK). Then we'll work on the analysis of how to harmonize those to HPO terms and likely create an HPO_ID field where that ID will be stored (e.g. HP:0000465). The dataservice can then use that ID to materialize the standard term name via the API, along with other features such as synonyms, lay terms, hierarchies, and xrefs as needed.

Similar concept for diagnosis and MONDO, anatomic site and UBERON, etc.
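To make the idea concrete, here is a minimal sketch of a phenotype record that keeps the raw string and an optional HPO ID, with the standard term name filled in at read time. The `Phenotype` class, the `materialize` function, and the in-memory `HPO_TERMS` table are all hypothetical stand-ins; the actual dataservice model and ontology lookup may look quite different.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for an HPO lookup; a real service would
# query the ontology rather than a hard-coded dict.
HPO_TERMS = {"HP:0000465": {"name": "Webbed neck"}}


@dataclass
class Phenotype:
    phenotype: str                 # raw string as received from investigators
    hpo_id: Optional[str] = None   # filled in later by harmonization


def materialize(p: Phenotype) -> dict:
    """Return the record plus the standard term name, if an ID is known."""
    out = {"phenotype": p.phenotype, "hpo_id": p.hpo_id, "hpo_term": None}
    term = HPO_TERMS.get(p.hpo_id) if p.hpo_id else None
    if term:
        out["hpo_term"] = term["name"]
    return out


harmonized = materialize(Phenotype("WEBBED_NECK", "HP:0000465"))
unharmonized = materialize(Phenotype("WEBBED_NECK"))
```

Keeping the raw string alongside the ID means unharmonized records remain usable, and the same pattern could apply to diagnosis with MONDO or anatomic site with UBERON.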

allisonheath commented 6 years ago

From conversations today, the only immediate change is adding tumor_descriptor to the sample entity. There were some suggestions on how to improve the family structure, but we need to understand more use cases to determine the best path forward. The same goes for the study/dataset structure. I will open other tickets for those two. I think we can call this closed for the initial pass.