phenopackets / phenopacket-format

Ensure molecular phenotypes are representable #51

Open cmungall opened 8 years ago

cmungall commented 8 years ago

Things like the impact of a mutation on the affected protein's Molecular Function

drseb commented 8 years ago

That would be a term request to HPO, right? Or do you plan on mixing ontologies inside a phenopacket?

cmungall commented 8 years ago

If a mutation in gene A affected expression of gene B, we wouldn't want to precompose "downregulation of B" in HPO; we want the formalism to be able to compose these expressions clearly.
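To make the idea of composing such an expression concrete, here is a minimal sketch of a post-composed phenotype as a quality applied to an entity. The class name `ComposedPhenotype`, the helper `expression_change`, and the identifier `GENE:B` are all hypothetical placeholders for illustration; none of them are part of the phenopacket format or of any ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComposedPhenotype:
    """A post-composed phenotype: a quality applied to an entity.

    Both fields hold placeholder strings here. A real record would
    presumably carry ontology term IDs instead (assumption).
    """
    quality: str  # e.g. "decreased expression" (placeholder label)
    entity: str   # e.g. "GENE:B" (placeholder identifier)

    def label(self) -> str:
        # Build a human-readable label from the two parts.
        return f"{self.quality} of {self.entity}"


def expression_change(direction: str, gene: str) -> ComposedPhenotype:
    """Compose an expression-level phenotype for an arbitrary gene."""
    if direction not in ("increased", "decreased"):
        raise ValueError(f"unsupported direction: {direction!r}")
    return ComposedPhenotype(quality=f"{direction} expression", entity=gene)


# No precomposed "downregulation of B" term is needed; the gene is a parameter.
phenotype = expression_change("decreased", "GENE:B")
print(phenotype.label())
```

The point of the sketch is only that the gene is supplied at composition time, so no term per gene has to exist in HPO beforehand.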

pnrobinson commented 8 years ago

Molecular phenotype generally refers to concentrations/levels/amounts of proteins, mRNAs, or metabolites. The HPO to date has some terms for molecular phenotypes for items that are measured in clinical labs. However, a molecular phenotype can in principle refer to the level of any molecule in the cell, i.e., anything that can be measured by RNA-seq or proteomics. I would be a little cautious about starting to add HPO terms for this, since it would lead to an explosion of terms. Generally, if something from a "molecular phenotype" is clinically useful, people will start testing for it in a molecular lab. Chris, do you have some examples of what you are thinking about?

pnrobinson commented 8 years ago

Another thing to do would be to allow a phenopacket to be associated with things like MIAME-compliant data. To be honest, I am a little worried about mission creep and think we should concentrate on a smallish core of items and then extend it in version 2 according to what users want.

cmungall commented 8 years ago

Point taken about scope creep, but remember the format isn't just for human clinical data; this is a common scenario for model data. The use case here is just to be able to say things like "increased expression of X". I'm not sure linking out to MIAME-compliant data would help here.

pnrobinson commented 8 years ago

Although in some cases "increased expression of X" might be an HP or MP or P term, it is probably better to (also) allow people to use something like ChEBI, PO, etc. There are lots of things that would come into question out in the wild. I suspect that saying "out of scope for version 1" might be better than providing a partial solution.

drseb commented 8 years ago

I see. But I think it is harder to keep control once you allow for post-composed descriptions. Assume tomorrow somebody describes increased activity of GENE:X and this gets stored somewhere externally. Then a year later GENE:X is obsoleted, or we suddenly want to use a better ontology of genes/gene products. It would be easier to fix the references to GENE:X in HP/MP's logical defs than in the post-composed description, I think.
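The maintenance concern can be illustrated with a small sketch of what fixing stored post-composed descriptions would involve. The record layout (dicts with `quality` and `entity` keys), the function `remap_entities`, and the identifiers `GENE:X` and `NEWGENE:X1` are hypothetical; this is not how any existing phenopacket store works.

```python
def remap_entities(records, replacements):
    """Return copies of records with obsoleted entity IDs replaced.

    records:      list of dicts with "quality" and "entity" keys (assumed layout)
    replacements: dict mapping obsoleted ID -> replacement ID
    """
    return [
        {**record, "entity": replacements.get(record["entity"], record["entity"])}
        for record in records
    ]


# Descriptions that were stored externally before GENE:X was obsoleted.
stored = [
    {"quality": "increased activity", "entity": "GENE:X"},
    {"quality": "decreased expression", "entity": "GENE:B"},
]

updated = remap_entities(stored, {"GENE:X": "NEWGENE:X1"})
print(updated)
```

Every external store holding such records would need to run a migration like this, whereas a reference inside an HP/MP logical definition is corrected once, in the ontology.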

mbrush commented 8 years ago

Some of the issues in the Google doc below are relevant here. The doc compares different approaches for representing G2P associations involving molecular phenotypes: https://docs.google.com/document/d/1Aaas-Eip5JCgPQ7taFR62Cj3GlF8oL4n6dQhvxTX3qg/edit#