obophenotype / human-phenotype-ontology

Ontology for the description of human clinical features
http://obophenotype.github.io/human-phenotype-ontology/

question re: logical def for children of HP_0012379 'Abnormal enzyme/coenzyme activity'. #4924

Closed nicolevasilevsky closed 4 years ago

nicolevasilevsky commented 5 years ago

In HPO, there are terms that are children of HP_0012379 'Abnormal enzyme/coenzyme activity'.

A lot of them are not logically defined (but probably could be) and a lot of them do not conform to the abnormallyDecreasedRateOfBiologicalProcess (or increased) pattern.

Some use the quality ‘increased amount’ (or ‘decreased amount’) or ‘decreased process quality’.

Should we address this?

Some examples: HP_0003155 'Elevated alkaline phosphatase' 'has part' some ('increased amount' and ('inheres in' some 'alkaline phosphatase activity') and ('has modifier' some abnormal))

HP_0008166 'Decreased beta-galactosidase activity' 'has part' some ('decreased process quality' and ('inheres in' some 'beta-galactosidase activity') and ('has modifier' some abnormal))

HP_0003534 'Reduced xanthine dehydrogenase activity' 'has part' some ('decreased rate' and ('inheres in' some 'xanthine dehydrogenase activity') and ('has modifier' some abnormal))
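
A minimal sketch of how this inconsistency could be surveyed, assuming the logical definitions are available as Manchester-syntax strings like the three above. The `quality_of` and `quality_counts` helpers are hypothetical and not part of any HPO tooling.

```python
import re
from collections import Counter

# Logical definitions copied from the three examples above.
DEFINITIONS = {
    "HP_0003155": "'has part' some ('increased amount' and ('inheres in' some 'alkaline phosphatase activity') and ('has modifier' some abnormal))",
    "HP_0008166": "'has part' some ('decreased process quality' and ('inheres in' some 'beta-galactosidase activity') and ('has modifier' some abnormal))",
    "HP_0003534": "'has part' some ('decreased rate' and ('inheres in' some 'xanthine dehydrogenase activity') and ('has modifier' some abnormal))",
}

# The quality is the first quoted label after "'has part' some (".
QUALITY_RE = re.compile(r"'has part' some \('([^']+)'")


def quality_of(definition):
    """Return the PATO-style quality label used in a definition, or None."""
    match = QUALITY_RE.search(definition)
    return match.group(1) if match else None


def quality_counts(definitions):
    """Count how often each quality label is used across the definitions."""
    return Counter(quality_of(d) for d in definitions.values())
```

Running `quality_counts(DEFINITIONS)` on these three terms gives three different qualities, each used once, which is the mix described above.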

cc @matentzn

drseb commented 5 years ago

I don't think the def for Elevated alkaline phosphatase is really correct. I think it refers to a higher concentration of AP and not directly activity.

LCCarmody commented 5 years ago

What is measured is the amount of alk phos. Maybe what is incorrect is the 'alkaline phosphatase activity'? https://www.healthline.com/health/alp#uses

pnrobinson commented 5 years ago

It is possible to measure enzyme activity or protein level. I think that the general test is usually enzyme activity, see e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4062654/. If you are testing for the isozymes, then I think it is probably more correct to state levels.

nicolevasilevsky commented 5 years ago

1) Can we revise all the subclasses of 'Abnormality of alkaline phosphatase activity' to include 'activity' in the name? 2) Can we revise all the logical definitions to use GO 'alkaline phosphatase activity'? For example, these terms would be revised (there are more than just these, but they would follow the same pattern):

HP_0010681 'Elevated intestinal alkaline phosphatase activity' 'has part' some ('increased process quality' and ('inheres in' some ('alkaline phosphatase activity' and ('occurs in' some intestine))) and ('has modifier' some abnormal))

HP_0008318 'Elevated leukocyte alkaline phosphatase activity' 'has part' some ('increased process quality' and ('inheres in' some ('alkaline phosphatase activity' and ('occurs in' some leukocyte))) and ('has modifier' some abnormal))

drseb commented 4 years ago

I think the test measures the level in the patient's blood. Not sure why you want to put activity here - it is an inference that you can make, but why would you want to do this? I don't think we should change logical definitions so that they fit a desired DOSDP-template.

matentzn commented 4 years ago

> I don't think we should change logical definitions so that they fit a desired DOSDP-template

Definitely not! Key being the phrase "a desired" here - it should, however, fit a pattern (if it does not exist, it will be created). For complex phenotypes, where no pattern can be defined, it is important to not create an EQ at all to preclude bad inferences.

I think the more important question here is the subclass question. If I understand you correctly, it may not always be the case that 'activity' is implied - therefore the subsumption itself may need to be looked at. If you are a subclass of Abnormality of alkaline phosphatase activity, this means that in all cases you are an abnormality of the alkaline phosphatase activity, no matter the label.

The other question that @nicolevasilevsky poses is to simply unify the use of increased rate vs increased amount where appropriate - and so far we have decided not to use increased amount in conjunction with processes or molecular functions (either increased rate, or increased occurrence).

drseb commented 4 years ago

I am pretty sure the lab test measures the concentration. The activity is a valid inference - if the inference holds in all circumstances in all possible universes - I don't know.

pnrobinson commented 4 years ago

I think that for practical purposes, most lab tests for alk phos measure activity. To measure the isoforms, one can use electrophoresis. https://www.ncbi.nlm.nih.gov/books/NBK459201/

For the interpretation within HPO, it does not make much difference how alk phos is measured, and we are really interested simply in "amount".

In any case, the reason for increased or decreased activity is that there is a higher or lower concentration of AP. 'increased process quality' sounds like the "activity" itself is higher, i.e., the reaction rate is abnormally high or low per molecule of alk phos. This is not true.

Possibly it would be better to simply remove the word "activity" from these terms. On the other hand, many other enzymes have low or high concentrations (or activities depending on how they are measured).

matentzn commented 4 years ago

So from what you say you would prefer the

increased amount of alkaline phosphatase (increasedLevelOfChemicalEntity)

pattern over the

increased occurrence of alkaline phosphatase activity (increasedRateOfMolecularFunction)?
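
To make the choice concrete, here is a rough sketch that renders both candidate definitions side by side. The pattern names come from this comment, but the template strings are an assumption modeled on the definitions quoted earlier in this thread, not copied from the actual uPheno pattern files; `render` is a hypothetical helper.

```python
# Hypothetical templates: only the pattern names are taken from the thread.
PATTERNS = {
    "increasedLevelOfChemicalEntity": (
        "'has part' some ('increased amount' and "
        "('inheres in' some '%s') and ('has modifier' some abnormal))"
    ),
    "increasedRateOfMolecularFunction": (
        "'has part' some ('increased rate' and "
        "('inheres in' some '%s') and ('has modifier' some abnormal))"
    ),
}


def render(pattern_name, filler_label):
    """Fill the single %s slot of a pattern with an entity or activity label."""
    return PATTERNS[pattern_name] % filler_label


# The level pattern takes the molecule, the rate pattern takes the GO activity.
level_def = render("increasedLevelOfChemicalEntity", "alkaline phosphatase")
rate_def = render("increasedRateOfMolecularFunction", "alkaline phosphatase activity")
```

The two results differ both in the quality and in what the quality inheres in (a molecule vs a molecular function), which is why the two cannot simply be swapped.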

drseb commented 4 years ago

I vote for increased amount (but consistency with MP is more important). Also, we have usually not used "increased occurrence", right? We have used "increased process quality" (which is probably not optimal) - but I assume you are aware of this.

matentzn commented 4 years ago

OK, at the risk of being annoying, but wanting this to be nailed down before aligning with MP: @drseb you vote for "increased amount of ### chemical" over "increased occurrence of ### activity of chemical", right?

mellybelly commented 4 years ago

ALP dephosphorylates things; in the assay it is provided a substrate to evaluate ALP activity. This is not an increase in concentration of ALP; such tests usually measure activity, though as Peter suggests, this is a proxy for "amount". However, you could have more or less concentration of ALP but have the same level of activity, if for some reason the ALP was mutated or otherwise affected by other variables. Also note that ALP is not a chemical, it is a protein enzyme. More info: https://loinc.org/6768-6/ https://en.wikipedia.org/wiki/Alkaline_phosphatase

I think we need a pattern that is a measure of enzyme activity, in addition to one with chemical concentration. Could do amount of enzyme, but I would not call enzymes "chemical entities".

I don't know if I have helped decide, but as someone who has done a lot of ALP assays I had to say something and have hopefully at least helped clarify the biology :-)

matentzn commented 4 years ago

> I think we need a pattern that is a measure of enzyme activity, in addition to one with chemical concentration.

So two different terms in HPO?

> Could do amount of enzyme, but i would not call enzymes "chemical entities".

I think we kind of agreed to just call everything that is a molecular entity a chemical entity, because confusingly, in the CHEBI world chemical entity is the most general term. But I added your concern to the next phenocall so someone can teach me what this exactly means for our interpretation of the term chemical entity..

mellybelly commented 4 years ago

I don't think of proteins as chemical entities, and I don't think we do in Biolink either: https://biolink.github.io/biolink-model/docs/ChemicalSubstance.html

Yes, we might need two different patterns: one for enzyme activity and one for measured amounts (of chemical or protein/enzyme, etc.)

pnrobinson commented 4 years ago

There may be some exceptions, but for the vast majority of cases, the activity per molecule of ALP is identical, and the difference comes from the concentration of molecules per volume!

balhoff commented 4 years ago

In the CHEBI hierarchy, information biomacromolecules (proteins and DNA) fall under chemical entity: http://www.ontobee.org/ontology/CHEBI?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCHEBI_33695

mellybelly commented 4 years ago

My point isn't so much about ALP per se, but rather that we do need patterns for uPheno that can support both amounts of chemical entities and activities of biomolecules (e.g. proteins/enzymes). Also, CHEBI labels may not be applicable across fields: most biologists don't think of proteins as "chemical entities".

matentzn commented 4 years ago

Just to be clear - patterns exist for both cases. The problem here is whether a specific term in HPO is intended to reflect the one situation or the other. I will ask at OBOCORE for an opinion on what ontology concept should be used for encapsulating absolutely everything between macromolecules and ions.

matentzn commented 4 years ago

Actually, when OBOCORE is accepted, we will just use http://purl.obolibrary.org/obo/COB_0000013, which is "molecular entity" and seems to be defined as "A material entity that consists of two or more atoms that are all connected via covalent bonds such that any atom can be transitively connected with any other atom." It is a bit annoying that we have started naming the patterns according to the CHEBI conventions; it will take quite a bit of coordination to rename them all to use MolecularEntity instead.

pnrobinson commented 4 years ago

We should consider separating the two concepts actually, it does not make a lot of sense.

matentzn commented 4 years ago

If you are talking about the abnormal activity vs abnormal amount of molecular entity I agree!

sbello commented 4 years ago

Nico asked for examples on the pheno-editors call. In MP we have distinct sub-branches for:

abnormal enzyme/coenzyme level http://purl.obolibrary.org/obo/MP_0005319
abnormal enzyme/coenzyme activity http://purl.obolibrary.org/obo/MP_0005584

Level is a child of abnormal homeostasis and activity is a child of abnormal metabolism.

pnrobinson commented 4 years ago

There are some enzymes where activity results from two proteins. Probably we do need to keep this distinction for that reason. Can we figure out how to close this issue?

matentzn commented 4 years ago

Do you want to go the MP route and create terms for both? We can assume that the ones that now have "activity" in the label refer to abnormal activity and use the respective abnormal activity pattern. For each of the activity terms we create a separate "abnormal level" term?

drseb commented 4 years ago

Let's go the MP route

pnrobinson commented 4 years ago

I am not sure this is that useful because there are standard ways that things get measured in the clinic. Also, for something like alk phos, the activity test measures the activity of any of the alk phos enzymes, but say an ELISA type test would be specific. For downstream algorithmic use (e.g. Phenomizer), these subtleties do not matter imho

matentzn commented 4 years ago

So what is your suggestion then? Just keep the distinction, and decide on a case by case basis whether it is the one or the other? Won't we lose quite a bit of semantic similarity unintentionally like that?

pnrobinson commented 4 years ago

This is an area of the graph where semantic similarity doesn't make any sense. Having increased protein X usually has nothing to do with increased protein Y. Probably we should do something about this algorithmically actually.

matentzn commented 4 years ago

It could mean something for MP-HP mappings though, and there are use cases where even increased protein X and increased protein Y could mean something - depending on how similar the proteins are? For grouping phenotypes ("give me all the genes that are associated with increased protein production") it makes sense to use as consistent EQs as possible, even if the grouping does not help with clinical diagnostics.

pnrobinson commented 4 years ago

possibly, but it does not make sense to make more terms in HPO that would never be used because the proteins are never measured in that way.

nicolevasilevsky commented 4 years ago

Hi all - I'm a bit unclear what the action item is?

One open question is: can I make the logical defs consistent for the children of this term, 'Abnormality of alkaline phosphatase activity'?

'has part' some ('process quality' and ('inheres in' some 'alkaline phosphatase activity') and ('has modifier' some abnormal))

Can we use this pattern above for the children of this term?

nicolevasilevsky commented 4 years ago

In HPO, it seems there is a mix of terms underneath 'Abnormal enzyme/coenzyme activity'

For example, 'Abnormal aldolase level' is a child of 'Abnormal enzyme/coenzyme activity'. I think the majority of subclasses of 'Abnormal enzyme/coenzyme activity' are activity terms though.

I am

matentzn commented 4 years ago

If we go the "let's add no terms and have both patterns" route, abnormal activity terms and abnormal level terms cannot be connected by isa (subclassof). We will need a new relation connecting the two - maybe something causal - but definitely something else. So if we go with @pnrobinson's suggestion:

  1. We will have the terms that explicitly refer to activity on the one side, using the abnormalBiologicalProcessPattern.
  2. The ones that explicitly refer to levels use the abnormalLevel... patterns.
  3. Every term refers to either 1), 2) or no pattern at all.
  4. No subclass relations exist between 1 and 2 (i.e. they need to be removed / replaced by some other relation).

So the action items are:

  A) Identify activity terms and make sure they conform to abnormalBiologicalProcess patterns.
  B) Identify level terms and make sure they conform to abnormalLevelPatterns.
  C) Break isa-links between the two.
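
A first pass over these action items could be sketched as below. This is only an illustration under loud assumptions: it sorts terms by the words "activity" and "level" in their labels, which is a crude heuristic since labels are only a rough guide to what a term means, and the `classify` and `crossing_isa_links` helpers are hypothetical. The first edge is the example mentioned earlier in this thread; the second is illustrative and assumes a direct is-a link.

```python
def classify(label):
    """Return 'activity', 'level', or None based on the term label (crude heuristic)."""
    lowered = label.lower()
    if "activity" in lowered:
        return "activity"
    if "level" in lowered:
        return "level"
    return None


def crossing_isa_links(isa_edges):
    """Return (child, parent) pairs that link an activity term to a level term."""
    crossing = []
    for child, parent in isa_edges:
        kinds = {classify(child), classify(parent)}
        if kinds == {"activity", "level"}:
            crossing.append((child, parent))
    return crossing


# (child label, parent label) pairs; labels stand in for term IDs here.
ISA_EDGES = [
    ("Abnormal aldolase level", "Abnormal enzyme/coenzyme activity"),
    ("Decreased beta-galactosidase activity", "Abnormal enzyme/coenzyme activity"),
]

links_to_review = crossing_isa_links(ISA_EDGES)
```

The output is a list of candidate links for manual review (action item C), not a list of links to delete automatically.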

@nicolevasilevsky maybe it would help if, while you do it, you keep some notes somewhere that say "changed this to that because".

drseb commented 4 years ago

I disagree that these subtleties do not matter. Also, for semantic similarity, those would have the common parent "abn. of AP". Using a "possibly causal"-relationship sounds okay to me.

pnrobinson commented 4 years ago

I think I was unclear. I do not think the medical interpretation matters for Increased activity of X vs Increased level of X (where X is an enzyme). This is almost always measuring the same thing in two different ways. Obviously it matters that they have the same parent. Can we try to work on one or two examples? AP is good because it is really difficult to model.

matentzn commented 4 years ago

Unfortunately @drseb's idea of having both subsume under Abnormal alkaline phosphatase won't work easily because:

  1. we are strictly separating material entity and processual phenotypes at the moment in the EQ framework (knowing this is, of course, violated in literally all ontologies), with no way to group them (conservative approach).
  2. GO does not maintain any connection between the activity and the alkaline phosphatase itself; this means that even if we chose to allow grouping them (which we are currently discussing in the group, and I put it on the agenda again), GO would not give us a link to group them by.

However, once they are causally related, the algorithms can choose to take 'indicative of' or other causal relations into account the same way they do isa, so no further integration would be needed (in particular no common parents). So in this case, one would be a parent of the other (and in some cases vice versa).

pnrobinson commented 4 years ago

Can we please Skype to talk about this? We should first revisit the term labels, some of which come from 2008. Even though some of them may say level, this is a colloquialism for enzyme activity. Please see this page for an example: https://labtestsonline.org/tests/alkaline-phosphatase-alp (the main page says level, but if you look at the LOINC codes, clearly they mean activity).

nicolevasilevsky commented 4 years ago

I am working with Kristen to get a meeting scheduled.

dosumis commented 4 years ago

Wow this ticket is long.

Some thoughts.

  1. We can potentially use GO MF terms for both activity and level/amount phenotypes (*1):

     abnormal alkaline phosphatase activity: ... inheres_in some GO:'alkaline phosphatase activity'

     abnormal alkaline phosphatase levels: ... inheres_in some ('protein' (*2) that capable_of some GO:'alkaline phosphatase activity')

     This means we can take advantage of the GO MF hierarchy for grouping consistently in both cases.

  2. I like Nico's causal link suggestion. While activity and levels can vary independently, increased levels will typically increase activity (and as Peter says, measurement of activity is often used as a proxy for measuring amount), so grouping level terms under activity terms might be a useful approximation if we have both. This looks to me similar to the situation with mixed development/morphology phenotype hierarchies - there is a causal link between two phenotypes that holds enough of the time that we want to group in the same hierarchy. GO uses this logic in GO-CAM models - treating regulation of levels of an enzyme as regulation of its activity. It shouldn't be too hard to come up with a GCI clause to add to patterns to support this link. We can add a materialization step to the release pipeline to make these explicit in the release files.

e.g. to infer 'increased AP levels' results_in 'increased AP activity' we could use this GCI (or SubClassOf axiom) to link patterns for the two terms:

GCI: ('has part' some ('increased amount' and ('inheres in' some ('protein' that capable_of some '%s')))) SubClassOf results_in some ('has part' some ('process quality' and ('inheres in' some '%s') and ('has modifier' some abnormal)))
vars:
- enzyme_activity
- enzyme_activity

*1 Nico wrote: "GO does not maintain any connection between the activity and the alkaline phosphatase itself." The link is maintained in GO annotation - not the ontology.

*2 Might be too specific.
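
As a rough illustration of how the two `%s` slots and the two `vars` entries in the GCI above line up, here is a small sketch that fills the template by plain string substitution. The parenthesization of the template is an assumption (balanced so the expression reads unambiguously), and `fill` is a hypothetical helper, not the DOSDP tooling itself.

```python
# GCI template from the comment above, with parentheses balanced (assumption).
GCI_TEMPLATE = (
    "('has part' some ('increased amount' and "
    "('inheres in' some ('protein' that capable_of some '%s')))) "
    "SubClassOf results_in some "
    "('has part' some ('process quality' and ('inheres in' some '%s') "
    "and ('has modifier' some abnormal)))"
)

# One variable name per %s slot, in order; the same variable is used twice.
VARS = ["enzyme_activity", "enzyme_activity"]


def fill(template, var_names, bindings):
    """Substitute each %s slot with the label bound to the matching variable."""
    return template % tuple(bindings[name] for name in var_names)


axiom = fill(GCI_TEMPLATE, VARS, {"enzyme_activity": "alkaline phosphatase activity"})
```

Because the same variable fills both slots, a single GO molecular function label is enough to generate the link between the level pattern and the activity pattern.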

drseb commented 4 years ago

Thanks. We have thought about causal links for a long time. Would be cool to finally do this. I would just want to point out that causality is often not strictly given, and would suggest using "possibly_results_in" rather than "results_in".

nicolevasilevsky commented 4 years ago

I talked to @pnrobinson and @drseb today and we made decisions. I'll work on these on a branch:

nicolevasilevsky commented 4 years ago

@pnrobinson For terms like HP_0008318 'Elevated leukocyte alkaline phosphatase' and HP_0010684 'Low alkaline phosphatase of bone origin', do you think it would be okay to use PR:000023540 alkaline phosphatase in the logical definitions?

The text def of the PRO term is: A protein that is a translation product of the Escherichia coli K-12 phoA gene or a 1:1 ortholog thereof. [ PRO:DNx ]

(Would the human form be an ortholog of the E. coli protein?)

dosumis commented 4 years ago

> (Would the human form be an ortholog of the E. coli protein?)

Sounds like a stretch. What about using the GO pattern above instead: 'protein' that capable_of some GO:'alkaline phosphatase activity' ?

nicolevasilevsky commented 4 years ago

I like that, thanks @dosumis!