obophenotype / upheno

The Unified Phenotype Ontology (uPheno) integrates multiple phenotype ontologies into a unified cross-species phenotype ontology.
https://obophenotype.github.io/upheno/
Creative Commons Zero v1.0 Universal

Major: Migrate all patterns from using abnormal to using "changed" #760

Open matentzn opened 2 years ago

matentzn commented 2 years ago

Ontologies like MP, DPO, DDPHENO, FYPO, APO and HP will always curate their names themselves, but for XPO @seger @malcolmfisher103 @MardiNenni, ZP @ybradford, PLANP @srobb1 and PHIPO @jseager7 this ticket will have huge consequences.

There will be two major consequences of this migration.

  1. The modifier of the phenotype will change from "abnormal" to "changed"
  2. The default label template will change from "abnormal X" to something like "X" or "X phenotype", for example, "abnormally increased heart size" -> "increased heart size", and "abnormal heart size" -> "heart size phenotype".
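The two label examples in item 2 can be sketched as a small rewrite function. This is only an illustration of the proposed substitution, not the tooling uPheno uses: the function name `migrate_label` and the `neutral_template` parameter are hypothetical, and "X phenotype" is just one of the candidate templates mentioned in the ticket.

```python
import re


def migrate_label(label: str, neutral_template: str = "{} phenotype") -> str:
    """Rewrite an 'abnormal'-style label following the two examples in the ticket.

    `neutral_template` is a hypothetical knob for the still-undecided
    replacement of plain "abnormal X" labels.
    """
    # "abnormally increased heart size" -> "increased heart size"
    match = re.fullmatch(r"abnormally\s+(.+)", label)
    if match:
        return match.group(1)
    # "abnormal heart size" -> "heart size phenotype"
    match = re.fullmatch(r"abnormal\s+(.+)", label)
    if match:
        return neutral_template.format(match.group(1))
    # Labels without the modifier are left untouched.
    return label


print(migrate_label("abnormally increased heart size"))
print(migrate_label("abnormal heart size"))
```

Keeping the neutral template as a parameter means the same sketch works if a different wording is chosen, for example `migrate_label("abnormal heart size", "changed {}")`.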

The labelling won't affect MP, DPO, DDPHENO, WBPHENOTYPE, FYPO, APO, HP at all. It will, however, affect the way uPheno grouping classes are named. HP will not be affected at all (not even in terms of the modifier), because for HP, abnormal literally means "clinically abnormal".

I am ready for your avalanche of questions about implications, and worries. We won't do anything at all unless I get all of you to agree on this major change.

For those that do not attend the biweekly phenotype call: this change accounts for the fact that the presence of a phenotype term, for example in conjunction with a gene, means in 99% of all cases we have discussed: "when ETC is knocked out, there is a significantly increased heart size (but potentially in the range of normal)", and not "the heart size is (clinically) abnormal".

I am convinced this will get us closer to saner treatment of phenotypes across the board - leaving abnormal behind will clarify many of our discussions moving forward.

srobb1 commented 2 years ago

@matentzn What do I need to do to be prepared for this change? I am 100% on board with the proposed changes.

pfey03 commented 2 years ago

@matentzn I also have nothing against the changes, and I'm glad that, although I have 'patterned' a lot, I didn't put the patterns on Git yet. I will start with that once the patterns have been updated, and continue in my Google tables for the time being.

matentzn commented 2 years ago

I want to clarify some things, and I am rewriting a comment I made in a discussion with @sbello on slack here (self plagiarism).

First of all, it is important to note that “abnormal” is not going away, and it will still be part of our phenotype modelling framework. The abnormal patterns will still stay and be available for use, but we ask here that if you choose to continue to use abnormal (all phenotype ontologies), you are saying that "the presence of this term in my ontology means that a change occurred that is outside the normal range". If you are certain this is what your term means, by all means, keep it as abnormal.

The main problem is that the whole debate on normal and abnormal has held us back for so long from building useful infrastructure; curators keep saying stuff like “I can't use term X because the phenotype was not abnormal, it was within normal range but consistently elevated”. Remember, these are curators that read papers and try to find suitable ontology terms for what they are seeing, and the slightest discrepancy in meaning will cause them angst. From conversations with many other groups individually, and also from the trait-phenotype meeting, which was originally about something else (measurement->phenotype) and evolved into a discussion about integrating traits, my impression was that for “increased heart size 0.2%, p=0.0005” people would curate an MP/DPO etc. term, despite the fact that the new heart size is still within the normal range. However, in the clinical world, you would not do that: abnormal literally means that the heart is larger than normal, i.e. outside the normal range. This is at least what I got from HPO (@drseb?). So phenotyping literally means “identifying observable characteristics outside the normal range”. Abnormal is defined in PATO literally as “deviation (from_normal)”.
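The distinction drawn here between a consistent change and a value outside the normal range can be sketched as follows. All numbers, the `alpha` threshold and the function name `classify_effect` are invented for illustration; the thread does not prescribe any such procedure, and requiring significance before checking the range is an assumption of this sketch.

```python
from typing import Tuple


def classify_effect(observed: float,
                    normal_range: Tuple[float, float],
                    p_value: float,
                    alpha: float = 0.05) -> str:
    """Separate 'changed' (consistent difference) from 'abnormal' (outside range)."""
    low, high = normal_range
    # Assumption: without a significant difference from the control, nothing is curated.
    if p_value >= alpha:
        return "no significant change"
    # Outside the normal range: "abnormal" in the clinical sense used in this thread.
    if observed < low or observed > high:
        return "abnormal"
    # Significant but still within the normal range: the proposed "changed".
    return "changed"


# Hypothetical numbers: control mean 100.0, a 0.2% increase, p = 0.0005.
control_mean = 100.0
observed = control_mean * 1.002
print(classify_effect(observed, normal_range=(90.0, 110.0), p_value=0.0005))  # prints "changed"
```

Under the "abnormal" reading, a curator would have no term for this observation; under the "changed" reading, the term applies.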

Your worry (@sbello) about what that means for mappings is justified, but of course we have infrastructure in place to cope with that issue. First of all, abnormal changes fall under “phenotypic effects” and therefore all grouping use cases (like Alliance) will be unaffected. Similarly, semantic similarity will be mostly unaffected (slightly changed values, but nothing functionally different). So the only risk is “exact equivalence” - for that, we will continue to maintain the “phenotypic orthologue” relation, which is “quasi equivalent” - but note that the Alliance, as far as I understand, does not even care about this.

So the short answer to your question "how is such a major change justified" is: we are progressing very slowly, and this continuous normal/abnormal debate across multiple channels diverts us from our real goal, which is connecting clinical and research data across phenotypes, and abnormal just does not reflect the reality of what people mean when they use a term. If it does though, this is no problem at all, because we will have both patterns in the uPheno repository, and it’s totally fine to define a term as abnormal, as long as we agree that this means “outside the normal range”. What this gives us is a clean story: “the presence of this term means that there was a change, potentially in the normal range, but consistent”.

mah11 commented 2 years ago

I have no objection to the big-picture changes here.

In FYPO, we don't make or need a distinction between "different from normal/wild type" and "abnormal", because we haven't seen such a distinction made in fission yeast research & publications (from what I recall of my budding yeast days, it's much the same there).

I want to home in on one thing I spotted in the original summary:

"abnormal heart size" -> "heart size phenotype"

This substitution doesn't look optimal, because a "phenotype" could be normal or changed-but-within-normal-range, i.e. "heart size phenotype" is broader than "abnormal heart size". I would especially not want to see "abnormal X" -> "X phenotype" rolled out all over the patterns, because that would clash with our stuff.

IOW, this isn't true in yeast research:

phenotyping literally means “identifying observable characteristics outside the normal range”

For us -- and also for budding yeast, IIRC from long ago -- phenotyping means assaying and documenting what the phenotype is for cells with a given genotype, and the answer can often be "the phenotype is normal". So we have a lot of "normal phenotype" terms in FYPO that get used when it's, well, interesting to document normalness, and we group "normal X" and "abnormal X" together under "X phenotype" (although for some specific Xs, it's implicit; we don't have every imaginable "X phenotype" grouping class). I expounded a bit in #758 ...

matentzn commented 2 years ago

@mah11 re your label substitution suggestion, I agree it's suboptimal. Given "increased heart size", maybe "affected heart size" or something like that instead? We have been talking about "phenotypic effects" since about December 2020, maybe something in this direction would stick? Or "changed heart size"?

Your use of X phenotype will need some special care. In an ideal world, this would coincide with our soon-to-be notion of "trait", so we have "abnormally increased cell count" --> "increased cell count" --> "changed cell count" --> "cell count (trait)". Would it be prudent to say that your "X phenotype" terms are basically trait terms with no inherent notion of an effect (increased, decreased, abnormal, changed)?
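The chain in the comment above can be written down as a tiny lookup, reading each arrow as "is more specific than" (that reading is an assumption of this sketch, as are the names `MORE_GENERAL` and `generalisations`; the "trait" level is only a planned notion in the thread).

```python
from typing import Dict, List

# Each label points to the next, more general label in the proposed chain.
MORE_GENERAL: Dict[str, str] = {
    "abnormally increased cell count": "increased cell count",
    "increased cell count": "changed cell count",
    "changed cell count": "cell count (trait)",
}


def generalisations(label: str) -> List[str]:
    """Return every more general label, from the nearest to the trait level."""
    chain: List[str] = []
    while label in MORE_GENERAL:
        label = MORE_GENERAL[label]
        chain.append(label)
    return chain


print(generalisations("abnormally increased cell count"))
# ['increased cell count', 'changed cell count', 'cell count (trait)']
```

In this picture, an effect-free "X phenotype" grouping term would sit at the last position, next to or in place of the trait.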

mah11 commented 2 years ago

Given "increased heart size", maybe "affected heart size" or something like that instead? ... Or "changed heart size"?

For hearts or other anatomical structures, I could see any of "affected", "altered", or "changed" working well enough. And I don't have a strong preference among them.

"abnormally increased cell count" --> "increased cell count" --> "changed cell count" --> "cell count (trait)".

I will readily admit that I've long been a bit hazy on "trait" versus "phenotype" ... but that leaves me with no reason to think this wouldn't work.

Would it be prudent to say that your "X phenotype" terms are basically trait terms with no inherent notion of an effect (increased, decreased, abnormal, changed)?

Apart from the "I'm hazy" caveat above, I think so.

pnrobinson commented 2 years ago

Hi everybody, the word phenotype is used with different meanings (here is a commentary I wrote about this a while back: https://onlinelibrary.wiley.com/doi/10.1002/humu.22080).

The HPO actually describes phenotypic features (not synonymous with "phenotypes"). Some can be said to be outside of a normal range (e.g., blood sugar). Some people understand being within 2 or 3 standard deviations from the mean as "normal", and others understand a distinction between healthy and diseased; these two definitions often overlap but are rarely exactly the same thing. The HPO does not really take a stance about this, because it would be more philosophy than clinical utility. I would say that all HPO terms describe observations with potential clinical utility for the differential diagnosis. Probably 1-2% of the terms are not abnormal according to one of the above definitions, for instance, 'Curly hair' (understand: hair is much curlier than one would expect given the family; this is a trait seen in some Mendelian disorders).

We will have to accept that 'abnormal' will mean different things for the different phenotype ontologies. I would be interested in hearing whether we have current or planned future use cases that would profit from a detailed modeling of these differences.

It would be most interesting for me to be able to navigate between (human and model) traits and abnormal human phenotypic features, please keep me in the loop about this and our team would be glad to contribute.

-Peter

Peter Robinson, Professor and Donald A. Roux Chair, Genomics and Computational Biology, The Jackson Laboratory for Genomic Medicine, 860.837.2095 (t), https://robinsongroup.github.io/


From: Nico Matentzoglu. Sent: Friday, September 10, 2021 4:37 AM. To: obophenotype/upheno. Subject: Re: [obophenotype/upheno] Major: Migrate all patterns from using abnormal to using "changed" (#760)

If you want, the three of us can meet after ICBO and work through these issues together. You don't have to change anything in MP if you don't want to - all I am proposing is to accept the fact that abnormal means “outside the normal range” and move on.

Clare72 commented 2 years ago

Happy with this for DPO - phenotypes for us are basically always relative to a control, rather than some known normal range. Are the pattern IRIs going to change or will they still be called 'abnormal'?

matentzn commented 2 years ago

We won't change the abnormal patterns - we will just maintain a new shadow set of patterns which are not abnormal, so the abnormal patterns will still work exactly the way they did.

Clare72 commented 2 years ago

So we will need to move all existing 'abnormal' phenotypes to new 'changed' patterns (where 'changed' is more appropriate)?

jseager7 commented 2 years ago

I think that PHIPO has already begun moving away from the normal/abnormal classification for pathogen-host interaction phenotypes, only retaining 'abnormal' for clearly abnormal single-species phenotypes, such as 'abnormal chromosome segregation' (see https://github.com/PHI-base/phipo/issues/309#issuecomment-772601271).

For the pathogen-host phenotypes, we are instead aiming to use 'presence of [phenotype]' or 'absence of [phenotype]' (as text labels). I think this was chosen because defining abnormality is difficult and inconsistent in a pathogen-host context: a 'normal' outcome of an interaction (i.e. infection and disease occurring) results in a normal phenotype for the pathogen but an abnormal phenotype for the host. Conversely, the host resisting infection will probably result in an abnormal phenotype for the pathogen but a normal phenotype for the host. Unfortunately, I don't think we can assume that presence or absence of a phenotype corresponds to the phenotype being normal or abnormal: sometimes it does, and sometimes it doesn't. So the 'abnormal absence [of phenotype]' patterns don't always meet our needs as-is. If we had abnormality-agnostic patterns for presence and absence of phenotypes, I expect that would help us a lot.

@ValWood or @CuzickA might correct me on this: they've been more involved in the overall ontology design than I have.

ValWood commented 2 years ago

That's a good summary, but the rationale for using 'presence or absence' is not necessarily that it is a "normal phenotype for the pathogen but an abnormal phenotype for the host"... it just doesn't make sense to classify these as normal or abnormal because there is no 'normal' in this context.

To illustrate, the same also applies to pathogen-only or host-only phenotypes. We can't define any particular quality of a variant (strain) as normal; they are all just "variations".

For fission yeast we have been able to call some phenotypes normal (like length of 14 microns), because everyone works on the same isogenic strain. However, some naturally occurring strains may be larger or smaller.

Note that when we record a change ('increased' or 'decreased') for PHIPO, we also record the control strain that this observation is being compared to.
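A record that ties a directional change to its control strain could look roughly like the sketch below. The class `PhenotypeObservation`, its field names and the example values are hypothetical and are not taken from PHIPO or PHI-base; the only point illustrated is that 'increased' or 'decreased' is meaningless without the strain it was compared to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhenotypeObservation:
    phenotype: str                        # free-text label, e.g. "lesion size"
    direction: Optional[str] = None       # "increased", "decreased", or None
    control_strain: Optional[str] = None  # strain the observation was compared to

    def __post_init__(self) -> None:
        if self.direction not in (None, "increased", "decreased"):
            raise ValueError(f"unknown direction: {self.direction!r}")
        # A relative change is only interpretable together with its control.
        if self.direction is not None and not self.control_strain:
            raise ValueError("a directional change needs the control strain it was compared to")


obs = PhenotypeObservation("lesion size", "increased", control_strain="hypothetical wild-type strain")
print(obs.direction, "relative to", obs.control_strain)
```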