Replace the DOID references in EQ definitions

LCCarmody commented 5 years ago

The following 8 HP classes have DOID references:

DOID classes

- [x] DOID_1240: HP_0001909 (Leukemia)
- [x] DOID_169: HP_0100634 (Neuroendocrine neoplasm)
- [x] DOID_305:
  - [x] HP_0011459 (Esophageal carcinoma)
  - [x] HP_0012114 (Endometrial carcinoma)
  - [x] HP_0030394 (Fallopian tube carcinoma)
  - [x] HP_0012118 (Laryngeal carcinoma)
  - [x] HP_0011763 (Pituitary carcinoma)
- [x] DOID_3892: HP_0012197 (Insulinoma)
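
For bookkeeping, the checklist can be held as a small mapping from each DOID class to the HP classes that reference it. This is only an illustrative sketch: the names `DOID_REFERENCES` and `affected_hp_classes` are made up here, and the IDs and labels are copied from the list.

```python
# DOID class -> HP classes whose logical definitions reference it.
# IDs and labels are copied from the checklist; the structure is illustrative.
DOID_REFERENCES = {
    "DOID_1240": [("HP_0001909", "Leukemia")],
    "DOID_169": [("HP_0100634", "Neuroendocrine neoplasm")],
    "DOID_305": [
        ("HP_0011459", "Esophageal carcinoma"),
        ("HP_0012114", "Endometrial carcinoma"),
        ("HP_0030394", "Fallopian tube carcinoma"),
        ("HP_0012118", "Laryngeal carcinoma"),
        ("HP_0011763", "Pituitary carcinoma"),
    ],
    "DOID_3892": [("HP_0012197", "Insulinoma")],
}


def affected_hp_classes(references):
    """Return a sorted list of all HP class IDs that reference a DOID class."""
    return sorted(
        hp_id for hp_list in references.values() for hp_id, _label in hp_list
    )


# Sanity check against the "8 HP classes" stated in the issue.
print(len(affected_hp_classes(DOID_REFERENCES)))
```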

Please advise on replacements @pnrobinson @drseb @cmungall @matentzn

LCCarmody commented 5 years ago

Reference #4280

drseb commented 5 years ago

I would like to align with MP here. They also reference DOIDs. (e.g. Leukemia)

mellybelly commented 5 years ago

diseases should not be referenced in a phenotype ontology, imho. These items are all things that should come from NCIT.

drseb commented 5 years ago

Whatever we do, it should be in sync with MP

drseb commented 5 years ago

Also, we don't have NCIT-import, right!?

mellybelly commented 5 years ago

can we create the DOSDP needed for neoplasms? I thought some of these had been already created?

mellybelly commented 5 years ago

well we will need one ;-)

drseb commented 5 years ago

step 1: check with MP and other MODs what they think about usage of DOIDs. Hopefully we will all move to "XYZ_import"

step 2: create "XYZ_import.owl" & use it

step 3: be happy

mellybelly commented 5 years ago

Ok, but my STRONG recommendation is that phenotype ontologies remain independent of diseases: because they are used in disease-phenotype annotation, referencing diseases inside them is counter-intuitive.

NCIT is the industry standard for neoplasms and should be used for atomic cancer entities.

drseb commented 5 years ago

I totally agree! But reasoning over MODs will not work, if we don't agree on shared imports

drseb commented 5 years ago

I could also live with removing those logical defs, because - with my HPO hat on - it is irrelevant for us. But - with uPheno hat on - it will lead to missed inferences for cross species reasoning

cmungall commented 5 years ago

Two things to bear in mind

- the semantics of the MP class may be different from the HP class, even if they are labeled similarly. I feel we are still not all agreed on what an HPO neoplasm term means ontologically
- we do have equivalence axioms NCIT-MONDO-DOID that could be imported, and a reasoner will use these. But this is also tied up with the question of ontological commitment
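
To illustrate the second point, here is a minimal sketch of how chained equivalences (NCIT to MONDO, MONDO to DOID) let a tool conclude that an NCIT term and a DOID term denote the same class. It is a toy union-find and not a real OWL reasoner; the function names are invented for this example and the identifiers are placeholders, not real NCIT-MONDO-DOID axioms.

```python
def equivalence_classes(pairs):
    """Group terms into sets of mutually equivalent terms (union-find)."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return list(groups.values())


def equivalents_with_prefix(term, pairs, prefix):
    """Return all terms equivalent to `term` whose ID starts with `prefix`."""
    for group in equivalence_classes(pairs):
        if term in group:
            return sorted(t for t in group if t.startswith(prefix))
    return []


# Placeholder identifiers (assumption): not taken from any real mapping file.
TOY_EQUIVALENCES = [
    ("NCIT:example_1", "MONDO:example_1"),
    ("MONDO:example_1", "DOID:example_1"),
    ("NCIT:example_2", "MONDO:example_2"),
]

print(equivalents_with_prefix("DOID:example_1", TOY_EQUIVALENCES, "NCIT:"))
```

Note that this only follows the equivalences; it says nothing about whether the equated classes share the same ontological commitment, which is the open question raised here.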

cc @matentzn @dosumis @balhoff

See this ticket: https://github.com/monarch-initiative/mondo/issues/461

drseb commented 5 years ago

Options I see then:

- remove logical defs (add them back later... sure) (pros: very quickly implemented, cons: lose information)
- create NCIT-import (who? how?) and change the logical defs to use NCIT (pros: maybe quickly implemented, cons: complexity of reasoning increases, divergence between species phenotype ontos)
- implement community accepted revision (pros: well defined alignment between ontologies, cons: politics, will take very long until implemented)

Other options?

Opinions?

matentzn commented 5 years ago

I advocate this:

- leave definitions for now, but present the case for replacing DOID with NCIt terms in the next Phenotype call
- remove the DOID import immediately (it has no effect on HP, and uPheno itself already has the required imports for cross-species stuff). This relieves HP of the DO import issue, which blocks the progress to the new QC pipeline and ODK.
- HP needs an NCIt import in any case (given the many NCIt terms used); the new ODK version will take care of that. You are right that reasoning complexity increases in theory; but since we all use ELK, we don't care. It's just a bit incomplete (my old Manchester boss would be very disappointed in me now :P)
- create a very basic neoplasm pattern. I don't think it will take THAT long to be implemented and negotiated across the community. I would like to give it a shot.

drseb commented 5 years ago

That sounds good to me. Anybody in disagreement?

pnrobinson commented 5 years ago

+1

Also can we agree that the HPO cancer terms refer to the tumor (and not the entire disease) -- please create the patterns accordingly!

drseb commented 5 years ago

@cmungall agree?

anybody else?

If nobody shouts: @matentzn : go for it

cmungall commented 5 years ago

Note that https://github.com/obophenotype/upheno/blob/master/imports/mpath_import.owl already includes NCIT... a confusing situation sorry

drseb commented 5 years ago

all of NCIT?

@matentzn @cmungall should we try to separate these imports? I.e. can a general rule be, that no import contains other imports? Or would this be impossible or unrealistic?

matentzn commented 5 years ago

Yes, you are totally right, and it is not only realistic, it is necessary IMHO. We are in the process of making that the general rule across OBO ontologies. I expect this to be done by the summer for all the phenotype-ontology-related dependencies.

We are trying hard to push base-releases now across the community, so we can import ontologies without their dependencies. I will start playing with direct NCIt imports in our HP migration pipeline, and see what comes up.
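
As a rough illustration of the idea behind base releases, the sketch below keeps only the axioms whose subject belongs to the ontology's own namespace, so that importing it does not drag in its dependencies. This is a simplified assumption about what "base" means (real tooling is more involved); the function name, the triple representation and the toy identifiers are all invented for the example.

```python
def base_axioms(axioms, own_prefix):
    """Keep only axioms whose subject is in the ontology's own namespace.

    `axioms` is a list of (subject, predicate, object) tuples; this tuple
    form is a simplification chosen for the sketch, not an OWL API.
    """
    return [axiom for axiom in axioms if axiom[0].startswith(own_prefix)]


# Toy merged ontology (assumption): placeholder IDs, not real terms.
TOY_MERGED = [
    ("HP:toy_1", "subClassOf", "HP:toy_2"),
    ("HP:toy_1", "xref", "NCIT:toy_9"),
    ("NCIT:toy_9", "subClassOf", "NCIT:toy_8"),  # pulled in via an import
    ("DOID:toy_5", "subClassOf", "DOID:toy_4"),  # pulled in via an import
]

for axiom in base_axioms(TOY_MERGED, "HP:"):
    print(axiom)
```

With such a filter, a reference from an HP term to an NCIT term is kept, while the imported NCIT and DOID hierarchy is left for the importing ontology to bring in directly.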

dosumis commented 5 years ago

> Also can we agree that the HPO cancer terms refer to the tumor (and not the entire disease) -- please create the patterns accordingly!

If we do this, then these HPO terms are not phenotypes, they are physical entities. That may be OK, but we need to be clear about this.
I'm pretty sure the NCIT tumour terms are frequently used as if they refer to physical entities. The definition and naming is consistent with this - although classification does not make this clear.

As Chris says, we need a more global discussion of ontological commitment and linking patterns between phenotype and disease.

pnrobinson commented 5 years ago

> HPO terms are not phenotypes, they are physical entities. That may be OK, but we need to be clear about this.

There is no reason why a physical entity cannot be a phenotype, and about half of the HPO refers to some physical entity or another! I agree that currently the NCIT terms do not necessarily commit to either phenotype or disease, which makes them difficult to use in the HPO context.

drseb commented 5 years ago

I think what is meant by "refer to the tumor" is that we want to say that the HPO terms are defined as an "abnormally increased number" of "the tumor" and that would still be a phenotype. We do not associate all the other phenotypes that are implicitly associated in some context with the finding of the tumor (downstream effects etc.)

dosumis commented 5 years ago

In the QE model of phenotypes, the formal def refers to a physical or process entity. If by 'HPO cancer terms refer to the tumor' we mean that the entity each formal definitions refers to is a tumour, and the entity in the QE def is an NCIT term, doesn't that mean the NCIT term is being used to refer to a tumour and not a disease?

pnrobinson commented 5 years ago

I agree. At this point, I would suggest xref'ing the NCIT terms. Until we have clarified the intended meanings at NCIT it does not make sense to make logical definitions.

drseb commented 5 years ago

Why exactly is not everything under "Neoplasm by Morphology" assumed about the tumor? Can't we just import this subset?

cmungall commented 5 years ago

> Why exactly is not everything under "Neoplasm by Morphology" assumed about the tumor? Can't we just import this subset?

Not totally sure what you mean and if I'm answering your question, but "neoplasm by morphology" is more of a metaclass; it is just a way of organizing some top level terms.

cmungall commented 5 years ago

Ignoring NCIT for a moment, I just want to try and make sure we're on the same page about HPO

Seb:

> I think what is meant by "refer to the tumor" is that we want to say that the HPO terms are defined as an "abnormally increased number" of "the tumor" and that would still be a phenotype.

I think this is reasonable, but my interpretation of what Peter is saying is that the HPO class denotes the physical tumor itself, rather than the property of incidence of the tumor. If this is the case, then this will require some changes in the OWL patterns and our overall approach.

pnrobinson commented 5 years ago

Actually, I was not talking about the OWL patterns and would defer to the various OWL gurus on our team about that. I was talking about the way the terms are to be interpreted from an analytical/medical point of view. However, given the discussions above, perhaps it is necessary to formally write down what we think a phenotype is or is not, as there does seem to be conceptual divergence within our group (e.g., whether a physical entity can be a phenotype).

drseb commented 5 years ago

To be honest, I don't understand the discussion completely. I think we agree, that

the physical entity is the tumor
the finding of said physical entity is the phenotype we represent in HPO (i.e. something like 'increased occurrence' inheres_in 'the tumor')

I may miss some important points here!? Please help

cmungall commented 5 years ago

Seb, I don't think you're misunderstanding. I think what you have stated is valid. I want to make sure we are all agreed. Note a corollary: here, the phenotype is not the physical entity.

pnrobinson commented 5 years ago

Chris, are you saying that the phenotype is not a physical entity because that is the convention we chose for OWL or are you saying that you truly do not think a phenotype can be a physical entity? Please explain!

cmungall commented 5 years ago

The ontology should reflect our model of the world. If we think A isa B, and the ontology says A disjointfrom B, then the ontology is broken and we should fix it!

pnrobinson commented 5 years ago

That is a relatively extreme view ... but if I understand correctly, you are basically saying that our model is inaccurate. My model of the world is definitely that phenotypes can be physical entities, because that matches with what my senses are telling me. Is your model of the world, forgetting OWL for a second if possible, different?

cmungall commented 5 years ago

I'm not sure it's an extreme view...

> My model of the world is definitely that phenotypes can be physical entities, because that matches with what my senses are telling me. Is your model of the world, forgetting OWL for a second if possible, different?

In my simplistic view of the world there are physical things, processes, and properties of both of those things. I accept that language is fuzzy and never has precise mappings to models, but I have always modeled "phenotype (sensu OBO)" exclusively as the properties and not to the physical things (or the processes). Thus redness of eye is a "phenotype", but not an eye that is red; or presence or frequency of a tumor, but not the tumor itself.

(Of course, in everyday conversation that level of distinction is annoying and pedantic, and it's fine to talk of the eye or the tumor as being the phenotype, we just have a commonly understood mapping from language to the model)

I'm happy to entertain other ways of modeling things, but I think if the computable model diverges from the expert mental model (and presumably, the real world) it's setting us up for difficulties and confusion.
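
The distinction drawn in this comment can be sketched as a tiny data model: physical things and qualities are separate kinds, and a phenotype is a quality borne by a thing, never the thing itself. The class names below are invented for illustration and are not taken from any ontology or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalEntity:
    """A physical thing, e.g. an eye or a tumor."""
    label: str


@dataclass(frozen=True)
class Quality:
    """A property, e.g. redness or presence."""
    label: str


@dataclass(frozen=True)
class Phenotype:
    """A quality inhering in a bearer; deliberately not a PhysicalEntity."""
    quality: Quality
    bearer: PhysicalEntity

    def describe(self):
        return f"{self.quality.label} of {self.bearer.label}"


red_eye = Phenotype(Quality("redness"), PhysicalEntity("eye"))
tumor_presence = Phenotype(Quality("presence"), PhysicalEntity("tumor"))

print(red_eye.describe())
print(isinstance(tumor_presence, PhysicalEntity))
```

Under the alternative view argued elsewhere in this thread, `Phenotype` would be allowed to be a `PhysicalEntity`; the sketch only encodes the reading given in this comment.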

pnrobinson commented 5 years ago

Well, outside of the needs of our particular model, I do not see any pressing philosophical reason to consider the presence of a thing to be the phenotype rather than the thing itself. I would say that our logical definitions diverge relatively often from what I would consider to be the essence of the phenotype we are trying to describe -- but I was willing to chalk it up to no model being good, but some being useful. Please understand -- I do not think we need to change course at all and am trying to figure out where you are coming from here....

matentzn commented 5 years ago

:) Okay, I would like to suggest the following.

1) There is no need to worry, neither for the semantics people nor for the medical people. I agree 100% that we are building models, and models are != reality and reflect our mental model only to a degree. There is no doubt that OWL definitions are simplifications of reality to achieve a number of desirable purposes: cross-species reasoning, post-composed querying, quality control and classification. What we are doing is dancing together to find the sweet spot between medical intuition and technical utility.

2) It is the semantics folks' job to make sure that the medical people actually say what they intended to say (including reviewing logical implications), but re-reading this thread, I feel that we are haggling too much on the conceptual side of things. It is a fact of life that NCIt terms, coming from a terminology AND an ontology, are used to refer to physical entities and diseases. The correct way to deal with this ambiguity would be to create a new ontology that shadows the NCIt 1:1 and xrefs over, and then just declare the ontological commitment we want to make in the shadow ontology. But for now, that is overkill, and a maintenance nightmare. I suggest very simply to focus on harmonising the use of reference ontologies everywhere (i.e. everything about cancer comes from NCIt), and then selectively remove logical axioms from those reference ontologies that cause unsatisfiable classes (aka some innocent hacks).

3) The discussion of whether a phenotype is a thing, the occurrence/presence of a thing, the observation of a thing, and so on is interesting, but will freeze us in place. Not even medics will agree on what they think when they read HP:001 'Abnormality of the gut'. I don't think we need that level of distinction right now, and for many phenotypes, we never will. So, from 10 years of literature I thought we had agreed on a technical definition (not medical!) that is good enough not to upset medics and just clear enough not to upset semantics: a phenotype is defined as something (note that currently, using the has-part buffer, we are not saying what it IS):

- that is somehow associated with an organism (obviously we are not talking about the weather)
- that has an (abnormal) quality
  - which is associated with a material (anatomical, pathological, molecular, chemical...) entity or a biological process

.. and defer any question of whether a phenotype can be interpreted as a thing, an observation, a characteristic or an occurrence to a later stage. That way, to come back to this thread, we can for now focus on harmonising the use of vocabulary and at a later stage decide whether we don't like the logical implications and switch to something else, or hack. Note that I am not saying we can indefinitely defer the question of what the nature of the referred entity (anatomy, process etc) is: that is obviously very important. I just think we are en route to something really cool and unprecedented, which we should not compromise with discussions about who should interpret HP:001 how.

To come back to the thread: can we assume for now that

- tumors are physical entities (which are analogous to abnormal tissue, but obviously pathological), about which we can observe characteristics such as size, weight, and so on,
- there are some cancerous processes (metastasis, is it called?) that are disjoint from tumors and which can have increased or decreased rates of growth, degree of spread, etc,
- and for every phenotype term in HPO or otherwise, we just, as usual, decide which of the two fits better?

So for the above DO references, we replace them with simple references to NCIt neoplasms wherever an HP class uses one (e.g. HP_0011459 (Esophageal carcinoma) -> NCIT_C3513).

And then focus the debate of the whole community on the question whether:

has part some (
    increased amount and 
    inheres in some NEOPLASM and 
    has modifier some abnormal)

is really the best way to handle the neoplasm phenotype (not sure whether increased size might be better).
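
A minimal sketch of how such a pattern could be filled in per term, assuming a plain string template. The function name and its parameters are invented for this example; this is not an actual DOSDP file or the output of any pattern tooling, only a way to see the one slot that varies (the neoplasm) and the one choice under debate (the quality).

```python
def neoplasm_phenotype_expression(neoplasm, quality="increased amount"):
    """Fill the proposed pattern with a neoplasm class and a quality.

    Both arguments are plain strings (class labels or IDs); no validation
    against an ontology is attempted in this sketch.
    """
    return (
        "has part some (\n"
        f"    {quality} and\n"
        f"    inheres in some {neoplasm} and\n"
        "    has modifier some abnormal)"
    )


# Pairing taken from the thread: HP_0011459 (Esophageal carcinoma) -> NCIT_C3513
print(neoplasm_phenotype_expression("NCIT_C3513"))

# The alternative quality mentioned above, for comparison.
print(neoplasm_phenotype_expression("NCIT_C3513", quality="increased size"))
```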

If you still feel the conversation of what is a phenotype is important for some reason (precisely: if we assume the above conceptualisation of a phenotype, why does it matter what a medic thinks when they see HP:001?), maybe it is better to make this a structured debate in a telecon.

pnrobinson commented 5 years ago

Actually, for now I would propose making xrefs to the NCIT rather than definitions, until we can settle the issue of whether their terms are intended to be diseases, tumors, or both.

matentzn commented 5 years ago

Ok it seems we should get on the same page first :) I will try and collect some positions to answer the NCIt question from the community over the next weeks, including someone from NCIt. After that, I will schedule a telecon, where we can discuss pros and cons one way or another. We are not in a hurry regarding this issue. Thanks for all the input!

pnrobinson commented 5 years ago

The reason I think our current logical definitions are wrong is the following. If we think that NCIT refers to the disease and HPO refers to the phenotype, then

'has part' some 
    ('increased amount'
     and ('inheres in' some 
        (carcinoma
         and ('part of' some esophagus)))
     and ('has modifier' some abnormal))

This means we have an increased amount of a cancer-disease in the esophagus, which does not make sense. For instance, carcinoma might include weight loss. I think that these definitions might make sense with MPATH terms for carcinoma and the major categories. But not with NCIT. I think the NCITs should be listed as xrefs.

drseb commented 5 years ago

Ok. Then this is decided. We put NCIT in xrefs!! Hooray!

Re logical defs: there are some new developments. We need a workshop/further discussions with MODs!

pnrobinson commented 5 years ago

I have removed all the DO references and added appropriate NCIT cross-refs.
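
For readers unfamiliar with cross-references, the sketch below shows what an NCIT xref on an HP term could look like as an OBO-format stanza. The helper name is invented, and the HP/NCIT pairing is the one proposed earlier in this thread; the xrefs actually added are not listed in this comment.

```python
def obo_term_stanza(term_id, name, xrefs):
    """Build a minimal OBO-format [Term] stanza with xref lines."""
    lines = ["[Term]", f"id: {term_id}", f"name: {name}"]
    lines.extend(f"xref: {xref}" for xref in xrefs)
    return "\n".join(lines)


# Pairing from the thread (HP_0011459 -> NCIT_C3513), written in CURIE form.
print(obo_term_stanza("HP:0011459", "Esophageal carcinoma", ["NCIT:C3513"]))
```

Unlike a logical definition, an xref carries no formal semantics, so it does not commit the HP term to the NCIT term being a disease or a tumor.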

drseb commented 5 years ago

Hooray. Thanks.

ping @matentzn

obophenotype / human-phenotype-ontology

Replace the DOID references in EQ definitions #4281