obophenotype / cell-ontology

An ontology of cell types
https://obophenotype.github.io/cell-ontology/
Creative Commons Attribution 4.0 International

Obsolete all "X cell, human" grouping terms #1102

Open cmungall opened 3 years ago

cmungall commented 3 years ago

Explanation: this leads to the Ragged lattice issue

Example:

[image]

This is massively confusing to users. They open this node thinking they will find all mouse lymphocytes, but there is only a handful.

nicolevasilevsky commented 3 years ago

Do you want to obsolete only the human cell types, or all of the 'x cell, x species' classes?

I have no objections to this. We should wait a couple of weeks for feedback before proceeding, right?

addiehl commented 3 years ago

I'm very confused, as 'cell, human', 'cell, mouse', 'lymphocyte, mouse' do not appear in cl-edit.owl or via Ontobee or OLS. We decided not to include these per agreement when we integrated xcl.owl into cl-edit.owl.

addiehl commented 3 years ago

Other cell types with the tags ', mouse' and ', human' match agreement with various discussions as to having a tag on the label of species specific cell type classes.

dosumis commented 3 years ago

We decided not to include these per agreement when we integrated xcl.owl into cl-edit.owl.

That's my recollection too. Where are these showing up?

addiehl commented 3 years ago

I can't find them in cl-edit.owl or cl-full.owl using BBEdit either.

cmungall commented 3 years ago

Here is "dendritic cell, human":

https://www.ebi.ac.uk/ols/ontologies/cl/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCL_0001056

Other cell types with the tags ', mouse' and ', human' match agreement with various discussions as to having a tag on the label of species specific cell type classes.

This isn't about the naming convention (which should be encoded in a dosdp yaml file); it is about whether CL should include the "X cell, species" pattern at all. Apologies if this was previously discussed and you decided to include it (I can never make the Wednesday CL calls), but I would like to revisit this.

addiehl commented 3 years ago

Why is this wrong? HLA-DRA is a unique human version of an MHC Class II alpha chain. PRO is of course screwed up in its naming for MHC molecules since they are not generic but rather based on human genes for this very polymorphic group of genes, and actually, we could use 'HLA class II histocompatibility antigen, DR alpha chain (human)' (PR:P01903) instead to emphasize this is really a human cell type. I would note the parent "dendritic cell" has part 'MHC class II protein complex' (GO:0042613) in part to avoid reference to the confusingly named PRO terms for MHC complexes and chains.

The choice of adding ", human", ", mouse" was discussed on multiple occasions as a way to serve end users who might only be looking at the label, as well as the choice to use human or mouse instead of the Latin names. (We discussed this again on the CL call today).

cmungall commented 3 years ago

Again, I cannot make the Wednesday calls, so I cannot take part in these discussions.

You are going to confuse CL users massively. They will look under "human dendritic cell" expecting to find a full classification of dendritic cells in human.

If you really need a marker-based grouping, then name it as such, but I am not convinced of the use case for this grouping class.

addiehl commented 3 years ago

I think with our intended revision of the dendritic cell hierarchy (working with Anna Maria and others), this term will serve as the parent for all the human DC cell types in CL.

dosumis commented 3 years ago

My attempt to summarise the issue:

dosumis commented 3 years ago

A possible compromise would be to have these classes outside of the main CL release - but that is likely to come with a different set of confusions.

cmungall commented 3 years ago

Re the compromise: I don't think they can have CL IDs if they are outside the main release, as they won't resolve.

I agree with your approach.

Note that we can easily have a robot materialize step as part of the creation of the view owl files. This would effectively turn the GCI into a standard axiom.

E.g.

(A and part-of some T) SubClassOf R some B
'anatomical entity' subClassOf part-of some T ## injected contextual axiom
A subClassOf* 'anatomical entity'
==>
A SubClassOf R some B
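
The inference above can be sketched in a few lines of Python. This is only an illustration of the rule, not how ROBOT implements it: the `GCI` class, the `materialize` function, and the toy class names (`A1`, `C`) are hypothetical, and the context match is a plain string comparison rather than real OWL reasoning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GCI:
    """(base and part-of some context) SubClassOf relation some filler."""
    base: str
    context: str
    relation: str
    filler: str


def ancestors(cls, parents):
    """Return cls plus all of its superclasses (reflexive, transitive)."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def materialize(gcis, parents, injected_root, injected_context):
    """Turn GCIs into plain axioms for a view in which every subclass of
    injected_root is asserted to be part-of some injected_context."""
    derived = set()
    classes = set(parents) | {p for ps in parents.values() for p in ps}
    for cls in classes:
        supers = ancestors(cls, parents)
        if injected_root not in supers:
            continue  # the injected contextual axiom does not cover this class
        for gci in gcis:
            if gci.context == injected_context and gci.base in supers:
                derived.add((cls, gci.relation, gci.filler))
    return derived


# Toy hierarchy (hypothetical): child -> direct superclasses.
parents = {"A": ["anatomical entity"], "A1": ["A"], "C": ["anatomical entity"]}
gcis = [GCI(base="A", context="T", relation="R", filler="B")]

view_axioms = materialize(gcis, parents, "anatomical entity", "T")
print(sorted(view_axioms))
```

In the view for context T, `A` and its subclass `A1` pick up the plain axiom `R some B`, while `C` does not; in a view built for any other context, nothing is derived.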

addiehl commented 3 years ago

If Chris objects to the example of 'dendritic cell, human' in the CL taxon paper, it should be noted that I put it there as an example of what was already present in CL at the time of writing. I still think it is a valid term, given the tie to a human-specific marker.

However, I could switch the example used in the paper to 'activated CD4-positive, alpha-beta T cell, human' (CL:0001043), which again uses HLA-DRA as a human-specific marker and is certainly a more granular cell type. Also, mouse T cells do not express any form of MHC class II (unlike mouse dendritic cells), so this is certainly more species-specific in that regard as well. This would perhaps be less controversial.

dosumis commented 3 years ago

Hi Alex,

Totally agree we should represent the biology; the only question is how best to do that in the ontology. Can you share your thoughts on the objection I voiced above to 'dendritic cell, human':

Grouping classes for general cell types by species that bury species-specific markers in the definition are more controversial, as they are likely to cause confusion and will inevitably lead to incomplete grouping of annotations (arguably our core use case). In the case of 'dendritic cell, human' there are currently only 2 subclasses, whereas 'dendritic cell' has 74 subclasses, most or all of which are applicable to human. Therefore 'dendritic cell, human' will fail to group the vast majority of human dendritic cell annotations.
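
To make the grouping problem concrete, here is a minimal Python sketch. The hierarchy, the sample IDs, the annotations, and the function names are all hypothetical toy data invented for illustration; they are not taken from CL.

```python
def descendants(root, children):
    """Return root plus every class reachable through subclass links."""
    seen, stack = set(), [root]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


# Hypothetical toy hierarchy: parent -> direct subclasses.
children = {
    "dendritic cell": [
        "dendritic cell, human",
        "plasmacytoid dendritic cell",
        "conventional dendritic cell",
    ],
    "dendritic cell, human": ["example DC subtype, human"],
}

# Hypothetical annotations: (sample id, cell type, taxon).
annotations = [
    ("s1", "plasmacytoid dendritic cell", "human"),
    ("s2", "conventional dendritic cell", "human"),
    ("s3", "example DC subtype, human", "human"),
    ("s4", "conventional dendritic cell", "mouse"),
]


def group_by_class(root):
    """Samples annotated to root or any of its subclasses."""
    members = descendants(root, children)
    return [sample for sample, cell, _ in annotations if cell in members]


def group_by_class_and_taxon(root, taxon):
    """Samples under the general class, filtered by the annotation's taxon."""
    members = descendants(root, children)
    return [s for s, cell, tax in annotations if cell in members and tax == taxon]


print(group_by_class("dendritic cell, human"))              # ['s3']
print(group_by_class_and_taxon("dendritic cell", "human"))  # ['s1', 's2', 's3']
```

Querying the species-labelled grouping class finds only the one sample annotated beneath it, whereas querying the general class and filtering on taxon recovers all three human samples.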

Can you share any objections to the proposed solution?

One possible answer is to use GCIs: ('dendritic cell' and 'in taxon' some 'Homo sapiens') SubClassOf expresses some 'MHC class II histocompatibility antigen alpha chain DRA'. This records exactly what Alex needs (I think; we could discuss). If we follow this path, it will be essential to have tooling in place to ensure that downstream users can leverage this. I think this shouldn't be too hard to achieve, but it would need some discussion around how CL is used by HIPC and others as a reference for markers.

addiehl commented 3 years ago

My concern with using GCIs is indeed whether they will allow downstream tools and less ontology-aware users to identify human DCs correctly.

Also, a number of granular DC classes have comments like "These markers are associated with mouse cells." (for instance CL:0001002 & CL:0001020); others have comments like "Markers are found in human cells." (CL:0002394) or "Normally represent 65-75% of peripheral blood mDCs (human)." (CL:0002532). Many other DC terms have marker combinations that need to be reviewed to see if they are species-specific.

I think a general revision is required for DC terms. As part of that process, we can perhaps decide whether a GCI approach makes more sense than simply relying on explicit presentation of necessary and sufficient markers and using 'in taxon' axioms and 'present in taxon' annotations to enable a species-specific view of human and mouse DCs, as well as of more general DC types.

For the purposes of the paper, I would perhaps add some discussion of these ideas without committing to anything more than what is already written (and which various collections of people discussed a number of times and agreed to).

github-actions[bot] commented 2 years ago

This issue has not seen any activity in the past 6 months; it will be closed automatically in one year from now if no action is taken.

paolaroncaglia commented 2 years ago

As far as I'm aware, the general revision of DC terms that @addiehl advised above hasn't been carried out yet, so this ticket shouldn't be closed. Also, it's linked to an obophenotype project board.