obophenotype / cell-ontology

An ontology of cell types
https://obophenotype.github.io/cell-ontology/
Creative Commons Attribution 4.0 International

Question: Cell Type annotated on multiple sites #2783

Open MartaBenegas opened 6 days ago

MartaBenegas commented 6 days ago

Hi,

I was exploring the Cell Type Ontology through the OLS website, so I searched "B Cell" and obtained this entry: https://www.ebi.ac.uk/ols4/ontologies/cl/classes/http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCL_0000236

Where the cell type "B Cell" appears to be annotated multiple times in the tree: image

Shouldn't a term be annotated only once in an ontology? How is the ontology organized? Is this term assigned to multiple terms via different relationships?

Thanks, Marta.

balhoff commented 5 days ago

Hi @MartaBenegas, I'm not a core contributor to CL, but I can answer this. This is the same concept with the same identifier (http://purl.obolibrary.org/obo/CL_0000236) in all these places. It's just that there are multiple paths "upward" from that concept to the root 'cell' concept. The ontology is a directed graph, rather than a tree.
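
To make the "multiple paths upward" point concrete, here is a minimal sketch in Python. The `PARENTS` dictionary is a hand-written toy hierarchy for illustration only; it is an assumption, not the real set of CL axioms, and `paths_to_root` is a hypothetical helper name.

```python
# Toy "is a" graph: each term maps to its direct parents.
# Illustrative subset only; NOT the actual CL hierarchy.
PARENTS = {
    "cell": [],
    "motile cell": ["cell"],
    "hematopoietic cell": ["cell"],
    "leukocyte": ["motile cell", "hematopoietic cell"],
    "lymphocyte": ["leukocyte"],
    "B cell": ["lymphocyte"],
}


def paths_to_root(term, parents):
    """Return every path from `term` up to a term with no parents."""
    direct = parents.get(term, [])
    if not direct:
        return [[term]]
    return [
        [term] + path
        for parent in direct
        for path in paths_to_root(parent, parents)
    ]


paths = paths_to_root("B cell", PARENTS)
for path in paths:
    print(" > ".join(reversed(path)))
```

A tree browser has to draw one entry per path, so a single term with two routes to the root shows up twice even though it is stored once.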

aleixpuigb commented 5 days ago

Hi @MartaBenegas, I will add a graph to illustrate the example you provided. Terms can have multiple parent terms; for example, a leukocyte is a 'motile cell', a 'hematopoietic cell', a 'eukaryotic cell' and a 'nucleate cell'. Since OLS can only show one path at a time, all paths are displayed separately.

image

Hope this helps, and please let us know if it is not clear.

MartaBenegas commented 5 days ago

Hi @aleixpuigb, thank you for the graph! How have you obtained it?

And I'm still a bit confused about this classification.

From my point of view, the relations between the terms cell > single nucleate cell and leukocyte > lymphocyte are redundant: mononuclear cell is already a "Subclass of" leukocyte, so its children (e.g. the lymphocyte term) will be too. Is this assumption not followed in this ontology?

On the other hand, aren't a nucleate cell and a hematopoietic cell always going to be a eukaryotic cell? Shouldn't they be a "Subclass of" eukaryotic cell?

aleixpuigb commented 5 days ago

> Hi @aleixpuigb, thank you for the graph! How have you obtained it?

I used Protégé, an ontology editor. There are other options, such as Neo4j.

> From my point of view, the relations between the terms cell > single nucleate cell and leukocyte > lymphocyte are redundant: mononuclear cell is already a "Subclass of" leukocyte, so its children (e.g. the lymphocyte term) will be too. Is this assumption not followed in this ontology?

You are correct. It shows up because I used the cl-base.owl file, which contains all logical axioms (asserted and inferred). The graph therefore shows relationships that editors have not added but that a reasoner has inferred (e.g., any leukocyte that is mononucleate is a 'mononuclear cell'). This allows us to find relationships that we might otherwise have missed.
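
As a rough illustration of asserted versus inferred relationships, the sketch below mimics what a reasoner does with a defined class. It is a deliberately crude stand-in: real OWL reasoners work on logical axioms, not feature sets, and the names `DEFINITIONS`, `ASSERTED`, `infer_parents` and the term "example cell X" are all hypothetical.

```python
# A defined class lists the features that are sufficient for membership.
# Follows the example from the thread: any leukocyte that is mononucleate
# is a 'mononuclear cell'. Feature sets are an assumed simplification.
DEFINITIONS = {
    "mononuclear cell": {"leukocyte", "mononucleate"},
}

# What an editor might have asserted about each term (toy data).
ASSERTED = {
    "lymphocyte": {"leukocyte", "mononucleate"},
    "example cell X": {"leukocyte"},  # hypothetical term, no nucleus info
}


def infer_parents(asserted, definitions):
    """For each term, list the defined classes whose conditions it meets."""
    return {
        term: sorted(
            defined
            for defined, required in definitions.items()
            if required <= features  # every required feature is present
        )
        for term, features in asserted.items()
    }


inferred = infer_parents(ASSERTED, DEFINITIONS)
print(inferred)
```

Nobody asserted "lymphocyte is a mononuclear cell" here; the link appears only because the term satisfies the definition, which is the sense in which such relationships are "reasoned".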

image

> On the other hand, aren't a nucleate cell and a hematopoietic cell always going to be a eukaryotic cell? Shouldn't they be a "Subclass of" eukaryotic cell?

This is also true; those relations should be added to the ontology. I will open a ticket to find more eukaryotic cells that haven't been correctly classified. Thank you for pointing it out!

dosumis commented 5 days ago

I think it would be better to get rid of hard-to-support grouping classes, the ones that cover most of the ontology. These are an unfortunate legacy from an earlier era of CL editing.

dosumis commented 5 days ago

Note: the Protégé view is of the editors' file, before redundancy stripping. Here is a non-redundant view of the same hierarchy:

image
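
For readers unfamiliar with the idea, redundancy stripping can be sketched as dropping any direct parent that is already reachable through another direct parent. The Python below is a minimal sketch on a toy graph; it assumes an acyclic hierarchy, the `EDGES` data is illustrative rather than taken from CL, and `ancestors` and `strip_redundant` are hypothetical helper names, not part of any CL tooling.

```python
# Toy "is a" graph with one redundant edge: lymphocyte -> leukocyte is
# implied by lymphocyte -> mononuclear cell -> leukocyte.
EDGES = {
    "cell": [],
    "leukocyte": ["cell"],
    "mononuclear cell": ["leukocyte"],
    "lymphocyte": ["mononuclear cell", "leukocyte"],
}


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def strip_redundant(parents):
    """Keep a direct parent only if no sibling parent already leads to it."""
    reduced = {}
    for child, direct in parents.items():
        reduced[child] = [
            p
            for p in direct
            if not any(p in ancestors(q, parents) for q in direct if q != p)
        ]
    return reduced


reduced = strip_redundant(EDGES)
print(reduced["lymphocyte"])
```

The stripped graph implies exactly the same ancestors for every term; only the shortcut edges disappear, which is why the non-redundant view looks simpler.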

MartaBenegas commented 5 days ago

I thought the same about the term mononucleate cell there. It seems too general to be at that level of the graph.

I have some further questions. While exploring the root term cell, I found quite specific cell types directly linked to it (I have pointed out some examples): image

What are the criteria for connecting them directly to the root cell term rather than placing them in the corresponding lineage under the eukaryotic cell term? Some children of those terms are placed under eukaryotic cell as well (which may overcomplicate the ontology a bit): image