Open dosumis opened 3 years ago
Agreed, but I have a simpler proposal:
These CP terms are used in axioms that are either rococo axioms that serve no purpose, or overstated logical definitions that do not match the text definition:
These clearly violate S11 in the SRS guidelines https://douroucouli.wordpress.com/2019/07/08/ontotip-write-simple-concise-clear-operational-textual-definitions/
We should simply turn these equivalence axioms into subclass axioms, and toss out any that are useless - e.g. cell phenotypes. These can be turned into textual comments.
Note: when we do address this problem, be aware that we have pseudo-CPs with UUIDs, see linked ticket
The granulocyte hierarchy is problematic in a number of ways and needs some attention. Nuclear shapes are important markers of the differentiation status of the various granulocytes. Combined with staining, they are part of the histological definition of the general types (neutrophil, eosinophil, basophil) and of their differentiation stages (at least for human granulocytes, with some similarities in mouse as well, PMID:25926395, and even zebrafish, PMID:23463724), which is why the shapes are part of the definitions.
Fixing the granulocyte hierarchy will take a chunk of curation time. Certainly, better definitions and better handling of markers and capabilities are important, and human subclasses are needed that include reference to the defining markers used in scRNA-seq and modern flow cytometry/CyTOF.
Hihi, opening this up again as CP terms are coming back to haunt me lol. Is there any movement on how to handle this? Is the plan still to remove all ex-CP terms and retrofit everything that uses them with CC+PATO, or have we decided to let CP terms sneakily stay in CL lol. Thanks :)
When I look at CL-edit.owl in Protege, it looks like all the CP: terms are obsoleted, so I am confused by your comment.
We changed them to the CL namespace. In a minority of cases so far we have switched to using the nested pattern with PATO. We could work to complete this; however, I now think this is not ideal, as these axioms are invisible to knowledge graphs & to UberGraph. We are working on a KG linked to annotation of a very large and growing corpus of single cell transcriptomes (>10^8) annotated with CL. There are interesting possibilities for using this to look for transcriptomic correlates of cell properties (e.g. presence of a lobed nucleus - one of the originally recognised markers of basophils). We can't do this if these properties are invisible to the KGs.
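To make the visibility point concrete, here is a toy sketch (not how UberGraph or any real pipeline works). It assumes a naive edge extractor that only emits an edge when an existential restriction points directly at a named class. The class labels, the relation names `has_part` and `has_characteristic`, and the tuple encoding of class expressions are all placeholders chosen for illustration.

```python
# Toy encoding of class expressions (an assumption for illustration only):
#   "label"                       -> a named class
#   ("some", relation, filler)    -> existential restriction
#   ("and", expr1, expr2, ...)    -> intersection

def direct_edges(subject, expr):
    """Return (subject, relation, object) edges for restrictions whose
    filler is a named class. Anonymous (nested) fillers produce no edge."""
    if isinstance(expr, tuple) and expr[0] == "some":
        _, relation, filler = expr
        if isinstance(filler, str):
            return [(subject, relation, filler)]
        return []  # nested filler: nothing to point the edge at
    if isinstance(expr, tuple) and expr[0] == "and":
        edges = []
        for operand in expr[1:]:
            edges.extend(direct_edges(subject, operand))
        return edges
    return []

# Nested pattern with a quality term: the filler is an anonymous expression.
nested = ("some", "has_part",
          ("and", "nucleus", ("some", "has_characteristic", "lobed")))

# Named-class pattern: the filler is a single named term.
named = ("some", "has_part", "lobed nucleus")

nested_edges = direct_edges("basophil", nested)
named_edges = direct_edges("basophil", named)
```

In this toy model the nested form yields no edge from the cell type to anything mentioning "lobed", while the named-class form yields one edge that a graph query can follow. A reasoner-backed pipeline may recover more than this naive extractor does, so treat it only as an illustration of the concern.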
@shawntanzk is the problem that your pipelines assume that everything with a CL ID must be a cell? Can't you use the graph to work that out?
Yeah, basically this. It's super easily fixed on our end, where we can just take children of cell. I can't reveal too much yet (not because it's anything REALLY secretive, but we don't want to figure out with legal people what we can and cannot say atm, especially since we will release all this stuff in a preprint soon), but basically we are deciding how to handle this in our in-house models, which assume everything is a cell or something. We can just do what CL does and append these to the cellular component branch in GO, but it's also not a simple cellular component, because it involves phenotype too (though I get that there's no actual issue with just doing that). ANYWAY, I mainly wanted to know how CL was dealing with it so I can plan accordingly and "future proof" my stuff :D
I think it's always safer to rely on semantics over namespace - or at the very least to combine them. Isn't that a major reason for most of what we do?
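As a minimal sketch of the difference between the two checks, assuming a toy subclass graph: the identifiers below are made-up placeholders (not real CL or GO IDs), and `SUBCLASS_OF`, `is_cell_by_namespace` and `is_cell_by_graph` are hypothetical names, not part of any existing pipeline.

```python
# Hypothetical toy hierarchy: child -> list of direct superclasses.
# All identifiers are placeholders, not real ontology IDs.
SUBCLASS_OF = {
    "CL:neutrophil": ["CL:granulocyte"],
    "CL:granulocyte": ["CL:cell"],
    "CL:lobed_nucleus": ["GO:nucleus"],       # ex-CP term living in the CL namespace
    "GO:nucleus": ["GO:cellular_component"],
}

def ancestors(term, parents):
    """Collect all transitive superclasses of `term` (excluding itself)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def is_cell_by_namespace(term):
    """Namespace check: anything with a CL prefix is assumed to be a cell."""
    return term.startswith("CL:")

def is_cell_by_graph(term, root="CL:cell", parents=SUBCLASS_OF):
    """Semantic check: the term is the root cell class or a descendant of it."""
    return term == root or root in ancestors(term, parents)

for term in ("CL:neutrophil", "CL:lobed_nucleus"):
    print(term, is_cell_by_namespace(term), is_cell_by_graph(term))
```

Here the namespace check accepts the ex-CP term as a cell because of its prefix, while the graph check rejects it because it does not descend from the cell root. Combining both, as suggested above, would mean requiring the prefix and the ancestry together.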
See https://github.com/obophenotype/cell-ontology/issues/572#issuecomment-762830332
This is dependent on all relevant terms being added to PATO.