obophenotype / cell-ontology

An ontology of cell types
https://obophenotype.github.io/cell-ontology/
Creative Commons Attribution 4.0 International

unify naming conventions and taxon axioms for species-specific cell types #2069

Open cmungall opened 12 months ago

cmungall commented 12 months ago

CL is a frankenstein ontology when it comes to taxon specificity:

  1. historic "sensu" terms, scoped to genera and above; e.g. CL:0000662 ! neuroglioblast (sensu Nematoda)
  2. "sensu" terms with 3-letter codes and `expresses` axioms; e.g. CL:0000754 ! type 2 cone bipolar cell (sensu Mus)
  3. 4-letter codes with `expresses` axioms; e.g. CL:4023007 ! L2/3 bipolar vip GABAergic cortical interneuron (Mmus)
  4. common names with `has plasma membrane part` axioms; e.g. CL:0002426 ! CD11b-positive, CD27-positive natural killer cell, mouse

The axiomatization is also pretty uneven - here are sample edges:

| subject | subject_label | predicate | predicate_label | object | object_label |
|---|---|---|---|---|---|
| CL:0000754 | type 2 cone bipolar cell (sensu Mus) | RO:0002292 | expresses | PR:P50481 | LIM/homeobox protein Lhx3 (mouse) |
| CL:0000754 | type 2 cone bipolar cell (sensu Mus) | RO:0002292 | expresses | PR:Q8R4I7 | neuropilin and tolloid-like protein 1 (mouse) |
| CL:0000754 | type 2 cone bipolar cell (sensu Mus) | RO:0002292 | expresses | PR:Q9ER75 | iroquois-class homeodomain protein IRX-6 (mouse) |
| CL:0002426 | CD11b-positive, CD27-positive natural killer cell, mouse | RO:0002104 | has plasma membrane part | PR:000001012 | integrin alpha-M |
| CL:0002426 | CD11b-positive, CD27-positive natural killer cell, mouse | RO:0002104 | has plasma membrane part | PR:000001963 | CD27 molecule |
| CL:4023007 | L2/3 bipolar vip GABAergic cortical interneuron (Mmus) | RO:0002292 | expresses | PR:P32648 | VIP peptides (mouse) |
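
To make the unevenness concrete, here is a minimal sketch that groups the sample edges above by subject and reports which predicate each term uses and whether its protein objects carry a "(mouse)" label. The tuple layout and the `summarize` helper are illustrative assumptions, not part of any CL tooling.

```python
from collections import defaultdict

# Sample edges copied from this issue: (subject, predicate_label, object_label)
EDGES = [
    ("CL:0000754", "expresses", "LIM/homeobox protein Lhx3 (mouse)"),
    ("CL:0000754", "expresses", "neuropilin and tolloid-like protein 1 (mouse)"),
    ("CL:0000754", "expresses", "iroquois-class homeodomain protein IRX-6 (mouse)"),
    ("CL:0002426", "has plasma membrane part", "integrin alpha-M"),
    ("CL:0002426", "has plasma membrane part", "CD27 molecule"),
    ("CL:4023007", "expresses", "VIP peptides (mouse)"),
]


def summarize(edges):
    """Group edges by subject; collect predicates and whether objects are mouse-labelled."""
    summary = defaultdict(lambda: {"predicates": set(), "mouse_objects": set()})
    for subject, predicate, object_label in edges:
        summary[subject]["predicates"].add(predicate)
        # Assumption: a "(mouse)" suffix marks a species-specific PRO entry.
        summary[subject]["mouse_objects"].add(object_label.endswith("(mouse)"))
    return dict(summary)


if __name__ == "__main__":
    for subject, info in sorted(summarize(EDGES).items()):
        print(subject, sorted(info["predicates"]), sorted(info["mouse_objects"]))
```

On these six edges, the "mouse" natural killer cell is the only term that points at species-neutral proteins.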

Also note that these two have no taxon constraints, either direct or inferred:

  * CL:0000662 ! neuroglioblast (sensu Nematoda)
  * CL:0000754 ! type 2 cone bipolar cell (sensu Mus)

Recommendations:

lubianat commented 12 months ago

I agree with @cmungall's comments; I'd say it is also a good opportunity to standardize which taxa are in the scope of CL.

PRO draws a clear line between species-neutral and species-specific terms, and has no genus-level entries: a term is always either neutral or tied to an "organism". If we default to PRO conventions, we should avoid having "mouse" mean either the genus Mus or the species Mus musculus depending on the term.

Maybe CL should provide just species-neutral terms for metazoa and species-specific terms for Homo sapiens and Mus musculus, adopting PRO's "(human)" and "(mouse)" nomenclature. This already seems to be the general scope of CL; it is just not specified anywhere.

Terms at genus level and above could be created on a case-by-case basis, with the taxonomic name in parentheses and the word "sensu" dropped.

How hard would it be to have species-specific terms generated automatically by a pipeline that parses taxon constraints? This could be a way to enforce consistency.
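
As a rough illustration of that idea, the labelling step of such a pipeline might look like the sketch below. The `TAXON_SUFFIX` mapping and the `species_specific_label` function are hypothetical; only the "(human)" and "(mouse)" suffixes come from the PRO convention mentioned above, and parsing the taxon constraints out of the ontology is not shown.

```python
# Hypothetical mapping from NCBITaxon IDs to PRO-style label suffixes.
TAXON_SUFFIX = {
    "NCBITaxon:9606": "human",   # Homo sapiens
    "NCBITaxon:10090": "mouse",  # Mus musculus
}


def species_specific_label(neutral_label: str, taxon_id: str) -> str:
    """Derive a species-specific label from a species-neutral one.

    Raises ValueError for taxa with no agreed suffix, so out-of-scope
    taxa are flagged for manual handling instead of being named ad hoc.
    """
    try:
        suffix = TAXON_SUFFIX[taxon_id]
    except KeyError:
        raise ValueError(f"no naming convention for taxon {taxon_id}") from None
    return f"{neutral_label} ({suffix})"


if __name__ == "__main__":
    print(species_specific_label("type 2 cone bipolar cell", "NCBITaxon:10090"))
```

Failing loudly on unknown taxa is a deliberate choice here: terms at genus level and above would still be created case by case.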

cmungall commented 12 months ago

Like Uberon, the scope of CL is all metazoa (except for the most general classes, which can be reused across all of life). The focus is vertebrates, with particular emphasis on human and mouse. That emphasis is even stronger than in Uberon, because the mouse ssAOs don't have cell types (well, MA has two cell types and EMAPA has 10, which is odd).

Great idea to have an automated check for the naming conventions.
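
A minimal sketch of what such a check could look like, using regular expressions over term labels. The category names and the `classify_label` function are made up for illustration, and the patterns only cover the styles listed at the top of this issue. Because taxon rank cannot be read off a label, both "sensu" styles fall into one bucket.

```python
import re

# Ordered (category, pattern) pairs; the first match wins.
# Category names are illustrative, not an agreed CL vocabulary.
PATTERNS = [
    ("sensu", re.compile(r"\(sensu [A-Z][a-z]+\)$")),
    ("four_letter_code", re.compile(r"\([A-Z][a-z]{3}\)$")),
    ("pro_style", re.compile(r"\((?:human|mouse)\)$")),
    ("comma_common_name", re.compile(r", (?:human|mouse)$")),
]


def classify_label(label: str) -> str:
    """Return the naming style a label follows, or 'species_neutral' if none match."""
    for category, pattern in PATTERNS:
        if pattern.search(label):
            return category
    return "species_neutral"


if __name__ == "__main__":
    labels = [
        "neuroglioblast (sensu Nematoda)",
        "type 2 cone bipolar cell (sensu Mus)",
        "L2/3 bipolar vip GABAergic cortical interneuron (Mmus)",
        "CD11b-positive, CD27-positive natural killer cell, mouse",
    ]
    for label in labels:
        print(f"{classify_label(label):20} {label}")
```

A check like this could run in CI and fail on any category other than the ones the project decides to allow.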

lubianat commented 11 months ago

Related issue:

github-actions[bot] commented 5 months ago

This issue has not seen any activity in the past 6 months; it will be closed automatically in one year from now if no action is taken.

cmungall commented 5 months ago

let's not close this