obophenotype / cell-ontology

An ontology of cell types
https://obophenotype.github.io/cell-ontology/
Creative Commons Attribution 4.0 International

immune slim revision #1941

Closed dosumis closed 1 year ago

dosumis commented 1 year ago

The first round of generating the blood_and_immune_upper_slim was done without the help of reporting tools. Now that those tools are in place, I think we can see some room for improvement, and some ways we could write a better SOP.

The main use case for an upper slim is to summarise data. Things that potentially get in the way of this use case:

  1. Classes in the slim with very small numbers of subclasses: having 0-1 subclasses => no capacity to summarise.
  2. Classes with disproportionately large numbers of subclasses.
  3. Overlapping classes - these clash with some types of summary - e.g. pie charts.
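
Criteria 1 and 2 can be checked mechanically from a "label,count" report. The following is a minimal sketch, not the reporting tool used for this ticket. The sample rows are copied from the report in this ticket; `MIN_COUNT` and `LARGE_FACTOR` are hypothetical thresholds chosen for illustration, and the count is assumed to include the term itself (so a count of 1 means no subclasses).

```python
from statistics import median

# Sample "label,count" rows copied from the report in this ticket.
REPORT = """\
platelet,1
lymphoblast,2
mast cell,3
monocyte,20
macrophage,55
B cell,94
professional antigen presenting cell,199
mononuclear cell,422
"""

MIN_COUNT = 3      # hypothetical: counts of 1-2 mean 0-1 subclasses
LARGE_FACTOR = 10  # hypothetical: "disproportionate" = over 10x the median


def parse_report(text):
    """Parse 'label,count' lines into a dict, splitting on the last comma."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        label, _, count = line.rpartition(",")
        counts[label] = int(count)
    return counts


def flag_terms(counts, min_count=MIN_COUNT, large_factor=LARGE_FACTOR):
    """Return (terms too small to summarise, disproportionately large terms)."""
    mid = median(counts.values())
    too_small = sorted(t for t, n in counts.items() if n < min_count)
    too_large = sorted(t for t, n in counts.items() if n > large_factor * mid)
    return too_small, too_large


counts = parse_report(REPORT)
too_small, too_large = flag_terms(counts)
print("no capacity to summarise:", too_small)
print("disproportionately large:", too_large)
```

Any real cut-offs would still need the judgement discussed in this ticket; the thresholds only make the candidates for review explicit.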

Potential clashing concern: It seems reasonable to want to make sure that very important cell types are not obscured in generating summaries, but this desire can clash with the considerations above. Some judgement may be needed.

Report for current slim

#####Coverage percentage#####
97.76%
#####Number of terms covered by each term in the slim#####
platelet,1
multinucleated giant cell,1
nucleated thrombocyte,1
natural helper lymphocyte,1
B-lymphoblast,1
lymphoblast,2
blood lymphocyte,2
mast cell,3
proerythroblast,3
immature B cell,3
myeloid suppressor cell,3
erythroid progenitor cell,4
megakaryocyte,4
nongranular leukocyte,4
reticulocyte,5
hematopoietic multipotent progenitor cell,5
basophil,6
erythrocyte,7
eosinophil,7
neutrophil,7
erythroblast,10
hematopoietic stem cell,12
plasmacytoid dendritic cell,12
plasma cell,13
monocyte,20
hematopoietic oligopotent progenitor cell,22
gamma-delta T cell,23
innate lymphoid cell,46
immature T cell,51
macrophage,55
conventional dendritic cell,61
hematopoietic lineage restricted progenitor cell,64
B cell,94
alpha-beta T cell,94
professional antigen presenting cell,199
mononuclear cell,422
#####Terms that are not covered by src/templates/blood_and_immune_upper_slim.csv under CL:0000988#####
http://purl.obolibrary.org/obo/CL_0002318,peripheral blood mesothelial cell
http://purl.obolibrary.org/obo/CL_0002355,primitive red blood cell
http://purl.obolibrary.org/obo/CL_0002356,primitive reticulocyte
http://purl.obolibrary.org/obo/CL_0002358,pyrenocyte
http://purl.obolibrary.org/obo/CL_0002361,primitive erythroid progenitor
http://purl.obolibrary.org/obo/CL_0000092,osteoclast
http://purl.obolibrary.org/obo/CL_0000385,prohemocyte (sensu Nematoda and Protostomia)
http://purl.obolibrary.org/obo/CL_0000390,blood cell (sensu Nematoda and Protostomia)
http://purl.obolibrary.org/obo/CL_0000588,odontoclast
http://purl.obolibrary.org/obo/CL_0000779,multinuclear osteoclast
http://purl.obolibrary.org/obo/CL_0000780,multinuclear odontoclast
http://purl.obolibrary.org/obo/CL_1001610,bone marrow hematopoietic cell
http://purl.obolibrary.org/obo/CL_2000074,splenocyte
http://purl.obolibrary.org/obo/CL_0002417,primitive erythroid lineage cell
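
One plausible reading of the coverage figure is the share of terms under the root (CL:0000988) that fall under at least one slim term. The sketch below assumes that definition; it is not the reporting tool's actual code. The function name, the toy term names, and the convention that a slim term covers itself are all assumptions.

```python
def coverage_percentage(all_terms, slim_closures):
    """Percentage of all_terms covered by at least one slim term.

    slim_closures maps each slim term to the set of terms it covers
    (assumed here to include the slim term itself).
    """
    all_terms = set(all_terms)
    covered = set().union(*slim_closures.values()) & all_terms
    uncovered = sorted(all_terms - covered)
    return round(100 * len(covered) / len(all_terms), 2), uncovered


# Toy data, not taken from CL.
ALL_TERMS = {"a", "b", "c", "d", "e"}
CLOSURES = {"a": {"a", "b"}, "c": {"c", "d"}}

pct, missing = coverage_percentage(ALL_TERMS, CLOSURES)
print(pct, missing)
```

Under this definition, the 14 uncovered terms listed above and the reported 97.76% would be consistent with 625 terms in total (611 / 625 = 0.9776). That total is an inference from the arithmetic, not a number stated in the report.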

Overlap report (generated by Anita)

| sub | obj | sub_label | obj_label |
| --- | --- | --- | --- |
| http://purl.obolibrary.org/obo/CL_0000784 | http://purl.obolibrary.org/obo/CL_0000145 | plasmacytoid dendritic cell | professional antigen presenting cell |
| http://purl.obolibrary.org/obo/CL_0000786 | http://purl.obolibrary.org/obo/CL_0000236 | plasma cell | B cell |
| http://purl.obolibrary.org/obo/CL_0000784 | http://purl.obolibrary.org/obo/CL_0000842 | plasmacytoid dendritic cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0000786 | http://purl.obolibrary.org/obo/CL_0000842 | plasma cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0000038 | http://purl.obolibrary.org/obo/CL_0002031 | erythroid progenitor cell | hematopoietic lineage restricted progenitor cell |
| http://purl.obolibrary.org/obo/CL_0000235 | http://purl.obolibrary.org/obo/CL_0000145 | macrophage | professional antigen presenting cell |
| http://purl.obolibrary.org/obo/CL_0000236 | http://purl.obolibrary.org/obo/CL_0000842 | B cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0000990 | http://purl.obolibrary.org/obo/CL_0000145 | conventional dendritic cell | professional antigen presenting cell |
| http://purl.obolibrary.org/obo/CL_0000816 | http://purl.obolibrary.org/obo/CL_0000236 | immature B cell | B cell |
| http://purl.obolibrary.org/obo/CL_0000576 | http://purl.obolibrary.org/obo/CL_0000842 | monocyte | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0000789 | http://purl.obolibrary.org/obo/CL_0000842 | alpha-beta T cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0000798 | http://purl.obolibrary.org/obo/CL_0000842 | gamma-delta T cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0000816 | http://purl.obolibrary.org/obo/CL_0000842 | immature B cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0000990 | http://purl.obolibrary.org/obo/CL_0000842 | conventional dendritic cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0001065 | http://purl.obolibrary.org/obo/CL_0000842 | innate lymphoid cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0017006 | http://purl.obolibrary.org/obo/CL_0000842 | B-lymphoblast | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_4030029 | http://purl.obolibrary.org/obo/CL_0000842 | blood lymphocyte | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0002420 | http://purl.obolibrary.org/obo/CL_0000842 | immature T cell | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0002679 | http://purl.obolibrary.org/obo/CL_0000842 | natural helper lymphocyte | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0017005 | http://purl.obolibrary.org/obo/CL_0000842 | lymphoblast | mononuclear cell |
| http://purl.obolibrary.org/obo/CL_0017006 | http://purl.obolibrary.org/obo/CL_0017005 | B-lymphoblast | lymphoblast |

query: https://api.triplydb.com/s/_b0O6UP2A
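
The overlap report lists pairs of slim terms where one (sub) is a subclass of the other (obj). The same check can be sketched in plain Python, assuming the subclass hierarchy is available as a child-to-parents map. This is an illustration only, not the linked query. The `PARENTS` map is a toy fragment chosen so the output matches three rows of the report; it is not copied from CL and the edges may not be direct in the ontology.

```python
# Toy subClassOf edges (child -> parents). Illustrative only: chosen so the
# result matches three rows of the overlap report, not copied from CL itself.
PARENTS = {
    "B-lymphoblast": {"lymphoblast"},
    "lymphoblast": {"mononuclear cell"},
    "mononuclear cell": set(),
    "platelet": set(),
}

SLIM = ["B-lymphoblast", "lymphoblast", "mononuclear cell", "platelet"]


def ancestors(term, parents):
    """Return all transitive superclasses of term (excluding term itself)."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def slim_overlaps(slim, parents):
    """Return sorted (sub, obj) pairs where both are slim terms and sub is under obj."""
    slim_set = set(slim)
    pairs = []
    for sub in slim:
        for obj in ancestors(sub, parents) & slim_set:
            pairs.append((sub, obj))
    return sorted(pairs)


overlaps = slim_overlaps(SLIM, PARENTS)
for sub, obj in overlaps:
    print(f"{sub}\t{obj}")
```

A slim with no overlaps would return an empty list, which is the property that summaries such as pie charts rely on.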

Some conclusions:

These do not group (a score of 1 = no subclasses; a score of 2 = one subclass):

platelet,1
multinucleated giant cell,1
nucleated thrombocyte,1
natural helper lymphocyte,1
B-lymphoblast,1
lymphoblast,2
blood lymphocyte,2

Mononuclear cell has >400 subclasses and is a source of much of the overlap. Using OLS with coverage (subclass) counts displayed shows some good possibilities for choosing more specific terms.


Judgement required. There are no perfect solutions - but some are better than others.

ghost commented 1 year ago

@dosumis, please describe an action item and acceptance criteria for this ticket. As it is, I'm unsure how to proceed.

dosumis commented 1 year ago

Use your judgement - given the specified use-case.

dosumis commented 1 year ago

Also see the SOP ticket: https://github.com/obophenotype/cell-ontology/issues/1919

ghost commented 1 year ago

> Use your judgement

Thanks for the link.

My pending question is: Use judgement to do what exactly? I would need some acceptance criteria to know what the revised intended goal is.

The originally submitted list of classes (before the removal of overlaps) was my determination of a balance between overlap and specificity. If you can specify metrics that would constitute an improvement on that list, I can re-review the original list against those clarified metrics.

Dropping all classes with a score of 1 and replacing them with their parent class would leave a large gap in granularity in some cases, e.g. platelet and 'myeloid cell'. Can you provide a metric for deciding which of those two terms would be appropriate?

Happy to discuss offline and record the clarified action item and acceptance criteria in this ticket.

dosumis commented 1 year ago

> before the removal of overlaps

I don't think overlaps have been removed.

ghost commented 1 year ago

> > before the removal of overlaps
>
> I don't think overlaps have been removed.

@anitacaron, can you confirm the blood_and_immune_upper_slim overlap terms will be removed once #1939 is merged? Based on this comment, it seems like they will be, but can you confirm?

anitacaron commented 1 year ago

@bvarner-ebi, @ubyndr put them back in this commit and removed the QC for overlapping classes, as requested by David offline.

ghost commented 1 year ago

All action items appear to be addressed. If anything else is required, kindly reopen with required action items.