theislab / sfaira

data and model repository for single-cell data
https://sfaira.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Cell type matched wrongly #337

Closed (Hrovatin closed 3 years ago)

Hrovatin commented 3 years ago

When loading sfaira data (human pancreas), I noticed that some cell types are matched to the wrong ontology term.

Both should be "pancreatic PP cell":

gamma | ['pancreatic endocrine cell']
gamma cell | ['pancreatic endocrine cell']

davidsebfischer commented 3 years ago

Thanks @Hrovatin, do you know which data sets in particular?

davidsebfischer commented 3 years ago

I found:

davidsebfischer commented 3 years ago

new term: "pancreatic PP cell", "CL:0002275"

davidsebfischer commented 3 years ago

PP is a subterm of "pancreatic endocrine cell" (https://www.ebi.ac.uk/ols/ontologies/cl/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCL_0008024&viewMode=PreferredRoots&siblings=false). I think we did not know the exact term at the time because "gamma" is not annotated in cellontology, so we chose the umbrella term.
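The subterm relationship mentioned here can be illustrated with a small sketch. It assumes a toy child-to-parent table that contains only the two terms named in this thread; `PARENTS`, `ancestors`, and `is_subterm` are hypothetical names for illustration, not part of sfaira or of any ontology library.

```python
# Toy excerpt of an is-a hierarchy (child -> list of parents).
# Assumption: only the relationship stated in this thread is encoded;
# this is not the real Cell Ontology graph.
PARENTS = {
    "pancreatic PP cell": ["pancreatic endocrine cell"],
    "pancreatic endocrine cell": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def is_subterm(child, parent, parents=PARENTS):
    """True if `parent` is a (possibly indirect) ancestor of `child`."""
    return parent in ancestors(child, parents)
```

Under this toy table, the umbrella term is a valid but less specific annotation for PP cells: `is_subterm("pancreatic PP cell", "pancreatic endocrine cell")` holds, while the reverse does not.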

Hrovatin commented 3 years ago

Yes, gamma was an old name for PP cells, but it is still often used in the literature. Maybe add this to the ontology matching in general, as it will likely come up in some new datasets as well.

davidsebfischer commented 3 years ago

I won't hard-code it into the semi-automatic matching, as this is technically a synonym of the cell type label, i.e. something that cell ontology takes care of managing and that we simply consume here.

davidsebfischer commented 3 years ago

We could ask cellontology to update it though!
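The division of labor described in the last two comments could look roughly like the sketch below: the matching code only consumes names and synonyms supplied by the ontology, so a synonym added upstream is picked up without any hard-coded rule. Everything here is an assumption for illustration: `build_index` and `resolve` are hypothetical helpers, not sfaira's matching code, and the `terms_after` table shows a hypothetical ontology update, not the actual contents of any Cell Ontology release.

```python
def build_index(terms):
    """Map every lower-cased name and synonym to its ontology ID."""
    index = {}
    for term_id, entry in terms.items():
        for label in [entry["name"], *entry.get("synonyms", [])]:
            index[label.strip().lower()] = term_id
    return index


def resolve(label, index):
    """Return the ontology ID for a free-text label, or None if unknown."""
    return index.get(label.strip().lower())


# Toy ontology excerpt without the synonym (the situation described above).
terms_before = {
    "CL:0002275": {"name": "pancreatic PP cell", "synonyms": []},
    "CL:0008024": {"name": "pancreatic endocrine cell", "synonyms": []},
}

# Hypothetical excerpt after the ontology adds the older "gamma" names.
terms_after = {
    **terms_before,
    "CL:0002275": {
        "name": "pancreatic PP cell",
        "synonyms": ["gamma cell", "gamma"],
    },
}
```

With `terms_before`, `resolve("gamma cell", build_index(terms_before))` finds nothing, so a curator has to pick a term by hand. With `terms_after`, the same call returns `"CL:0002275"` and the matching code itself is unchanged.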