Closed slobentanzer closed 1 year ago
task 3: Cell annotation with cell ontology
Use marker resources, e.g. the CellMarker2 DB or the Human Protein Atlas, to build a pipeline that annotates single-cell data with the Cell Ontology (here).
To develop the pipeline, you can use toy data, e.g. the infamous PBMC data from 10x (example of how to download and process it with Seurat). To handle single-cell data in R, SingleCellExperiment and Seurat are the standard packages; in Python it is scanpy.
Once the pipeline is built, try more challenging data sets from the heart:
From discussions on 13.04:
Improvement for marker selection: brainstorm about the expression patterns you would expect of a good cell type marker. Clustering resolution can be somewhat arbitrary; how does this affect the Cell Ontology assignment of a cluster? See https://www.nature.com/articles/s41467-021-21453-4 for an example of identifying stable cluster markers. Further, use HPA and/or CellMarker2 to score the markers themselves for cell type or tissue specificity (make this an option, so the user can choose whether this is desired).
Improvement for CL suggestion: the overlap does not take into account the probability of finding an overlap by chance. Use a hypergeometric test to assess the significance of the marker overlap. When weights are available, they can be taken into account by more sophisticated models (see the decoupleR package for a collection of methods).
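To make the hypergeometric suggestion concrete, here is a minimal sketch in plain Python. It is not the Marker2Cell implementation: the function names `hypergeom_overlap_pvalue` and `rank_cell_types`, the toy marker sets, and the placeholder background genes (`GENE0` ... `GENE93`) are all assumptions for illustration. The background should be the set of genes that could have been picked as markers at all (for example, all genes detected in the data set); that choice strongly affects the p-value.

```python
from math import comb


def hypergeom_overlap_pvalue(cluster_markers, cell_type_markers, background_genes):
    """Upper-tail hypergeometric p-value P(X >= k) for a marker overlap.

    N = background size, K = reference markers in the background,
    n = cluster markers in the background, k = observed overlap.
    """
    background = set(background_genes)
    # Restrict both sets to the background so all counts share one universe.
    cluster = set(cluster_markers) & background
    reference = set(cell_type_markers) & background
    N, K, n = len(background), len(reference), len(cluster)
    k = len(cluster & reference)
    if n == 0 or K == 0:
        return 1.0  # nothing to test
    # Sum the probabilities of every overlap at least as large as the observed one.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / comb(N, n)


def rank_cell_types(cluster_markers, marker_db, background_genes):
    """Return (cell_type, p_value) pairs sorted from most to least significant."""
    scores = [
        (cell_type, hypergeom_overlap_pvalue(cluster_markers, markers, background_genes))
        for cell_type, markers in marker_db.items()
    ]
    return sorted(scores, key=lambda pair: pair[1])


# Toy example (hypothetical marker database and background, for illustration only).
toy_db = {
    "T cell": ["CD3D", "CD3E", "CD2"],
    "B cell": ["MS4A1", "CD79A", "CD79B"],
}
toy_background = [g for genes in toy_db.values() for g in genes]
toy_background += [f"GENE{i}" for i in range(94)]  # filler genes, 100 in total

ranked = rank_cell_types(["CD3D", "CD3E", "CD2", "GENE1"], toy_db, toy_background)
for cell_type, p_value in ranked:
    print(f"{cell_type}: p = {p_value:.3g}")
```

With SciPy available, the same tail probability should be obtainable from `scipy.stats.hypergeom.sf(k - 1, N, K, n)`. When a cluster is tested against many cell types, the resulting p-values would still need a multiple-testing correction before being used to suggest a CL term.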
Marker2Cell works for annotating single cells based on their RNA expression profile.
Calling cell types, even previously unknown ones, using molecular markers.