grst / single_cell_data_integration

cell type annotations #12

Closed grst closed 5 years ago

grst commented 5 years ago

We need to assign a cell type to each single cell, in order to select interesting populations and to be able to test the data integration tools.

possible methodologies:

@Hoohm, @FFinotello, do you have further ideas on how to do this properly?

FFinotello commented 5 years ago

Maybe scmap? But you need to build a reference first. https://www.nature.com/articles/nmeth.4644

grst commented 5 years ago

Sounds interesting. Would that basically make batch effect removal unnecessary? That could work well with the 10x reference samples.

FFinotello commented 5 years ago

I would give it a try, but I am not sure whether it is better to apply it to individual data sets or to the full compendium. It depends on the preprocessing steps the tool performs internally. I should have a look at the code.

grst commented 5 years ago

New resource of (bulk) reference profiles: https://dice-database.org/downloads from https://www.cell.com/cell/fulltext/S0092-8674(18)31331-X

grst commented 5 years ago

Poster from ECCB about cell type classification: 2085_001.pdf

grst commented 5 years ago

List of marker genes validated with co-expression in TCGA: https://jitc.biomedcentral.com/track/pdf/10.1186/s40425-017-0215-8

grst commented 5 years ago

Moving forward, I see the following possibilities:

grst commented 5 years ago

On the other hand, if we are only interested in CD8+ T cells, it could be pragmatic to simply apply unsupervised clustering and extract all clusters that express CD8A/B for further analysis.

In that case, none of the other cell types would be annotated.
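
A minimal sketch of the cluster-extraction idea, assuming cluster labels are already available from some unsupervised clustering step. The function name `select_marker_clusters`, the per-cell dict layout, and the `min_fraction` cutoff of 0.3 are illustrative assumptions, not part of any tool mentioned in this thread. Plain Python is used only to keep the example dependency-free.

```python
from collections import defaultdict


def select_marker_clusters(counts, clusters, markers=("CD8A", "CD8B"), min_fraction=0.3):
    """Return the labels of clusters in which at least `min_fraction` of the
    cells have a non-zero count for any of the marker genes.

    counts:   list of dicts, one per cell, mapping gene name -> count
    clusters: list of cluster labels, one per cell (same order as `counts`)
    min_fraction is a hypothetical cutoff chosen for illustration.
    """
    n_cells = defaultdict(int)
    n_positive = defaultdict(int)
    for cell, label in zip(counts, clusters):
        n_cells[label] += 1
        if any(cell.get(gene, 0) > 0 for gene in markers):
            n_positive[label] += 1
    return sorted(
        label for label in n_cells
        if n_positive[label] / n_cells[label] >= min_fraction
    )


# Toy data: cluster 0 is CD8-like, cluster 1 is CD4-like, cluster 2 is B-cell-like.
counts = [
    {"CD8A": 3, "CD3E": 5},
    {"CD8B": 1, "CD3E": 2},
    {"CD3E": 4},            # no CD8A/B counts, but still a member of cluster 0
    {"CD4": 2, "CD3E": 1},
    {"CD4": 1},
    {"CD19": 7},
]
clusters = [0, 0, 0, 1, 1, 2]

selected = select_marker_clusters(counts, clusters)
cd8_cells = [i for i, label in enumerate(clusters) if label in selected]
```

Selecting whole clusters rather than individual cells means that a cell without any CD8A/B counts is still kept as long as enough of its cluster expresses the markers.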

FFinotello commented 5 years ago

@grst, you may make some FPs due to dropout (false null expression of CD8A/B).

You could check this potential issue in the sorted CD8+ cells from https://www.nature.com/articles/ncomms14049
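
One way this check could look, as a rough sketch: in a population of sorted CD8+ cells, count how many cells show zero counts for both CD8A and CD8B. The helper name `marker_dropout_rate` and the toy counts are made up for illustration and are not taken from the data set linked above.

```python
def marker_dropout_rate(counts, markers=("CD8A", "CD8B")):
    """Fraction of cells with zero counts for all marker genes.

    counts: list of dicts, one per cell, mapping gene name -> count.
    For sorted CD8+ cells this approximates the marker dropout rate.
    """
    if not counts:
        raise ValueError("need at least one cell")
    n_missing = sum(
        1 for cell in counts
        if all(cell.get(gene, 0) == 0 for gene in markers)
    )
    return n_missing / len(counts)


# Toy example: four "sorted CD8+" cells, one of which has no CD8A/B counts.
sorted_cd8 = [
    {"CD8A": 2},
    {"CD8B": 1},
    {"CD3E": 4},
    {"CD8A": 1, "CD8B": 3},
]
rate = marker_dropout_rate(sorted_cd8)
```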

grst commented 5 years ago

Will build something along the lines of Schelker et al. now.

The approach is hierarchical:

Resources:

grst commented 5 years ago

Francesca pointed me to moana (https://www.biorxiv.org/content/biorxiv/early/2018/10/30/456129.full.pdf).

However, models are dataset-specific and the training data was obtained from manual clustering (for which they also provide a framework).

==> I will stick to my marker-gene based approach.

(Could be interesting to make use of the kNN-smoothing though)
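
To illustrate the general idea of smoothing over nearest neighbours, here is a deliberately simplified, single-step sketch. It is not the published kNN-smoothing algorithm and not moana's implementation: the function `knn_smooth`, the square-root transform of library-size-normalized counts, and the Euclidean distance are all assumptions chosen to keep the example short.

```python
import math


def knn_smooth(counts, k):
    """Add to each cell the raw counts of its k nearest neighbours.

    counts: list of per-cell count vectors (all in the same gene order).
    Neighbours are found by Euclidean distance between square-root-transformed,
    library-size-normalized profiles (a simplification for illustration).
    """
    n = len(counts)
    if not 0 <= k < n:
        raise ValueError("k must satisfy 0 <= k < number of cells")

    def profile(cell):
        total = sum(cell) or 1  # avoid division by zero for empty cells
        return [math.sqrt(c / total) for c in cell]

    profiles = [profile(cell) for cell in counts]
    smoothed = []
    for i in range(n):
        # Sort all other cells by distance to cell i and keep the k closest.
        dists = sorted(
            (math.dist(profiles[i], profiles[j]), j) for j in range(n) if j != i
        )
        members = [i] + [j for _, j in dists[:k]]
        smoothed.append(
            [sum(counts[m][g] for m in members) for g in range(len(counts[i]))]
        )
    return smoothed


# Gene order: CD8A, CD8B, CD19. The second cell has no CD8A counts.
counts = [
    [2, 1, 0],
    [0, 3, 0],
    [0, 0, 5],
    [0, 0, 4],
]
smoothed = knn_smooth(counts, k=1)
```

In this toy example the second cell gains CD8A counts from its nearest neighbour, which is the effect that would make marker-gene-based calls less sensitive to dropout.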

grst commented 5 years ago

For development of the classifier, see https://github.com/grst/single_cell_classification