cole-trapnell-lab / garnett

Automated cell type classification
MIT License

Not enough training samples for any cell types at the root of cell type hierarchy #34

Closed saisomesh2594 closed 4 years ago

saisomesh2594 commented 4 years ago

Hi All,

I am facing an issue similar to #17 when trying to build a classifier for my dataset of immune cells derived from mouse pancreatic tissue. I have ~8000 cells divided into 14 clusters, and the Garnett marker file contains markers for each cluster. I removed duplicate markers and then built the classifier as follows:

library(garnett)
library(org.Mm.eg.db)

mouse.classifier <- train_cell_classifier(cds = immune.cds,
                                          marker_file = <path to garnett marker file>,
                                          db = org.Mm.eg.db,
                                          cds_gene_id_type = "ENSEMBL",
                                          num_unknown = 50,
                                          marker_file_gene_id_type = "SYMBOL")

Do you think this is because of a low cell count? I would assume that building a classifier requires a fairly large number of cells.

Thanks, Somesh

hpliner commented 4 years ago

Hi Somesh,

In general, this type of error shows up when something goes wrong in the gene ID conversion. What does your marker plot look like?
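
For readers following along, the marker plot mentioned above can be produced roughly as in the sketch below. It assumes Garnett's check_markers() and plot_markers() functions with the argument names used in the Garnett documentation. The wrapper make_marker_plot and the marker_file_path argument are hypothetical names introduced here for illustration and are not part of the package.

```r
# Sketch only: wraps Garnett's marker check and its diagnostic plot.
# Requires the garnett and org.Mm.eg.db packages to be installed when called.
# make_marker_plot and marker_file_path are hypothetical names.
make_marker_plot <- function(cds, marker_file_path, cds_gene_id_type) {
  # Score each marker in the marker file against the data set.
  marker_check <- garnett::check_markers(
    cds,
    marker_file_path,
    db = org.Mm.eg.db::org.Mm.eg.db,
    cds_gene_id_type = cds_gene_id_type,
    marker_file_gene_id_type = "SYMBOL"
  )
  # Draw the marker plot from the check results.
  garnett::plot_markers(marker_check)
}

# Example call with the objects from the original post (not run here):
# make_marker_plot(immune.cds, marker_file_path, cds_gene_id_type = "ENSEMBL")
```

If the gene ID type passed for the CDS does not match the identifiers actually stored in it, the conversion is likely to fail for most markers, and the plot is one place where that can become visible.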

saisomesh2594 commented 4 years ago

Yikes!

I mistakenly set cds_gene_id_type = "ENSEMBL" instead of cds_gene_id_type = "SYMBOL".

Fixed it. It works perfectly now.

Thanks, Somesh
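
A quick way to catch this kind of mismatch before training is to look at what the gene identifiers in the CDS actually look like. The helper below is a minimal base-R sketch, not part of Garnett: guess_gene_id_type is a hypothetical name, and the pattern is only a heuristic for Ensembl-style gene IDs. Anything that does not match is reported as "OTHER" (for example gene symbols or Entrez IDs).

```r
# Heuristic sketch: guess whether gene identifiers are Ensembl gene IDs.
# guess_gene_id_type is a hypothetical helper, not a Garnett function.
guess_gene_id_type <- function(ids) {
  ids <- as.character(ids)
  stopifnot(length(ids) > 0)
  # Ensembl gene IDs look like ENSG00000141510 or ENSMUSG00000059552,
  # optionally followed by a version suffix such as ".3".
  is_ensembl <- grepl("^ENS[A-Z]*G[0-9]{11}(\\.[0-9]+)?$", ids)
  frac <- mean(is_ensembl)
  if (frac > 0.9) {
    "ENSEMBL"
  } else if (frac < 0.1) {
    "OTHER"
  } else {
    "MIXED"
  }
}

guess_gene_id_type(c("ENSMUSG00000059552", "ENSG00000141510.3"))  # "ENSEMBL"
guess_gene_id_type(c("Cd3e", "Ptprc", "Cd19"))                    # "OTHER"

# With the objects from this thread, one would presumably pass the
# gene identifiers of the CDS, e.g. its row names (not run here):
# guess_gene_id_type(rownames(immune.cds))
```

A result of "OTHER" for a CDS that is being passed with cds_gene_id_type = "ENSEMBL" would point to the same mistake described in this thread.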