cole-trapnell-lab / garnett

Automated cell type classification
MIT License

Creating a classifier using TPM values #19

Closed ccruizm closed 4 years ago

ccruizm commented 5 years ago

Good day!

I am really excited to use your tool to classify my new single-cell data (10x). To do so, I have downloaded an expression matrix (scRNA-seq, also from 10x); however, the values are stored as TPM rather than raw counts. I am using Monocle 2 to create the CDS object, but I am having trouble converting TPM to RNA counts. Would it be possible to use this TPM-based matrix directly, or do I definitely need to convert the values to RNA counts first? This is the code I would use if TPM is acceptable for training the classifier:

cds <- newCellDataSet(as.matrix(my_TPM_expression_matrix),
                       phenoData = pd,
                       featureData = fd,
                       lowerDetectionLimit = 0.1,
                       expressionFamily = tobit(Lower = 0.1))

In addition, if I can use this TPM-based CDS object, do I need to transform my own data to TPM as well, or can I use the raw counts (obtained from the Cell Ranger pipeline) as input for the classification?

Thank you in advance for your help.

hpliner commented 5 years ago

Hello,

We have not done any robust testing of classification using TPM values or similar (though some anecdotal examples have worked OK). I suspect that normalization may end up being an issue, because our normalization was designed specifically for UMI count data. You could give it a try, or you could simply train and classify on your new count-based data.
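To illustrate why normalization could behave differently on TPM input, here is a minimal base-R sketch. It assumes a simple library-size-style size factor (each cell's total divided by the mean total), which is only an illustration and not necessarily the formula Garnett uses. The toy matrix and the `size_factors` helper are hypothetical.

```r
# Toy UMI count matrix: 3 genes x 2 cells with very different sequencing depth.
counts <- matrix(c(10, 30, 60,
                   100, 300, 600),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("cell1", "cell2")))

# Hypothetical helper: library-size factors, i.e. each cell's total
# divided by the mean total across cells (an assumption for illustration).
size_factors <- function(m) colSums(m) / mean(colSums(m))

# TPM-like rescaling: every column is scaled to sum to one million
# (gene-length correction is omitted to keep the example short).
tpm_like <- t(t(counts) / colSums(counts)) * 1e6

sf_counts <- size_factors(counts)    # differs between cells (depth is visible)
sf_tpm    <- size_factors(tpm_like)  # the same for every cell (depth is gone)

print(sf_counts)
print(sf_tpm)
```

Because TPM values are already rescaled per cell, a depth-based size factor carries no information on them, which is one plausible way a count-oriented normalization could go wrong on TPM data.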

hpliner commented 4 years ago

Closing for lack of response. If you have further problems, please open a new issue.