Closed a00101 closed 2 years ago
Hi, there is a function in TCGAbiolinks called TCGAanalyze_Stemness:
https://rdrr.io/bioc/TCGAbiolinks/man/TCGAanalyze_Stemness.html
But here is the code for the stemness score (you can adapt it) and the data (I added it at https://github.com/BioinformaticsFMRP/PanCanStem_Web/tree/master/Stemsig):
# Read the stemness signature: column X1 holds gene names, X2 the weights
signature <- readr::read_tsv(
  "https://raw.githubusercontent.com/BioinformaticsFMRP/PanCanStem_Web/master/Stemsig/SC-pcbc-stemsig.tsv",
  col_names = FALSE
)
signature.weight.vector <- signature$X2
names(signature.weight.vector) <- signature$X1

# Just an example with correlation 1 and -1
gene.expression.matrix <- matrix(signature$X2)
rownames(gene.expression.matrix) <- signature$X1
gene.expression.matrix <- cbind(gene.expression.matrix, gene.expression.matrix * -1)

calculate_score <- function(signature.weight.vector, gene.expression.matrix) {
  # Keep only common genes
  common.genes <- intersect(names(signature.weight.vector), rownames(gene.expression.matrix))
  gene.expression.matrix <- gene.expression.matrix[common.genes, , drop = FALSE]
  signature.weight.vector <- signature.weight.vector[common.genes]

  # Spearman correlation between each sample (column) and the signature weights
  score <- apply(gene.expression.matrix, 2, function(sample) {
    cor(sample, signature.weight.vector, method = "spearman", use = "complete.obs")
  })
  print(paste0("Min score: ", min(score)))
  print(paste0("Max score: ", max(score)))

  # Scale the scores to be between 0 and 1
  # (note: if all samples have the same score, max(score) is 0 after shifting and the result is NaN)
  print("Normalized scores to be between 0 and 1")
  score <- score - min(score)
  score.normalized <- score / max(score)
  print(paste0("Min normalized score: ", min(score.normalized)))
  print(paste0("Max normalized score: ", max(score.normalized)))
  return(score.normalized)
}

calculate_score(signature.weight.vector, gene.expression.matrix)
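For anyone who wants to try the logic without downloading the signature, here is a minimal offline sketch of the same two steps (Spearman correlation per sample, then min-max scaling). The gene names, weights, and expression values are made up for illustration and are not part of the real signature.

```r
# Hypothetical toy signature: gene names and weights are invented for illustration
toy.weights <- c(GENE_A = 0.9, GENE_B = 0.4, GENE_C = -0.2, GENE_D = -0.7, GENE_E = 0.1)

# Toy expression matrix (genes x samples); GENE_X is not in the signature
# and GENE_E is not in the matrix, so both are dropped by the intersect step
toy.expr <- matrix(
  c(50, 30, 10,  2, 99,
     2, 10, 30, 50, 99,
    30, 50,  2, 10, 99),
  nrow = 5,
  dimnames = list(
    c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_X"),
    c("sample1", "sample2", "sample3")
  )
)

# Keep only genes present in both the signature and the matrix
common <- intersect(names(toy.weights), rownames(toy.expr))

# Raw score: Spearman correlation of each sample with the signature weights
raw <- apply(toy.expr[common, , drop = FALSE], 2, function(s) {
  cor(s, toy.weights[common], method = "spearman")
})

# Min-max scaling to [0, 1]; return NA instead of NaN if all scores are equal
rng <- range(raw)
scaled <- if (diff(rng) == 0) {
  setNames(rep(NA_real_, length(raw)), names(raw))
} else {
  (raw - rng[1]) / diff(rng)
}

print(raw)
print(scaled)
```

Because the scaling uses the minimum and maximum of the samples passed in, the normalized score of a sample depends on which other samples are in the matrix. Scores from separately normalized cohorts are therefore not directly comparable.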
Which expression data type should I use when I calculate stemness with TCGAanalyze_Stemness: counts, TPM, or FPKM? The pipeline in Synapse uses RPKM, if I understand you correctly.
We added it to TCGAbiolinks: https://bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/stemness_score.html Yes, they used RPKM aligned to hg19, but I would not expect TPM or FPKM aligned to hg38 to change the results. That would still need to be checked, though.
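One part of this can be checked directly for the Spearman-based snippet earlier in this thread. Within a sample, TPM is FPKM rescaled by a positive constant, so the gene ranks, and therefore the Spearman correlation, should not change. The sketch below uses simulated values and hypothetical weights. It says nothing about hg19 versus hg38 alignment, about raw counts (which are not length-normalized, so gene ranks can differ), or about any extra preprocessing TCGAanalyze_Stemness may apply.

```r
set.seed(1)
n.genes <- 200

# Hypothetical signature weights and simulated FPKM values (genes x 3 samples)
weights <- rnorm(n.genes)
fpkm <- matrix(rexp(n.genes * 3, rate = 0.1), nrow = n.genes)

# TPM from FPKM: rescale each sample (column) so that it sums to 1e6
tpm <- sweep(fpkm, 2, colSums(fpkm), "/") * 1e6

# Spearman correlation of each sample with the weights, for a given matrix
spearman_per_sample <- function(m) {
  apply(m, 2, function(s) cor(s, weights, method = "spearman"))
}

rho.fpkm <- spearman_per_sample(fpkm)
rho.tpm <- spearman_per_sample(tpm)

# TRUE if the per-sample correlations agree between FPKM and TPM
print(isTRUE(all.equal(rho.fpkm, rho.tpm)))
```

The same argument applies to any strictly increasing per-sample transform, such as log2(x + 1).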
I can't find the 'how to' document. Where can I find it?
Thanks.