Closed cstrlln closed 6 months ago
Hello!
Please note that `ScoreSignatures_UCell()` by default adds a suffix to the signature names to identify the UCell gene set scores (the `name` parameter). You can set this parameter to `NULL` if you want to prevent the suffix from being added. In your code, it should be sufficient to modify the call as follows:
scores2 <- UCell::ScoreSignatures_UCell(scores,
    features = c5_bp_gene_sets_list,
    precalc.ranks = ranks,
    ncores = 5,
    assay = 'logcounts',
    name = NULL)
In the same way, `SmoothKNN` will add the "_kNN" suffix to the kNN-smoothed signatures to identify the smoothed scores - keep that in mind when retrieving the signatures from the resulting object. You can refer to Using UCell with SingleCellExperiment for some examples.
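To see which names were actually stored, it can help to list the rows of the `UCell` altExp before calling `SmoothKNN`. The following is a minimal sketch on the package's `sample.matrix` data; it assumes that the scores are stored as rows of the altExp named "UCell" and that the default suffix is "_UCell", so compare against the printed names rather than taking these for granted.

```r
library(UCell)
library(SingleCellExperiment)

data(sample.matrix)
sce <- SingleCellExperiment(list(counts = sample.matrix))
gene.sets <- list(Tcell = c("CD2", "CD3E", "CD3D"),
                  Myeloid = c("SPI1", "FCER1G", "CSF1R"))

# Default call: 'name' is left at its default, so a suffix is appended
sce <- ScoreSignatures_UCell(sce, features = gene.sets)

# Names under which the scores are actually stored (rows of the 'UCell' altExp)
stored <- rownames(altExp(sce, "UCell"))
print(stored)

# Requested names that are absent from the object; a non-empty result here
# would explain a "Could not find any of the given signatures" error
not_found <- setdiff(names(gene.sets), stored)
print(not_found)

# Assumption: the default suffix is "_UCell"; check against 'stored' to confirm
suffixed <- paste0(names(gene.sets), "_UCell")
print(suffixed %in% stored)
```

If you prefer to keep the default suffix, passing the suffixed names as `signature.names` should be an alternative to setting `name = NULL`.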
Best -m
Ah, the suffix, missed that. Thanks!
Could you still give an example on how to use sce.assay and sce.expname properly?
Yes, I'll take the example that comes up with `?SmoothKNN`:
library(UCell)
library(SingleCellExperiment)
library(scater)
data(sample.matrix)
sce <- SingleCellExperiment(list(counts=sample.matrix))
gene.sets <- list(Tcell = c("CD2","CD3E","CD3D"),
                  Myeloid = c("SPI1","FCER1G","CSF1R"))
# Calculate UCell scores
sce <- ScoreSignatures_UCell(sce, features=gene.sets, name=NULL)
# Run PCA
sce <- logNormCounts(sce)
sce <- runPCA(sce, scale=TRUE, ncomponents=20)
# Smooth signatures
sce <- SmoothKNN(sce, reduction="PCA", signature.names=names(gene.sets), sce.expname = "UCell")
sce
class: SingleCellExperiment
dim: 20729 600
metadata(0):
assays(2): counts logcounts
rownames(20729): AL627309.1 AL669831.5 ... AP001468.1 AP001469.2
rowData names(0):
colnames(600): L5_ATTTCTGAGGTCGTGA L4_TCACTATTCATCTCTA ... E2L4_TCGGGCACAAGTCCAT E2L8_CGTCCATCACCACATA
colData names(1): sizeFactor
reducedDimNames(1): PCA
mainExpName: NULL
altExpNames(2): UCell UCell_kNN
There are now two altExps: UCell and UCell_kNN, generated respectively by signature scoring and by the subsequent smoothing of the signatures. What if, instead, you wanted to directly smooth the gene log-counts from the 'main' experiment? You could do:
sce <- SmoothKNN(sce, reduction="PCA", signature.names=c("CD4","CD8A"),
                 sce.expname = "main", sce.assay = "logcounts")
sce
class: SingleCellExperiment
dim: 20729 600
metadata(0):
assays(2): counts logcounts
rownames(20729): AL627309.1 AL669831.5 ... AP001468.1 AP001469.2
rowData names(0):
colnames(600): L5_ATTTCTGAGGTCGTGA L4_TCACTATTCATCTCTA ... E2L4_TCGGGCACAAGTCCAT E2L8_CGTCCATCACCACATA
colData names(1): sizeFactor
reducedDimNames(1): PCA
mainExpName: NULL
altExpNames(3): UCell UCell_kNN main_kNN
Note that this command generated a new altExp ('main_kNN') with the smoothed gene log-counts. This can then be accessed by `altExp(sce, "main_kNN")`.
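As a rough sketch of how the smoothed values could then be pulled out as a matrix, the snippet below repeats the smoothing step on `sample.matrix` and inspects the new altExp. The assay and row names inside 'main_kNN' are printed rather than assumed, and the variable names `smoothed` and `smoothed_mat` are only illustrative.

```r
library(UCell)
library(SingleCellExperiment)
library(scater)

data(sample.matrix)
sce <- SingleCellExperiment(list(counts = sample.matrix))

# Log-normalize and compute the PCA used to define nearest neighbors
sce <- logNormCounts(sce)
sce <- runPCA(sce, scale = TRUE, ncomponents = 20)

# Smooth two genes taken directly from the main experiment's log-counts
sce <- SmoothKNN(sce, reduction = "PCA", signature.names = c("CD4", "CD8A"),
                 sce.expname = "main", sce.assay = "logcounts")

# Retrieve the altExp holding the smoothed values and inspect its contents
smoothed <- altExp(sce, "main_kNN")
print(assayNames(smoothed))
print(rownames(smoothed))

# With no assay name given, assay() returns the first assay of the object
smoothed_mat <- assay(smoothed)
print(dim(smoothed_mat))
```

The same pattern should apply to the "UCell_kNN" altExp when smoothing signature scores instead of genes.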
I hope that's useful.
This makes it clear, thanks!
I'm trying to use the SmoothKNN function with an SCE object that contains the UCell scores in the altExp slot, but I'm not sure how to properly set sce.assay and sce.expname in the call. Below is the code and the error I get.
c5_bp_gene_sets = msigdbr(species = "human", category = "C5", subcategory = "GO:BP")
head(c5_bp_gene_sets)
c5_bp_gene_sets_list = split(x = c5_bp_gene_sets$gene_symbol, f = c5_bp_gene_sets$gs_name)
set.seed(0101001001)
ranks <- UCell::StoreRankings_UCell(scores, assay = 'logcounts', maxRank = 2000, ncores = 4)
scores2 <- UCell::ScoreSignatures_UCell(scores, features = c5_bp_gene_sets_list, precalc.ranks = ranks, ncores = 5, assay = 'logcounts')
scores2_smooth <- UCell::SmoothKNN(scores2, signature.names = names(c5_bp_gene_sets_list), reduction = "PCA", sce.expname = c("UCell"))

Error in SmoothKNN.SingleCellExperiment(scores2, signature.names = names(c5_bp_gene_sets_list), : Could not find any of the given signatures in this object