Closed xlancelottx closed 2 years ago
No idea; never seen this before. If you can make a minimal reproducible example, I'm happy to take a look.
Sure, used one of the data sets you use in this scran tutorial
# loading libraries
library(scRNAseq)
library(scuttle)
library(scran)
# loading data and minimal preprocessing
sce <- GrunPancreasData()
qcstats <- perCellQCMetrics(sce)
qcfilter <- quickPerCellQC(qcstats, percent_subsets="altexps_ERCC_percent")
sce <- sce[,!qcfilter$discard]
# clustering
clusters <- quickCluster(sce)
# checking which labels are available
levels(clusters)
# Returns: '1''2''3''4''5''6''7'
# calculating size factors
sce <- computeSumFactors(sce, clusters=clusters, ref.clust = '2')
# Throws the error above
Additional note: Based on source code in scuttle pooledSizeFactors.R line 415 , ref.clust needs to be a character, so that it is considered. And just after trying to understand some of the code, could it be that the names of clust.profile get lost?
Update:
Tried out a quick and dirty fix using assignInNamespace(x=".calculate_pooled_factors", value=fixed_function, ns="scuttle")
and defined the ref cluster as the index in an earlier object, that still has names (indices). As far as I understood, that also happens when no ref.clust is defined. Seems to work by adding a statement to define the ref cluster index instead of the name and feeding that to .rescale_clusters. So changed:
if (is.null(ref.clust)) {
non.zeroes <- vapply(clust.profile, FUN = function(x) sum(x >
0), FUN.VALUE = 0L)
ref.clust <- which.max(non.zeroes)
print(paste("No ref.clust defined, automatically chosen ref cluster index:", ref.clust))
}
to
if (is.null(ref.clust)) {
non.zeroes <- vapply(clust.profile, FUN = function(x) sum(x >
0), FUN.VALUE = 0L)
ref.clust <- which.max(non.zeroes)
print(paste("No ref.clust defined, automatically chosen ref cluster index:", ref.clust))
} else {
ref.clust <- which(names(indices)==ref.clust)
print(paste("ref.clust defined, using ref cluster index:", ref.clust))
if (length(ref.clust)==0L) {
stop("'ref.clust' not in 'clusters', check if any cluster is larger than max.cluster.size and append ref.clust name with '-1'")
}
}
That one also returns a somewhat informative error if any cluster is larger than max.cluster.size, since then the '-1' tag needs to be added to the ref.clust
Haven't checked the above super carefully, so would be great, if you could double-check, that this approach is correct. But hope it helps for fixing it (probably much more elegant in a different way) :)
Should be fixed by the latest commit in scuttle. The immediate workaround is to set ref.clust
to the integer code corresponding to ref
.
That's great, thanks a lot!
Hi scran team,
When calculating size factors with a predefined reference cluster, I keep getting an error that my reference cluster is not part of the clusters, even though I made sure it is part of the input_clusters.
Relevant output: class: SingleCellExperiment dim: 24737 35680
Relevant output: length: 35680 Levels: '0''1''2''3''4''5''6''7''8''9''10''11''12''13''14''15''16''17''18''ref'
Error message: Error in .rescale_clusters(clust.profile, ref.col = ref.clust, min.mean = min.mean): 'ref.clust' not in 'clusters' Traceback:
Used versions R version 4.1.3, scuttle_1.4.0, scran_1.22.1
Also checked, that no cluster is > 5000 cells, so not "-1" tag should be added to any cluster labels. Out of interest, would it then be necessary to add the "-1" extension to the ref-cluster name?
Did you experience this before? Any idea where this could come from?
Thanks and best regards!