Open lucygarner opened 3 years ago
If you write a function that accepts a log-normalized expression matrix (or a PC matrix, if transposed=TRUE
), then you should just be able to give it to FUN
and the rest will "just work". Assuming that your function doesn't have bugs, that is.
Note that scran::bootstrapCluster
is moving to bluster::bootstrapStability
in the upcoming release.
Thank you, I have written the following to extract the PC matrix from my Seurat object (named object
) and run clustering. Does this seem reasonable?
data.use <- Embeddings(object = object[["pca"]])
data.use <- data.use[, 1:30] # number of PCs to use
seurat_clustering <- function(PC_matrix) {
neighbor.graphs <- FindNeighbors(object = PC_matrix,
k.param = 20,
compute.SNN = TRUE,
prune.SNN = 1/15,
nn.method = "rann",
annoy.metric = "euclidean",
nn.eps = 0,
verbose = TRUE,
force.recalc = FALSE)
clustering.results <- FindClusters(object = neighbor.graphs$snn,
modularity.fxn = 1,
initial.membership = NULL,
node.sizes = NULL,
resolution = 0.5,
method = "matrix",
algorithm = 1,
n.start = 10,
n.iter = 10,
random.seed = 0,
group.singletons = TRUE,
temp.file.location = NULL,
edge.file.name = NULL,
verbose = TRUE)
return(clustering.results$res.0.5)
}
bootstrap_clusters <- bootstrapCluster(data.use, FUN = seurat_clustering, transposed = TRUE)
How is bluster::bootstrapStability
technically different to scran::bootstrapCluster
and would you recommend switching over straight away?
Many thanks, Lucy
Seems fine to me.
As for the difference: there is none, other than the name change and movement to a new package. Just moved some functions out of scran because the package was getting a bit too cluttered. The new package and function should be available when BioC 3.12 comes out tomorrow.
Great thanks. I tried the following, however I am getting strange results. I'm slightly confused as it seemed to be working yesterday!
bootstrap_clusters <- bluster::bootstrapStability(x = data.use, FUN = seurat_clustering, transposed = TRUE)
data.use is a matrix with cell IDs as row names and PCs and column names.
Any thoughts as to what might be happening here?
Best wishes, Lucy
Sorry, I just noticed this. Looks like you didn't set a seed, so that's probably why you got different results. I'm not quite sure why you got NA
's for some of the cluster pairs. It would seem like pairwiseRand()
is at fault but I can't see how that could happen.
I should add that your matrix already looks pretty strange. It is pretty surprising to have low diagonal entries along with high off-diagonal entries. If you can give me a reproducible example, I can try to debug it.
I think I got it:
pairwiseRand(factor(sample(3, 100, replace=TRUE), levels=1:8), sample(3, 100, replace=TRUE))
## 1 2 3 4 5 6 7 8
## 1 0.005751586 0.03516088 0.002100879 NaN NaN NaN NaN NaN
## 2 NA 0.01338677 -0.037901716 NaN NaN NaN NaN NaN
## 3 NA NA -0.018066332 NaN NaN NaN NaN NaN
## 4 NA NA NA NaN NaN NaN NaN NaN
## 5 NA NA NA NA NaN NaN NaN NaN
## 6 NA NA NA NA NA NaN NaN NaN
## 7 NA NA NA NA NA NA NaN NaN
## 8 NA NA NA NA NA NA NA NaN
Your clustering function may be returning unused levels - see how 11 is a level but is not used in the factor - causing the final matrix to have NA
s for all affected levels. Well, technically they're NaN
s, I don't know if your screenshot's matrix view distinguishes between the two. I can't think of any other way of generating pure NA
s - looking at the code, it shouldn't even be possible.
If this indeed the case, I'm guessing that Seurat is being a bit too smart for its own good when creating the cluster assignment. pairwiseRand()
will simply respect these unused levels, even though they are rather silly in this case.
Hi,
I would like to establish the stability of the clusters that I have identified using Seurat (FindNeighbors and FindClusters). I was wondering how difficult it would be for me to reimplement the
bootstrapCluster
approach for this? Would the best way for this be to just write out a function that performs Seurat-like clustering on log normalised counts and then input this intobootstrapCluster
? Would I include the PCA function in there as well?Many thanks, Lucy