Closed sneddonucsf closed 2 years ago
Hi @sneddonucsf,
This is a very good question and something I've been meaning to add to the package! My recommendation is to use the weighted average of the different pseudotime values, weighted by their assignment weights (slingCurveWeights
).
Here's a simple function that should calculate that vector for any time of Slingshot output (SCE, PseudotimeOrdering, or SlingshotDataSet):
slingAvgPseudotime <- function(x){
rowSums(slingPseudotime(x, na=FALSE) *
slingCurveWeights(x, as.probs = TRUE))
}
Best, Kelly
Thank you @kstreet13! I tried the function and got the error:
Error in .local(x, ...) : unused argument (as.probs = TRUE)
If I run: Endocrine.sce.slingshot
I get:
class: SingleCellExperiment
dim: 2000 14231
metadata(0):
assays(1): logcounts
rownames(2000): SST GCG ... CLIC5 FOXP3
rowData names(0):
colnames(14231): lane1_AAACGAAGTGTGACCC_1 lane1_AAAGGATTCGTACACA_1 ... TTTGTTGGTCAGATTC_9
TTTGTTGGTTATGTCG_9
colData names(15): orig.ident nCount_RNA ... slingPseudotime_3 slingPseudotime_4
reducedDimNames(2): PCA UMAP
spikeNames(0):
altExpNames(0):
and slingCurveWeights(Endocrine.sce_slingshot)
gives me:
curve1 curve2 curve3 curve4
lane1_AAACGAAGTGTGACCC_1 0.0000000 1.0000000 0.0000000 0.0000000
lane1_AAAGGATTCGTACACA_1 1.0000000 0.0000000 0.0000000 0.0000000
lane1_AAAGGGCCAGAACATA_1 0.0000000 1.0000000 0.0000000 0.0000000
lane1_AAAGGGCGTTGTTGTG_1 1.0000000 0.0000000 0.0000000 0.0000000
lane1_AAAGGGCTCAGGTAAA_1 1.0000000 0.0000000 0.0000000 0.0000000
lane1_AAAGGTAAGCGAGTAC_1 0.0000000 1.0000000 0.0000000 0.0000000
but slingCurveWeights(Endocrine.sce_slingshot, as.probs = TRUE)
gives me the above error.
Hm, that's strange. Just for my own curiosity, which version of the slingshot package do you have?
Anyway, you should be able to get around it with this:
slingAvgPseudotime <- function(x){
wts <- slingCurveWeights(x)
wts <- wts / rowSums(wts)
return(rowSums(slingPseudotime(x, na=FALSE) * wts))
}
Hi @kstreet13 I'm currently using version 1.4.0... I haven't worked up the courage to update R to version 4 yet for Slingshot 2.4.0!
I tried your second code as test = data.frame(slingAvgPseudotime(Endocrine.sce_slingshot))
and got:
slingAvgPseudotime.Endocrine.sce_slingshot.
lane1_AAACGAAGTGTGACCC_1 NA
lane1_AAAGGATTCGTACACA_1 NA
lane1_AAAGGGCCAGAACATA_1 NA
This occurs without the data.frame
part. Not sure why I would be getting NA's...
I went through each step of the function:
wts <- slingCurveWeights(Endocrine.sce_slingshot)
:
curve1 curve2 curve3 curve4
lane1_AAACGAAGTGTGACCC_1 0 1.0000000 0 0.0000000
lane1_AAAGGATTCGTACACA_1 1 0.0000000 0 0.0000000
lane1_AAAGGGCCAGAACATA_1 0 1.0000000 0 0.0000000
lane1_AAAGGGCGTTGTTGTG_1 1 0.0000000 0 0.0000000
lane1_AAAGGGCTCAGGTAAA_1 1 0.0000000 0 0.0000000
lane1_AAAGGTAAGCGAGTAC_1 0 1.0000000 0 0.0000000
lane1_AAAGTGAGTCACCCTT_1 1 0.0000000 0 0.0000000
lane1_AAAGTGAGTTGTCATG_1 1 0.0000000 0 0.0000000
lane1_AACAAGACAGCTATAC_1 1 0.0000000 0 0.0000000
lane1_AACAAGATCATTGCCC_1 1 0.0000000 0 0.0000000
lane1_AACAAGATCTGGACCG_1 1 0.5985513 1 0.6016412
lane1_AACACACGTCATCCCT_1 0 1.0000000 0 0.0000000
lane1_AACAGGGAGGTGCAGT_1 0 0.0000000 1 0.0000000
lane1_AACAGGGAGTCCTGTA_1 1 0.0000000 0 0.0000000
lane1_AACAGGGGTAGGTACG_1 1 0.0000000 0 0.0000000
wts <- wts / rowSums(wts)
:
curve1 curve2 curve3 curve4
lane1_AAACGAAGTGTGACCC_1 0.0000000 1.000000 0.0000000 0.0000000
lane1_AAAGGATTCGTACACA_1 1.0000000 0.000000 0.0000000 0.0000000
lane1_AAAGGGCCAGAACATA_1 0.0000000 1.000000 0.0000000 0.0000000
lane1_AAAGGGCGTTGTTGTG_1 1.0000000 0.000000 0.0000000 0.0000000
lane1_AAAGGGCTCAGGTAAA_1 1.0000000 0.000000 0.0000000 0.0000000
lane1_AAAGGTAAGCGAGTAC_1 0.0000000 1.000000 0.0000000 0.0000000
lane1_AAAGTGAGTCACCCTT_1 1.0000000 0.000000 0.0000000 0.0000000
lane1_AAAGTGAGTTGTCATG_1 1.0000000 0.000000 0.0000000 0.0000000
lane1_AACAAGACAGCTATAC_1 1.0000000 0.000000 0.0000000 0.0000000
lane1_AACAAGATCATTGCCC_1 1.0000000 0.000000 0.0000000 0.0000000
lane1_AACAAGATCTGGACCG_1 0.3124812 0.187036 0.3124812 0.1880016
lane1_AACACACGTCATCCCT_1 0.0000000 1.000000 0.0000000 0.0000000
lane1_AACAGGGAGGTGCAGT_1 0.0000000 0.000000 1.0000000 0.0000000
lane1_AACAGGGAGTCCTGTA_1 1.0000000 0.000000 0.0000000 0.0000000
lane1_AACAGGGGTAGGTACG_1 1.0000000 0.000000 0.0000000 0.0000000
test2 = data.frame(rowSums(slingPseudotime(Endocrine.sce_slingshot, na=FALSE) * wts))
rowSums.slingPseudotime.Endocrine.sce_slingshot..na...FALSE....
lane1_AAACGAAGTGTGACCC_1 NA
lane1_AAAGGATTCGTACACA_1 NA
lane1_AAAGGGCCAGAACATA_1 NA
The values aren't all NA's, but most of them are. Any ideas?
Ahh, I'm guessing that even though na=FALSE
it is still giving me NA values for some reason.
Hm, that must have been a bug with that version. The NA values correspond to weights of 0, though, so they shouldn't actually matter for your purposes because they would have no effect on the weighted average.
Let's try this again:
slingAvgPseudotime <- function(x){
wts <- slingCurveWeights(x)
wts <- wts / rowSums(wts)
pst <- slingPseudotime(x)
pst[is.na(pst)] <- 0
return(rowSums(pst * wts))
}
That works @kstreet13, thank you very much!
slingAvgPseudotime.Endocrine.sce_slingshot.
lane1_AAACGAAGTGTGACCC_1 12.30376
lane1_AAAGGATTCGTACACA_1 14.58182
lane1_AAAGGGCCAGAACATA_1 13.80268
Sure thing! I just added slingAvgPseudotime
as a helper function to the package, so versions >=2.5.2 will have this available. Thanks for the motivation!
Hello,
Thank you for the fantastic pseudotime package!
I would like to use CellRank's Pseudotime Kernel (https://cellrank.readthedocs.io/en/stable/beyond_rna_velocity.html) with my Slingshot pseudotime results, as I've found that Slingshot gives me the best results over others like DPT, Palantir, etc.
I believe that the input for CellRank should be a pseudotime value associated with each cell. I know that we can pull out these values with
slingPseudotime()
, but this gives values for each curve. Would you recommend taking the highest value per cell to use as input, or is there another part of the Slingshot results I should use?Thank you!