kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.
259 stars 42 forks source link

How to interpret predicted pseudo-time? #185

Closed LiJiaqi96 closed 2 years ago

LiJiaqi96 commented 2 years ago

Hi, thanks a lot for developing the slingshot pacakge. I like it a lot!

While using established trajectory to predict new data points, I found that the predicted pseudo-time may not be in the same coordinate as the trajectory. Here are my code and results:

library(slingshot)
data("slingshotExample")
rd <- slingshotExample$rd
cl <- slingshotExample$cl
pto <- slingshot(rd, cl, start.clus = '1')
rd <- rd[!is.na(slingPseudotime(pto)[,1]),]
cl <- cl[!is.na(slingPseudotime(pto)[,1])]
pto <- slingshot(rd, cl, start.clus = '1')

(I changed the example to only one branch for better illustration :)

plot(rd, pch=16, asp = 1, cex=2)
lines(SlingshotDataSet(pto), lwd=2, col='black')

The trajectory and training data are shown as follows:
ori_traj I visualized the distribution of pseudo-times on the trajectory:

hist(slingPseudotime(pto), breaks=20)

ori_pstime_hist Then I retained the data points with pseudo-time higher than 5:

plot(rd[slingPseudotime(pto)>5,], pch=16, asp = 1, cex=2)
lines(SlingshotDataSet(pto), lwd=2, col='black')

with results shown below:
sub_traj We can see that the pseudo-time is smaller on the left and higher on the right.
Next, I generated predicted data in the range ([4,6], [1,3]), which is in the similar range of a part of training data:

x <- cbind(runif(100, min = 4, max = 6), runif(100, min = 1, max = 3))

and make predictions:

pred <- predict(pto, x)

Let's have a look at these data points together with the trajectory:
pred_data_sub_traj I suppose the predicted pseudo-time(s) should large than 5 as the correspoding training data's pseudo-time is definitely larger than 5. But I drewed the histogram and found that they are smaller than 3:
pred_pstime_hist

I could not figure out how to interpret the predicted pseudo-time, as I think they should be in the same range with similar data points in the training set.

Also, could you please let me know how can I obtain the "actual pseudo-time" using slingshot package? In this way, the pseudo-time are comparable between training and predicted datasets.

Thanks a lot!

Jackie

kstreet13 commented 2 years ago

Hi @LiJiaqi96,

Thank you very much for raising this issue! Sorry for not responding sooner, but hopefully you saw that I mentioned this in the princurve repo and we are working on a solution!

Best, Kelly

LiJiaqi96 commented 2 years ago

Hi Kelly @kstreet13,

Thanks for your quick reply. It would be great if the two set of pseudo-time can be aligned. Look forward to your solution!

Best, Jackie

kstreet13 commented 2 years ago

Sorry for the delay on this while we work on a proper solution. I was thinking about it recently and realized that there is a possible workaround, which is to predict on the "full" dataset (the original data plus the new data) and then subset down to just the new data. There are some issues with this implementation, but I think it should work for most applications.

Here is the function:

# object = PseudotimeOrdering or SlingshotDataSet
# newdata = matrix of points to predict
# output = PseudotimeOrdering
temp_predict <- function(object, newdata = NULL){
    object <- as.PseudotimeOrdering(object)
    fullx <- rbind(slingReducedDim(object), newdata)
    pred <- predict(object, fullx)
    return(pred[-seq_len(nrow(object)),])
}

And some evidence that it seems to work on your example:

library(slingshot)
data("slingshotExample")
rd <- slingshotExample$rd
cl <- slingshotExample$cl
pto <- slingshot(rd, cl, start.clus = '1')
rd <- rd[!is.na(slingPseudotime(pto)[,1]),]
cl <- cl[!is.na(slingPseudotime(pto)[,1])]
pto <- slingshot(rd, cl, start.clus = '1')
x <- cbind(runif(100, min = 4, max = 6), runif(100, min = 1, max = 3))

pred1 <- predict(pto, x)
range(slingPseudotime(pred1))
# [1] 0.000000 2.559177

pred2 <- temp_predict(pto, x)
range(slingPseudotime(pred2))
# [1]  8.669684 11.228877

Hope this helps! I will be sure to update this thread when we have a more stable solution in place. Best, Kelly

LiJiaqi96 commented 2 years ago

Thanks for your suggestions. The code works on my project!
Also look forward to future update of slingshot package. I like it a lot!

Best, Jackie