kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

How to find the distance matrix on version 2.0? #172

Closed wendelljpereira closed 2 years ago

wendelljpereira commented 2 years ago

Dear authors,

Thank you very much for working on this great package!

I have code based on dynverse's wrapper for slingshot (ti_slingshot.R: https://github.com/dynverse/ti_slingshot/blob/master/package/R/ti_slingshot.R) that worked until slingshot was updated to version 2.0.

I was able to pinpoint the problem: in the new version, the distance matrix is no longer in the list of parameters. The example below shows the issue.

## Using version 1.4.0

> require(slingshot)
> data("slingshotExample")
> sds <- getLineages(rd, cl)
Using full covariance matrix
> slingParams(sds)$dist
          1         2         3         5         4
1  0.000000  5.972164 27.398732 34.023581 33.131187
2  5.972164  0.000000  6.437212 24.506357 17.972025
3 27.398732  6.437212  0.000000  7.961784  8.349517
5 34.023581 24.506357  7.961784  0.000000 52.976486
4 33.131187 17.972025  8.349517 52.976486  0.000000

## Using version 2.0.0

> require(slingshot)
> data("slingshotExample")
> rd <- slingshotExample$rd
> cl <- slingshotExample$cl
> sds <- getLineages(rd, cl)
> slingParams(sds)$dist
NULL

I also tried this:

> sds2 <- slingshot::as.SlingshotDataSet(sds)
> slingParams(sds2)$dist
NULL

Is there a way to recover the distance matrix in version 2.0?

Thank you!

Best Regards, Wendell

kstreet13 commented 2 years ago

Hi @wendelljpereira,

Hm, thanks for bringing this to my attention! We didn't know anyone was using the matrix of distances between clusters.

I must admit that I don't fully understand what it is being used for in that code, but it looks like it only needs distances between connected clusters in the MST. While slingshot typically uses scaled distances (based on the shapes of the clusters), it is still possible to recover the raw, Euclidean distances between cluster centers via E(slingMST(pto))$weight, where pto is a PseudotimeOrdering object produced by getLineages/slingshot and E() is the edge accessor from igraph.
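As a minimal sketch of the suggestion above (not the maintainers' code), the MST edge weights can be pulled out of a version 2.0 object and arranged as a cluster-by-cluster matrix. It assumes slingshot 2.0 or later and igraph are installed; the variable names edge_dists and mst_dist are illustrative only.

```r
# Sketch only: assumes slingshot >= 2.0 and igraph are installed.
library(slingshot)
library(igraph)

data("slingshotExample")
rd <- slingshotExample$rd   # reduced-dimension coordinates
cl <- slingshotExample$cl   # cluster labels

pto <- getLineages(rd, cl)  # PseudotimeOrdering object in version 2.0
mst <- slingMST(pto)        # igraph object holding the cluster MST

# Edge weights, as described in the reply above.
E(mst)$weight

# One row per MST edge: the two clusters it connects and its weight.
edge_dists <- igraph::as_data_frame(mst, what = "edges")

# Square cluster-by-cluster matrix built from the edge weights.
# Cluster pairs that are not joined by an MST edge are 0 here, so this
# is not the full pairwise matrix that version 1.4.0 returned.
mst_dist <- as_adjacency_matrix(mst, attr = "weight", sparse = FALSE)
```

Because only connected clusters carry a weight, this matrix is enough for code that walks the MST but not for code that needs every pairwise distance.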

If that doesn't help, then you may have to do some digging through TrajectoryUtils::createClusterMST, which is how slingshot handles MST construction since the update. Specifically, it uses the non-exported function .dist_clusters_scaled to calculate distances in the usual slingshot way (fyi, you can access this function with the triple colon: TrajectoryUtils:::.dist_clusters_scaled). There's a little adjustment done after this to ensure that distances along the diagonal are 0 and off the diagonal are positive, but otherwise, it should produce the distance matrix you are looking for.
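For readers who would rather not depend on a non-exported function, the sketch below rebuilds a full pairwise matrix in base R. It is not the TrajectoryUtils implementation: it assumes that "scaled" means a Mahalanobis-like distance between cluster centers that uses the sum of the two clusters' covariance matrices, which is an assumption not confirmed in this thread. The helper scaled_cluster_dist and the toy data are hypothetical.

```r
# Hypothetical helper, not part of slingshot or TrajectoryUtils.
# Assumption: the scaled distance between clusters i and j is
#   (mu_i - mu_j)' (S_i + S_j)^{-1} (mu_i - mu_j)
# with mu the cluster center and S the full covariance matrix.
scaled_cluster_dist <- function(rd, cl) {
  cl <- as.character(cl)
  ids <- unique(cl)
  centers <- lapply(ids, function(k) colMeans(rd[cl == k, , drop = FALSE]))
  covs <- lapply(ids, function(k) cov(rd[cl == k, , drop = FALSE]))
  n <- length(ids)
  # Start from zeros so the diagonal stays 0.
  d <- matrix(0, n, n, dimnames = list(ids, ids))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      delta <- centers[[i]] - centers[[j]]
      d[i, j] <- d[j, i] <- sum(delta * solve(covs[[i]] + covs[[j]], delta))
    }
  }
  d
}

# Toy data for illustration only: three well-separated 2-D clusters.
set.seed(1)
rd_toy <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
                matrix(rnorm(40, mean = 5), ncol = 2),
                matrix(rnorm(40, mean = 10), ncol = 2))
cl_toy <- rep(c("a", "b", "c"), each = 20)

toy_dist <- scaled_cluster_dist(rd_toy, cl_toy)
toy_dist
```

The result is symmetric with a zero diagonal and positive off-diagonal entries, matching the adjustment described above. Values should be checked against the version 1.4.0 output before relying on them.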

Let me know how it goes and if there's anything else I can help with! Kelly

wendelljpereira commented 2 years ago

Hi @kstreet13,

Thank you very much for such a quick and thorough response!

To clarify why I am using this data: I built a web application for scRNA-seq called Asc-Seurat. I added dynverse to the app to allow the use of multiple models for trajectory inference. However, I found that running the models contained in dynverse takes much longer because each model is wrapped in a Docker image.

Since slingshot is my favorite model, I added an option for users to run it without the dynverse wrapper, which is a bit faster. However, I would like to keep a consistent visual representation of the trajectories, regardless of which model users choose.

Therefore, I modified the dynverse wrapper I mentioned in my previous message so that my web application could show slingshot's inferred trajectory using dynverse's plots, like the ones here: https://dynverse.org/users/3-user-guide/4-visualisation/. It worked with the previous version of slingshot, but now I am updating my app and found the issue above.

I will follow your instructions and let you know if it works.

Once again, thank you! Wendell