Closed msclz closed 6 months ago
Hi @msclz,
Thanks for the report! I don't think I've seen this issue before, but hopefully we can get to the bottom of it.
What code did you use to create the lineages
object? I assume it was getLineages
, but seeing the specific line of code might be helpful.
I think setting allow.breaks
to FALSE
was a good call, but I would generally advise using approx.points
, as it significantly speeds up the computation, in most cases.
What are the sizes of your input clusters? A lot of issues seem to come down to a very small input cluster (1 or 2 cells) or NA
values in the vector of cluster labels. If you had a vector of cluster labels, can you show the output of table(<clus.vec>)
?
Best, Kelly
Hi,
So I have used this coding to get my lineages
:
dimred <- data.log.GC@reductions$umap@cell.embeddings
clustering <- data.log.GC$integrated_snn_res.0.2
lineages <- getLineages(data = dimred,
clusterLabels = clustering,
reducedDim="umap",
start.clus="9",
end.clus="12")
I generated the clustering
object starting from a subset from a bigger Seurat object as you can see above.
Here is what I get if I look for any N/As in the clustering vector:
> any(is.na(clustering))
[1] FALSE
> length(clustering)
[1] 5510
> levels(clustering)
[1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12"
sames goes if I ask for NAs in the Lineages object, the return is FALSE.
Thanks again for helping me sorting this, Math
Oh, sorry I forgot to add in my previous message,
If I set approx_points
to a certain value, I get this new error message when running getCurves
> curves <-getCurves(lineages, allow.breaks = FALSE, approx_points = 300)
Error in vapply(seq_len(p), function(jj) { : values must be length 300,
but FUN(X[[1]]) result is length 5510
In addition: Warning messages:
1: In avg.jj * pct :
longer object length is not a multiple of shorter object length
2: In orig.jj * (1 - pct) :
longer object length is not a multiple of shorter object length
Sorry for the spam
Ok, sorry again for coming back at you, but I am making (some?) progress.
Apparently, removing the argument start.clus
and end.clus
during the generation of lineages allow for the code to run and for me to plot the curves.
> lineages <- getLineages(data = dimred,
+ clusterLabels = clustering,
+ reducedDim="umap",
+ #start.clus="9",
+ #end.clus="12",
+ )
> curves <-getCurves(lineages, allow.breaks = FALSE, approx_points = F)
> curves
class: PseudotimeOrdering
dim: 5510 2
metadata(4): lineages mst slingParams curves
pathStats(2): pseudotime weights
cellnames(5510): AACACGTGTTAAAGAC-1_1_1 AAGGTTCTCTCAACTT-1_1_1 ... Donor1.TTTGTCAAGAACTCGG.2_15 Donor1.TTTGTCATCCAAACTG.2_15
cellData names(2): reducedDim clusterLabels
pathnames(2): Lineage1 Lineage2
pathData names(0):
However, as you can imagine, the lines are kind of all over the place now...
Hi @msclz,
Thanks for the updates!
One thing I notice is that you don't need the argument reducedDim = "umap"
when you run getLineages
, since you are directly supplying the matrix of coordinates. That argument is for when the input data
is a SingleCellExperiment object and you want to use a dimensionality reduction from the reducedDims
slot. This might be the issue and if so, I'll need to do something about it.
Otherwise, it still might help to see table(clustering)
, as factors can sometimes be weird, especially when working with a subset of a larger dataset (factors will "remember" all the original levels, even if you subset down to remove a few).
And I'm not sure if it will help, but another thing you might try would be plotting the MST after you run getLineages
with start.clus = "9"
. You can do this the same way you'd plot the curves, but with type = "l"
(for "lineages").
Best, Kelly
Hi @kstreet13
So, I removed the reducedDim
argument, and kept my defining stat.clus
and end.clus
arguments, here is what I get when I plot the lineages.
> lineages <- getLineages(data = dimred,
+ clusterLabels = clustering,
+ #reducedDim="umap",
+ start.clus="9",
+ end.clus="12",
+ )
>
> plot(dimred, col=pal[clustering], cex = 0.5, pch = 16)
> for (i in levels(clustering)) {
+ text(mean(dimred[clustering == i, 1])+0.2, mean(dimred[clustering == i, 2])+1.5, labels = i, font = 2, cex=2)
+ } #add cluster labels
> lines(SlingshotDataSet(lineages), lwd = 3, col = "black")
I do find my three lineages: 9-1/9-0-2-11/9-0-2-4-3-10-8-7-6-5-12
So I don't think the reduceDim
argument is in question here, I do have the same errors for the curves whever or not it is present in the generationof the object lineages
.
Regarding the table(clustering)
, I also don't see anything weird from the output:
> table(clustering)
clustering
0 1 2 3 4 5 6 7 8 9 10 11 12
1150 666 665 468 437 418 367 327 255 241 200 158 158
>
One thing I noticed though: the getCurves
fonction is not working only when both start.clus
and end.clus
arguments are used in the generation of lineages
.
If I define only a start.clus="9"
, I get this result, which is better than previsouly but does not seem too match what plotting the MST gave me just prior with cluster 9 as the starting point...
> lineages <- getLineages(data = dimred,
+ clusterLabels = clustering,
+ #reducedDim="umap",
+ start.clus="9",
+ #end.clus="12",
+ )
> curves <-getCurves(lineages, allow.breaks = FALSE, approx_points = FALSE)
> plot(dimred, col=pal[clustering], cex = 0.5, pch = 16)
> for (i in levels(clustering)) {
+ text(mean(dimred[clustering == i, 1])+0.2, mean(dimred[clustering == i, 2])+1.5, labels = i, font = 2, cex=2)
+ } #add cluster labels
> lines(SlingshotDataSet(curves), lwd = 3, col = "black")
>
Thanks again for taking the time to help me out, Math
Hi @kstreet13, I am still trying to play around with the code and trying to understand where that initial error is coming from
curves <-getCurves(lineages, allow.breaks = FALSE, approx_points = FALSE)
Error in if (box.vals[1] == box.vals[5]) { :
missing value where TRUE/FALSE needed
My two main questions are
1- why does it seems that specifying a both start.clus
and end.clus
get me this error,
2- why when I only specify the first start cluster, the root of the curves does not seem to go on that specified cluster...
On that line, my latest (baby) step; apparently when I used the omega = T
argument, now weirldy the getCurves
can function with both start.clus
and end.clus
....
> lineages <- getLineages(data = dimred,
+ clusterLabels = clustering,
+ #reducedDim="umap",
+ start.clus="9",
+ end.clus=c("12","1","11"), omega=T
+ )
>
> curves <-getCurves(lineages, allow.breaks = FALSE, approx_points = F)
>
> plot(dimred, col=pal[clustering], cex = 0.5, pch = 16)
> for (i in levels(clustering)) {
+ text(mean(dimred[clustering == i, 1])+0.2, mean(dimred[clustering == i, 2])+1.5, labels = i, font = 2, cex=2)
+ } #add cluster labels
> lines(SlingshotDataSet(curves), lwd = 3, col = "black", type="c") + lines(SlingshotDataSet(curves), lwd = 3, col = "dodgerblue", type="l")
Hi @msclz ,
extend = "n"
in getCurves()
and see if that changes things.And I'm pretty surprised that setting omega = TRUE
didn't break the connection between clusters 4 and 2, as they seem pretty distant on the UMAP plot. Although there are some weird points with a mix of different cluster labels nearby, which could be causing that (Louvain clustering can sometimes create issues like this).
Best, Kelly
Hello ! First of all, I'd like to thank you for this really nice package.
I am not a trained bioinformatician, and this is my first time posting on github, so I apologize in advance if I am not being super clear.
I am trying to get some curves out of my lineages object:
Also, I am using the version slingshot_2.8.0
However, when I applied the getCurves function, I have this first error:
Then, by setting the parameters 'allow.breaks' and 'approx_points' to 'FALSE', I removed that previous error, but I am now facing this message:
From that point, I am kind of lost, Would you have any idea on how to fix this?
Thanks again,