kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

Pick one lineage from many for conditiontest #241

Open TdzBAS opened 8 months ago

TdzBAS commented 8 months ago

Hi @kstreet13,

After using slingshot I get 5 different lineages. My dataset is integrated across two conditions, and I am interested in the genes that differ between those two conditions for one specific lineage (not all lineages). How do I know which lineage is which? How can I pick this one specific lineage? How can I run conditionTest with only this specific lineage?

I would really appreciate any help!

Thanks! Best, Tolga

kstreet13 commented 8 months ago

Hi Tolga,

There are a few ways you can tell which lineage is which. Assuming you have a PseudotimeOrdering object (pst), you can print it as a SlingshotDataSet with as.SlingshotDataSet(pst), which will tell you the sequence of clusters that each lineage was based on.

Alternatively, you can plot the different lineages in different colors. So plot(as.SlingshotDataSet(pst), col=1:5) will show each lineage in a different color, following the default R color palette (1 = black, 2 = red, 3 = green, 4 = blue, 5 = cyan).
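As a rough illustration of picking lineages by the clusters they pass through, here is a minimal base R sketch. `lineages_through` is a hypothetical helper (it is not part of slingshot), and `toy_lineages` is a made-up stand-in for the named list of cluster sequences that `slingLineages(pst)` returns.

```r
# Hypothetical helper, not part of slingshot: return the names of the
# lineages whose cluster sequence contains a given cluster label.
lineages_through <- function(lineages, cluster) {
  hits <- vapply(lineages, function(l) cluster %in% l, logical(1))
  names(lineages)[hits]
}

# Toy stand-in for slingLineages(pst); the cluster labels are made up.
toy_lineages <- list(
  Lineage1 = c("1", "2", "3"),
  Lineage2 = c("1", "2", "4"),
  Lineage3 = c("1", "5")
)

# Lineages that pass through cluster "2"
through_2 <- lineages_through(toy_lineages, "2")

# Lineages whose final (terminal) cluster is "4"
is_end_4 <- vapply(toy_lineages, function(l) l[length(l)] == "4", logical(1))
ends_in_4 <- names(toy_lineages)[is_end_4]
```

With a real object, the same call would be `lineages_through(slingLineages(pst), "4")`, using whatever cluster labels your clustering produced.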

And when you call conditionTest, just specify lineages = TRUE and it will split the results by lineage, so you can pick out the one(s) you are interested in.
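Once the results are split by lineage, pulling out one lineage is a matter of selecting its columns. The sketch below uses a toy data frame with made-up numbers rather than real `conditionTest` output. It assumes the per-lineage columns end in `_lineage<k>` (check `colnames()` on your own result), and `lineage_columns` is a hypothetical helper, not a tradeSeq function.

```r
# Hypothetical helper: keep only the result columns for lineage `k`.
# Assumes per-lineage columns end in "_lineage<k>"; verify with colnames(res).
lineage_columns <- function(res, k) {
  keep <- grepl(paste0("_lineage", k, "$"), colnames(res))
  if (!any(keep)) stop("no columns found for lineage ", k)
  res[, keep, drop = FALSE]
}

# Toy results table with made-up numbers, shaped like a per-lineage output.
toy_res <- data.frame(
  waldStat = c(12.1, 0.8),
  pvalue = c(0.001, 0.67),
  waldStat_lineage1 = c(10.3, 0.2),
  pvalue_lineage1 = c(0.002, 0.9),
  waldStat_lineage2 = c(1.8, 0.6),
  pvalue_lineage2 = c(0.4, 0.7),
  row.names = c("geneA", "geneB")
)

# Keep lineage 2 only, then adjust its p-values for multiple testing.
lin2 <- lineage_columns(toy_res, 2)
lin2$padj <- p.adjust(lin2$pvalue_lineage2, method = "fdr")
```

Adjusting p-values within the chosen lineage, rather than across all lineages, is a design choice that fits the goal of testing a single lineage of interest.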

Best, Kelly

TdzBAS commented 8 months ago

Hi @kstreet13,

Thanks for this awesome hint! The reason I want to pick one lineage of interest is to reduce the runtime of fitGAM. Is it possible to fit only a specific lineage upstream of conditionTest? Or is the best approach to choose my lineage of interest, look at the clusters involved, subset to them, and then run fitGAM on the subsetted clusters from the chosen lineage?

Best, Tolga

kstreet13 commented 8 months ago

Ah, that makes sense. I'm not sure if there's a way to do that (it's more of a tradeSeq question, so you might have more luck asking on that repo). I do know that subsetting down to just the clusters from a particular lineage won't work well, because you won't get the same trajectory that you would if you had included all the cells. But there might be a way to fit the full trajectory, then construct an artificial PseudotimeOrdering object with only the lineage of interest, where you still have all the cells, but many of them have weights of 0. Running fitGAM on that might be faster, depending on how it handles the 0 weights. There might also be problems with cells that are assigned small weights, as we generally assume that, for a given cell, the weights across all possible lineages sum to 1.
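The single-lineage idea can be sketched on plain matrices, as an untested illustration rather than a confirmed workflow. `keep_one_lineage` is a hypothetical helper, and the toy matrices below are made up to mimic the cells-by-lineages shape of slingshot's weight and pseudotime matrices, with `NA` pseudotime for cells not assigned to a lineage.

```r
# Hypothetical helper: reduce cells-by-lineages weight and pseudotime
# matrices to a single lineage while keeping every cell (row).
keep_one_lineage <- function(weights, pseudotime, lineage) {
  stopifnot(identical(dim(weights), dim(pseudotime)),
            lineage %in% colnames(weights))
  w <- weights[, lineage, drop = FALSE]
  pt <- pseudotime[, lineage, drop = FALSE]
  # Cells off this lineage get weight 0 and a placeholder pseudotime of 0.
  off <- is.na(pt)
  w[off] <- 0
  pt[off] <- 0
  list(weights = w, pseudotime = pt, n_zero = sum(w == 0))
}

# Toy inputs with made-up values: 3 cells, 2 lineages.
cells <- paste0("cell", 1:3)
lins <- c("Lineage1", "Lineage2")
toy_w <- matrix(c(1, 0.5, 0, 0, 0.5, 1), ncol = 2, dimnames = list(cells, lins))
toy_pt <- matrix(c(1, 2, NA, NA, 2.5, 3), ncol = 2, dimnames = list(cells, lins))

one <- keep_one_lineage(toy_w, toy_pt, "Lineage2")
```

With real data, the inputs would presumably be `slingCurveWeights(pst)` and `slingPseudotime(pst)`, and the outputs would be passed to `fitGAM` through its `cellWeights` and `pseudotime` arguments. Whether `fitGAM` accepts cells whose only weight is 0, and how it treats weights that no longer sum to 1, are exactly the open questions raised above, so it is worth checking on a small subset first.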