hansenlab / tricycle

Outliers in tricycle embedding #2

Open lazappi opened 3 years ago

lazappi commented 3 years ago

Hi. I have been trying out tricycle, but my results look a bit strange. Instead of getting a nice circular embedding, I get a tight ball containing most cells, plus a few outliers:

sce <- tricycle::project_cycle_space(sce, exprs_values = "logcounts", species = "human", gname.type = "SYMBOL")
sce <- tricycle::estimate_cycle_position(sce)
tricycle::plot_emb_circle_scale(sce, dimred = "tricycleEmbedding")

[image: tricycle embedding plot showing a tight ball of cells with a few outliers]

The outliers don't seem to be related to anything obvious, so I am wondering whether you have any ideas about what is going on here, or suggestions for things to look at?

Thanks!

sjczheng commented 3 years ago

Hi Luke,

Thanks for your interest. Yes, it seems that most of the cells in your data are not cycling, which shifts the entire ellipsoid. The shift is caused by mean-centering the genes before projection. We don't recommend using tricycle on a dataset in which the majority of cells are not cycling. A temporary solution is to change the center in the estimate_cycle_position function. We will try to optimize the mean-centering step to fix the shifting issue, and I will let you know when we update the function.

Other than that, the ellipsoid itself should capture the cell cycle progression.
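To make the shift concrete, here is a toy base-R illustration. It is only a sketch in a simulated 2D space, not tricycle's actual projection code, and the cell counts, noise levels, and variable names are all made up for the example. It shows that when most cells sit at one spot on the cycle, the centroid used for mean-centering lands close to that spot rather than at the middle of the ellipse.

```r
set.seed(1)
n_cycling <- 50    # assumed: a small cycling population
n_resting <- 950   # assumed: a large non-cycling population

# Cycling cells spread around a unit circle centred at (0, 0)
theta <- runif(n_cycling, 0, 2 * pi)
cycling <- cbind(cos(theta), sin(theta)) +
  matrix(rnorm(2 * n_cycling, sd = 0.05), ncol = 2)

# Non-cycling cells form a tight ball at one position on the circle
resting <- cbind(rnorm(n_resting, mean = 1, sd = 0.05),
                 rnorm(n_resting, mean = 0, sd = 0.05))

emb <- rbind(cycling, resting)

# Mean-centering moves the origin to the centroid of all cells
centroid <- colMeans(emb)
centered <- sweep(emb, 2, centroid)

# The centroid sits near the ball, far from the true circle centre (0, 0)
print(centroid)
```

After centering, the non-cycling ball sits almost on top of the origin and the cycling cells appear as outliers off to one side, which matches the pattern in the plot above.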

Best, Shijie

lazappi commented 3 years ago

Ok, thanks for the explanation!

kasperdanielhansen commented 3 years ago

We are thinking of how to auto-detect this. Are you surprised to learn that most cells are not cycling?

I am also going to change "we don't recommend using tricycle" to "you need to do something special in that case". I am pretty sure the cell cycle position estimates for all the outliers are correct once you have shifted the origin. If few cells are cycling, you can argue that getting cycle position information for those cells is not going to help you a lot. But if you need it, I would use the tricycle estimates, provided the origin is shifted.
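As a rough illustration of why shifting the origin matters, the base-R sketch below computes a polar angle for each cell around a chosen center. The helper `cycle_angle` and the toy coordinates are hypothetical and are not part of tricycle. In tricycle itself, estimate_cycle_position appears to take center.pc1 and center.pc2 arguments for this purpose, but please check ?estimate_cycle_position in your installed version before relying on that.

```r
# Hypothetical helper: angle in [0, 2*pi) of each cell around a chosen centre
cycle_angle <- function(emb, center = c(0, 0)) {
  atan2(emb[, 2] - center[2], emb[, 1] - center[1]) %% (2 * pi)
}

# Four toy cells on a unit circle whose true centre is (3, 0), not (0, 0)
true_theta <- c(pi / 4, 3 * pi / 4, 5 * pi / 4, 7 * pi / 4)
emb <- cbind(3 + cos(true_theta), sin(true_theta))

# Angles around the default origin are squeezed towards 0 (or 2*pi)
angle_default <- cycle_angle(emb)

# Angles around the shifted origin recover the true positions
angle_shifted <- cycle_angle(emb, center = c(3, 0))

print(round(angle_default, 3))
print(round(angle_shifted, 3))
```

How to pick the center for a real dataset is left open here. Reading it off the embedding plot is one option, but that is a judgment call rather than something the package does for you.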

Best, Kasper

lazappi commented 3 years ago

It probably makes sense in this case that most cells aren't cycling. I hadn't thought about that before, but I can see how it could affect the projection. Cell cycle isn't the main thing of interest here; I'm just looking for some indication of cycling activity to check later in the analysis.

Happy to test things out if you come up with a way of automating the origin shift.