theislab / cellrank

CellRank: dynamics from multi-view single-cell data
https://cellrank.org
BSD 3-Clause "New" or "Revised" License

CellRank on hematopoiesis #732

Closed GabrielBaldissera closed 2 years ago

GabrielBaldissera commented 3 years ago

Hello!

Thank you for developing this nice tool!

I have tried to run RNA velocity on my hematopoiesis data using scvelo, but the result turns out to be very noisy, as discussed here with @VolkerBergen. I read your manuscript and thought that the propagation of noise in velocity and the restriction of fates to the pseudotime graph might help to see trajectories more clearly, but I guess that the biology of hematopoiesis is making it very complicated.

I have used the velocities previously calculated as described in my previous post to calculate fates. Here are some of my results:

image image

We can see cells transitioning towards 0 (blue cluster), which does not make sense biologically since it is the opposite of how the endothelial-to-hematopoietic transition would happen.

image image

I am primarily interested in seeing where the clusters close to cluster 0 are going.

Would you guys have any thoughts on whether and how I could use CellRank to overcome these limitations for my hematopoiesis data? There is a precedent: one paper that ran velocity on hematopoiesis in mice using several time points. That is why we are considering collecting more samples at different time points for hematopoiesis, let's say a few hours apart. However, we are not sure whether that is worth it.

I read #611 and I was wondering if I should try an approach without infusing velocities.

Have you guys tried to validate your dataset using hematopoiesis as well? In the manuscript I saw that so far you have used lung and pancreas. Do you think that might be interesting?

Thank you very much for your time! Let me know if I was not clear!

Best,

Gabriel

Marius1311 commented 3 years ago

Hi Gabriel! Thanks for using CellRank, and thanks for opening this interesting issue. Please see this recent perspective for known limitations of RNA velocity on hematopoiesis: https://www.embopress.org/doi/full/10.15252/msb.202110282

Until some of these challenges are addressed methodologically, I would be careful with quantitative interpretation of RNA velocity in a hematopoiesis setting (RNA velocity works great in most other settings). CellRank propagates noise and resolves velocity problems in many settings (restriction to the phenotypic manifold, stochastic formulation, etc.). However, if your velocity estimate is systematically biased, no method will be able to correct for that.

So for now, if you have other means to infuse directionality into your hematopoiesis data, I would go for those alternatives. In CellRank, these are

Marius1311 commented 3 years ago

@GabrielBaldissera, was this helpful?

GabrielBaldissera commented 3 years ago

Hi @Marius1311 Thanks so much!

I read the perspective paper and I was thinking about it.

So, right now we think that our direction will be to get real experimental time points. We have zebrafish samples at 26h post fertilization, WT and MT, and we are considering getting some new time points before and after 26h. We hope this will help us see those trajectories more clearly, regardless of velocities. Do you think that at the end of the day it could help with the velocities as well, or will the biology of this data still be a huge factor hiding that trajectory?

Would you have any advice/consideration for us to include in our plan of getting more samples? If we do get more samples, I will try running CellRank on those as well.

Thank you very much for your time!!

Marius1311 commented 3 years ago

Yes, I do think getting more time-points would be good. To run CellRank on time-series data, you can then use the real-time kernel, which is a wrapper around Waddington OT (Schiebinger et al., Cell 2019). So make sure to read that regarding spacing of time points etc. In general, methods like Waddington OT, which are based on optimal transport, will work well if your time points are not spaced too far apart (depends on your bio process) and the asynchrony within each time point is not too large (again, depends on the bio system).
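For readers unfamiliar with the optimal-transport idea mentioned above, here is a toy sketch of how cells at an earlier time point can be coupled to cells at a later one. It is not CellRank's or Waddington OT's implementation: it is a plain entropic (Sinkhorn) solver in pure Python, the cells are hypothetical 1-D coordinates, and the function names, `epsilon`, and `n_iter` are assumptions made only for illustration. Real analyses work in a gene-expression or PCA space and also model growth and death.

```python
import math


def sinkhorn_coupling(early, late, epsilon=0.5, n_iter=200):
    """Entropic OT coupling between two 1-D cell populations (uniform marginals)."""
    n, m = len(early), len(late)
    # Gibbs kernel K = exp(-cost / epsilon), with squared distance as the cost.
    kernel = [[math.exp(-((x - y) ** 2) / epsilon) for y in late] for x in early]
    a = [1.0 / n] * n  # mass of each early cell
    b = [1.0 / m] * m  # mass of each late cell
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def row_normalize(coupling):
    """Turn a coupling into per-cell transition probabilities (rows sum to 1)."""
    return [[p / sum(row) for p in row] for row in coupling]


# Hypothetical positions of three cells at an early and a late time point.
early = [0.0, 0.1, 1.0]
late = [0.2, 1.1, 1.2]

coupling = sinkhorn_coupling(early, late)
transitions = row_normalize(coupling)

for i, row in enumerate(transitions):
    print(f"early cell {i}:", [round(p, 3) for p in row])
```

Each row of `transitions` gives the probability that an early cell maps to each late cell. The coupling relies on late cells lying close to their likely ancestors, which is why closely spaced time points and limited asynchrony within a time point matter for this family of methods.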

Whether this will also help with velocities is hard to say - maybe by the time you have the data, there will be dedicated velocity models around that take time-points into account. Hard to predict.

Marius1311 commented 2 years ago

Closing this for now, feel free to re-open or continue in a discussion