LouisFaure / scFates

a scalable python suite for tree inference and advanced pseudotime analysis from scRNAseq data.
https://scfates.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Maximum complexity of trees [question] #29

Open vitkl opened 7 months ago

vitkl commented 7 months ago

Hi @LouisFaure

I am interested in using scFates to identify the tree of cell populations - but I am concerned about the complexity of the data. How complex can the tree be? For example, imagine a neurogenesis dataset where multiple region-specific stem cells (RG) give rise to 3 trees of developing neurons:

[Image: Screenshot 2024-03-03 at 01 11 08]

Can the entire tree be analysed, or should it be pre-segmented into simpler parts?

LouisFaure commented 7 months ago

Hi @vitkl, the SimplePPT or ElPiGraph tree algorithms could likely capture the whole tree. The question is whether all segments of the tree are explained by cellular differentiation. If that is not the case (for example, the source of variation in the RG base region could be cell location instead), then that path should not be included in the main tree. On the other hand, if everything is explained by cellular differentiation, what would be the point of origin in the stem cell population? There might be no clear point of origin, in which case scFates won't be able to resolve it. Finally, in my opinion it is fine to split the tree into multiple subtrees/paths; that would facilitate analysis and visualisation. You can use the related built-in functions.
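To make the idea of splitting a tree into subtrees concrete, here is a minimal sketch in plain Python. It does not use the scFates API: the node names, the `edges` list, and the helpers `build_adjacency` and `extract_subtree` are all hypothetical and only illustrate the operation on a toy graph.

```python
from collections import deque

# Hypothetical toy tree (not real data): nodes are milestone labels,
# edges are undirected links between neighbouring milestones.
edges = [
    ("RG", "fork1"),
    ("fork1", "neuron_A"),
    ("fork1", "fork2"),
    ("fork2", "neuron_B"),
    ("fork2", "neuron_C"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def extract_subtree(adj, root, subtree_root):
    """Return the nodes at or below `subtree_root` when the tree is rooted at `root`."""
    # First pass: breadth-first search from the root to record each node's parent.
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)

    # Second pass: walk away from the root, starting at `subtree_root`.
    keep = {subtree_root}
    queue = deque([subtree_root])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour != parent[node] and neighbour not in keep:
                keep.add(neighbour)
                queue.append(neighbour)
    return keep


adj = build_adjacency(edges)
subtree = extract_subtree(adj, root="RG", subtree_root="fork2")
print(sorted(subtree))
```

Note that the split only makes sense once a root has been chosen, which is the same constraint raised above: without a clear point of origin, "below this node" is not defined.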

Hope that helps!

vitkl commented 6 months ago

Thank you for explaining. It sounds like you need to know quite a lot about the tree beforehand. In particular, you need to separate parallel differentiation processes that cannot be traced back to a shared starting point in a given dataset (for example, when the shared starting point is present only at earlier, unprofiled time points).