Closed Fantasque68 closed 7 months ago
This is intriguing: the tree seems to be missing node number "10". Are you also seeing this in the tutorials? igraph recently got a new version, but my own tests passed without issue. If the tutorials run well, then send me a reduced version of your dataset (with one gene only; the important elements are in uns, obs and obsm) so I can have a quick look at it.
Your timely reply is greatly appreciated.
Firstly, I have run the tutorials again, and all the code runs smoothly without any error. scf.tl.dendrogram() also performed normally. Besides, the same workflow worked well for another dataset of mine several days ago.
My igraph version is 0.10.8, and I haven't updated it since installing it last year.
I have mailed the onedrive link to you.
Thanks!
It looks like running scf.tl.pseudotime with multiple mappings leads to a loss of the progenitor segment (using a single mapping recovers it). This is likely because that segment is quite short, with the few cells composing it being too close to the two other segments. This leads them to be assigned to those segments in many mappings.
This indicates an instability of the learned tree; I would recommend exploring other, more stable topologies.
In the future, I might implement an exception that would stop scf.tl.pseudotime if it leads to an affected tree structure.
Thanks for your help, but I don't quite understand what "scf.tl.pseudotime with multiple mappings" means. How do I use a single mapping?
- Here is the tree before cleaning.
- Here is the tree after cleaning. I chose branch == 96 to cut.
Pseudotime is calculated by assigning each cell to its closest principal point/node. As described in the methods, with multiple mappings the pseudotime assignment is performed n times, each time selecting the node randomly using the values from the R soft assignment matrix as probabilities.
To summarize over all mappings, the mean pseudotime is calculated. This can lead to pseudotime values being higher or lower than those of the assigned segment. This happens especially when ppt_sigma is high, as this leads to higher R soft assignment to nodes over a broader region around a given cell. In that case, cells from the progenitor branch would often get assigned to the two other branches due to proximity.
Running over one mapping is as simple as setting the parameter n_map=1, but the fact that this is the solution shows that the tree lacks robustness in the first place.
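To make the difference between a single mapping and multiple mappings concrete, here is a minimal standalone sketch in plain Python. It does not call scFates and is not its implementation: the three-node tree, the node pseudotime values, the R row and the helper names (closest_node, sampled_nodes) are all made up for illustration, and pseudotime is simplified to the value of the assigned node.

```python
import random

# Toy principal tree (all values are made up for illustration):
# node 0 sits on a short progenitor segment, nodes 1 and 2 on the two branches.
node_pseudotime = [0.5, 2.0, 2.2]
node_segment = ["progenitor", "branch_a", "branch_b"]

# Hypothetical row of the soft assignment matrix R for one progenitor cell.
# A high ppt_sigma would spread the weights over neighbouring nodes like this.
r_row = [0.4, 0.3, 0.3]


def closest_node(r_row):
    """Single mapping: take the node with the highest soft-assignment weight."""
    return max(range(len(r_row)), key=lambda i: r_row[i])


def sampled_nodes(r_row, n_map, seed=0):
    """Multiple mappings: draw one node per mapping, using R as probabilities."""
    rng = random.Random(seed)
    return rng.choices(range(len(r_row)), weights=r_row, k=n_map)


# Single mapping keeps the cell on its progenitor node.
single = node_pseudotime[closest_node(r_row)]

# Multiple mappings: average the pseudotime over all sampled assignments.
draws = sampled_nodes(r_row, n_map=1000)
averaged = sum(node_pseudotime[n] for n in draws) / len(draws)
off_segment = sum(node_segment[n] != "progenitor" for n in draws) / len(draws)

print(f"single mapping pseudotime: {single}")
print(f"mean pseudotime over {len(draws)} mappings: {averaged:.2f}")
print(f"fraction of mappings outside the progenitor segment: {off_segment:.2f}")
```

In this toy case the averaged pseudotime is pulled well above the progenitor node's value, and the cell lands outside the progenitor segment in more than half of the mappings, which is the kind of drift described above. In scFates itself, the single-mapping behaviour is selected by passing n_map=1 to scf.tl.pseudotime.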
Thanks for your patience and time, I figured it out now. The issue will be closed.
Thanks to @LouisFaure for this great pseudotime analysis package. It has helped me a lot.
However, in recent days I encountered an error I haven't met before when running scf.tl.dendrogram() on my own dataset. In the past few months, this function ran normally. The error message shows ValueError: no such vertex, plus the start node I selected, for example "ValueError: no such vertex: '10'". What does that mean? And how can I solve it?
By the way, all other functions called before scf.tl.dendrogram() run normally in my code. Thanks!
The anndata object
The code is pasted here.
And the error message.