Closed DAOl44732 closed 6 months ago
Hi @DAOl44732, thank you for reporting! Could you provide us with the output of palantir.plot.plot_branch_selection(ad)? This may help us understand why no cells are selected for the branch, and whether some parameters of the branch selection step should be changed.
I'm sorry. That was my mistake. The output of palantir.plot.plot_branch_selection(ad) is as follows:
palantir.plot.plot_branch_selection(ad)
<Figure size 1500x1500 with 6 Axes>
And the result of masks = palantir.presults.select_branch_cells(ad, eps=0):
masks = palantir.presults.select_branch_cells(ad, eps=0)
print(masks)
[[False False False] [False False False] [False False False] ... [False False False] [False False False] [False False False]]
print("Selected cells:", masks.sum())
Selected cells: 0
Hoping for your reply!
Thank you! It seems no cells are being selected.
This could be due to two different reasons.
Depending on which version of Palantir you are using, there might be NaNs in the fate probability values that prevent a good branch selection. You can see the number of NaNs in the fate probabilities by running ad.obsm["palantir_fate_probabilities"].isna().sum(). What does your output look like?
It might be due to a stringent choice of parameters such as eps=0. While this is the value used for the tutorial dataset, it is often better to use a higher value, either by leaving eps at its default or by setting it to something between 0 and 1. The same is true for the parameter q, which can also be increased to be more tolerant and to include more cells in the branch selection.
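To make the NaN check concrete, here is a minimal sketch on a toy table. It assumes the fate probabilities are a pandas DataFrame with one row per cell and one column per terminal state; the branch names, cell names, and values are made up for illustration. With real data, the DataFrame would be ad.obsm["palantir_fate_probabilities"].

```python
import numpy as np
import pandas as pd

# Toy stand-in for ad.obsm["palantir_fate_probabilities"] (hypothetical values):
# rows are cells, columns are terminal states / branches.
fate_probs = pd.DataFrame(
    {
        "branch_A": [0.7, 0.2, np.nan, 0.5],
        "branch_B": [0.2, 0.5, np.nan, 0.3],
        "branch_C": [0.1, 0.3, np.nan, 0.2],
    },
    index=["cell_1", "cell_2", "cell_3", "cell_4"],
)

# Number of NaNs per branch (what .isna().sum() reports).
nan_per_branch = fate_probs.isna().sum()

# Which cells carry at least one NaN, and how many NaNs there are in total.
cells_with_nan = fate_probs.index[fate_probs.isna().any(axis=1)].tolist()
total_nan = int(fate_probs.isna().sum().sum())

print(nan_per_branch)
print("Cells with NaN:", cells_with_nan)
print("Total NaN values:", total_nan)
```

Looking at the affected cells, and not only at the per-branch counts, shows whether the NaNs are confined to a handful of cells or spread across the dataset.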
Thank you! The output of ad.obsm["palantir_fate_probabilities"].isna().sum() is as follows:
ad.obsm["palantir_fate_probabilities"].isna().sum()
HC_HRA001261_HRS280155_PT_TGTTTGTCAGCCTATA 3
LC_GSE200972_GSM6047625_PT_CTGCATCTCTTCCTAA 3
OC_GSE184880_GSM5599227_PT_GCACGTGCAATAGAGT 3
In addition, we adjusted eps, but the result still showed no cells:
masks = palantir.presults.select_branch_cells(ad, eps=0.99)
print("Selected cells:", masks.sum())
Selected cells: 0
masks = palantir.presults.select_branch_cells(ad, eps=1)
print("Selected cells:", masks.sum())
Selected cells: 0
masks = palantir.presults.select_branch_cells(ad, eps=0)
print("Selected cells:", masks.sum())
Selected cells: 0
Thanks! It seems like there are a total of 9 NaN values in the fate probabilities. This should fix the problem:
ad.obsm["palantir_fate_probabilities"] = ad.obsm["palantir_fate_probabilities"].fillna(0)
You can still play around with eps and q. Usually values between 0 and 0.2 work well.
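As an illustration of why a few NaNs can empty a selection entirely, here is a small sketch using plain NumPy and pandas. It only demonstrates general NaN behaviour (quantiles propagate NaN, and every comparison against NaN is False); it is not a description of Palantir's internal code, and the threshold rule and values below are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy fate probabilities with one cell whose values are NaN (made-up data).
fate_probs = pd.DataFrame(
    {
        "branch_A": [0.9, 0.6, np.nan, 0.1],
        "branch_B": [0.1, 0.4, np.nan, 0.9],
    }
)

# A single NaN makes the quantile NaN, and comparing against NaN is always False,
# so a quantile-based threshold selects nothing.
values = fate_probs["branch_A"].to_numpy()
threshold = np.quantile(values, 0.5)
mask_before = values >= threshold

# After replacing NaNs with 0, the same (hypothetical) rule selects cells again.
filled = fate_probs.fillna(0)
values_filled = filled["branch_A"].to_numpy()
threshold_filled = np.quantile(values_filled, 0.5)
mask_after = values_filled >= threshold_filled

print("Selected before fillna:", int(mask_before.sum()))
print("Selected after fillna:", int(mask_after.sum()))
```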
Thank you for reporting. I will try to make this more robust in a future patch.
The latest version on GitHub should now also be able to do branch selection if there are NaNs in the fate probabilities. You can try it out by installing it with
pip install 'git+https://github.com/dpeerlab/Palantir'
Please let me know if you need any further help with this issue!
Thank you very much! I solved the problem using ad.obsm["palantir_fate_probabilities"] = ad.obsm["palantir_fate_probabilities"].fillna(0). The result is as follows:
I would like to know exactly which cell subpopulations this time curve goes through. Is this achievable? Thanks again for your answers!
I think the boolean masks generated by the branch selection might help you. For example, you can use the masks to subset your AnnData object, and then count and plot the cells of a specific branch:
branch_name = "c3_DNT_APOE"
mask = ad.obsm["branch_masks"][branch_name]
sc.pl.embedding(ad[mask, :], "umap", color="celltype")
The code above assumes that you have a column ad.obs["celltype"] in your AnnData object.
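To list which cell types a branch contains, and roughly in what order it passes through them, the same mask can be combined with the cell annotations. Below is a minimal sketch on toy pandas tables standing in for ad.obs and ad.obsm["branch_masks"]. The branch names, cell types, and pseudotime values are made up, and the column name "pseudotime" is an assumption for this example (use whichever column holds the pseudotime in your object).

```python
import pandas as pd

# Toy stand-ins for ad.obs and ad.obsm["branch_masks"] (hypothetical values).
cells = [f"cell_{i}" for i in range(6)]
obs = pd.DataFrame(
    {
        "celltype": ["HSC", "HSC", "Mono", "Mono", "Ery", "Mono"],
        "pseudotime": [0.0, 0.1, 0.5, 0.7, 0.6, 0.9],
    },
    index=cells,
)
branch_masks = pd.DataFrame(
    {
        "branch_A": [True, True, True, True, False, True],
        "branch_B": [True, True, False, False, True, False],
    },
    index=cells,
)

branch_name = "branch_A"
mask = branch_masks[branch_name]

# How many cells of each type fall into the selected branch.
composition = obs.loc[mask, "celltype"].value_counts()

# Cell types ordered by their median pseudotime within the branch.
order = obs.loc[mask].groupby("celltype")["pseudotime"].median().sort_values()

# Cell-type counts for every branch side by side.
table = pd.DataFrame(
    {b: obs.loc[branch_masks[b], "celltype"].value_counts() for b in branch_masks}
).fillna(0).astype(int)

print(composition)
print(order)
print(table)
```

Ordering by median pseudotime is only a rough summary. Cell types that overlap in pseudotime will not separate cleanly this way.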
Hello, how is the gene expression trend calculated by Mellon? I have read Mellon's article, but I still have questions about the process of calculating gene expression trend. Could you please describe the calculation process in detail?
Hello @yitengfei120011,
Thank you for your question! I'd be happy to clarify how Mellon calculates the gene expression trend.
Mellon models the gene expression trend using a Gaussian Process (GP). A GP is a probabilistic model in which any finite collection of random variables (in this case, gene expression values at different pseudotime points) has a joint Gaussian distribution. The trend function we estimate is treated as a sample from this GP.
In Mellon, the covariance between function values is defined by the Matern52 kernel, which is a common choice for modeling smooth, yet flexible functions. The key parameters for this kernel are:
The covariance function essentially encodes our belief that points close in pseudotime should have more similar expression values, while points further apart might have less similar values.
Once the GP is defined, we condition the trend on the observed gene expression data across cells. This is done by leveraging the properties of the Multivariate Normal distribution, allowing us to update our prior belief (the GP) with the actual data to get a posterior distribution. The mean of this posterior distribution is the gene expression trend that Mellon estimates.
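The conditioning step can be sketched in a few lines of NumPy. This is a textbook GP regression with a Matern52 kernel, not Mellon's actual implementation. The length scale, amplitude, noise level, the constant-mean handling, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def matern52(x1, x2, length_scale=0.2, amplitude=1.0):
    """Matern 5/2 covariance between two 1-D arrays of pseudotime values."""
    r = np.abs(x1[:, None] - x2[None, :]) / length_scale
    s = np.sqrt(5.0) * r
    return amplitude**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)

def gp_trend(pseudotime, expression, grid, noise_std=0.1):
    """Posterior mean of a GP conditioned on noisy expression values."""
    mean = expression.mean()  # simple constant prior mean (an assumption)
    K = matern52(pseudotime, pseudotime)
    K_star = matern52(grid, pseudotime)
    # Solve (K + noise^2 I) w = y - mean instead of forming an explicit inverse.
    A = K + noise_std**2 * np.eye(len(pseudotime))
    weights = np.linalg.solve(A, expression - mean)
    return mean + K_star @ weights

# Synthetic example: a smooth trend observed with noise in 150 "cells".
rng = np.random.default_rng(0)
pseudotime = np.sort(rng.uniform(0.0, 1.0, 150))
expression = np.sin(2 * np.pi * pseudotime) + rng.normal(0.0, 0.1, 150)

grid = np.linspace(0.0, 1.0, 50)
trend = gp_trend(pseudotime, expression, grid)
print(trend[:5])
```

The returned vector is the posterior mean evaluated on the grid, which corresponds to the estimated trend described above.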
For large datasets, directly computing the trend can be computationally expensive due to the size of the covariance matrix. Mellon addresses this by using inducing "landmark" points. These points act as a sparse approximation to the full dataset, enabling efficient computation without significantly compromising accuracy. The number of inducing points is typically set to match the number of grid points where the trend is evaluated.
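The idea of landmark points can be illustrated with a standard sparse GP approximation, in which the trend is expressed through a small set of inducing points so that only a small linear system has to be solved. The sketch below uses the generic "subset of regressors" predictive mean, not Mellon's own code. The evenly spaced landmarks, the kernel settings, the zero prior mean, the jitter, and the synthetic data are all assumptions.

```python
import numpy as np

def matern52(x1, x2, length_scale=0.2, amplitude=1.0):
    """Matern 5/2 covariance between two 1-D arrays of pseudotime values."""
    r = np.abs(x1[:, None] - x2[None, :]) / length_scale
    s = np.sqrt(5.0) * r
    return amplitude**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)

def sparse_gp_trend(pseudotime, expression, landmarks, grid, noise_std=0.1, jitter=1e-8):
    """Approximate GP posterior mean (zero prior mean) using inducing landmarks."""
    K_uu = matern52(landmarks, landmarks)   # landmarks vs landmarks
    K_uf = matern52(landmarks, pseudotime)  # landmarks vs cells
    K_su = matern52(grid, landmarks)        # grid vs landmarks
    # Only an (m x m) system is solved, with m = number of landmarks.
    A = noise_std**2 * K_uu + K_uf @ K_uf.T + jitter * np.eye(len(landmarks))
    return K_su @ np.linalg.solve(A, K_uf @ expression)

# Synthetic example: 500 "cells", summarised by 20 landmarks.
rng = np.random.default_rng(1)
pseudotime = np.sort(rng.uniform(0.0, 1.0, 500))
expression = np.sin(2 * np.pi * pseudotime) + rng.normal(0.0, 0.1, 500)

landmarks = np.linspace(0.0, 1.0, 20)
grid = np.linspace(0.0, 1.0, 50)
sparse_trend = sparse_gp_trend(pseudotime, expression, landmarks, grid)
print(sparse_trend[:5])
```

The cost of the solve grows with the number of landmarks, not the number of cells, which is what makes this kind of approximation practical for large datasets.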
For a deeper understanding of these concepts, I recommend exploring resources on Gaussian Processes. Some starting points include:
I hope this helps clarify the process. Let me know if you have any further questions!
Dear Palantir developers, Palantir is a powerful tool for working with pseudotime, but I'm having some problems with it. This is the code I used: