dpeerlab / Palantir

Single cell trajectory detection
https://palantir.readthedocs.io
GNU General Public License v2.0

Similarity Kernel Computation #144

Open Merlin2333 opened 1 week ago

Merlin2333 commented 1 week ago

Hello,

I was wondering whether it makes sense to replace the gene expression vector x_i with a lower-dimensional representation (such as PCs) when computing the similarity kernel K(x_i, x_j). Essentially, I am worried that using gene expression directly will suffer from the curse of dimensionality. I am not sure if this concern applies here, but I would appreciate it if you could provide some explanation.

Thanks!

katosh commented 1 week ago

Hello @Merlin2333,

Thank you for the inquiry. The kernel is computed by the palantir.utils.compute_kernel function. By default, this function uses the anndata.obsm["X_pca"] representation to compute the kernel, so the kernel is already built on PCs rather than on raw gene expression, and the function will fail if no PCA is present. However, it also takes a pca_key argument that lets you specify any other representation stored in anndata.obsm, so you can use it to experiment. Note that the function specifically computes the diffusion kernel from Euclidean distances in the given representation, and other representations might change the downstream interpretation of the Palantir result. I would be intrigued to hear what you find out!
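To make the role of the representation concrete, here is a minimal NumPy sketch. It is not Palantir's implementation: the helper names pca_scores and gaussian_kernel are hypothetical, the data is random toy data, and the kernel is a simple dense Gaussian with a per-cell bandwidth taken from the distance to the knn-th neighbor. It only illustrates that the kernel depends on Euclidean distances in whatever representation you pass in.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)  # center each gene (column)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def gaussian_kernel(rep, knn=5):
    """Dense Gaussian kernel on Euclidean distances in `rep`.

    Illustrative only: each cell's bandwidth is its distance to the
    knn-th nearest neighbor. This is an assumption made for the sketch,
    not the exact formula used by palantir.utils.compute_kernel.
    """
    sq = np.sum(rep ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * rep @ rep.T, 0.0)
    dist = np.sqrt(d2)
    # After sorting each row, column 0 is the cell itself.
    sigma = np.sort(dist, axis=1)[:, knn]
    return np.exp(-d2 / (sigma[:, None] * sigma[None, :]))


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))         # toy matrix: 50 cells x 200 genes
pcs = pca_scores(X, n_components=10)   # lower-dimensional representation

K_genes = gaussian_kernel(X)    # kernel from the full gene space
K_pcs = gaussian_kernel(pcs)    # kernel from the PC space
```

With Palantir itself, the analogous step is to store your alternative representation under a key in anndata.obsm and pass that key as pca_key.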