dpeerlab / Palantir

Single cell trajectory detection
https://palantir.readthedocs.io
GNU General Public License v2.0

Similarity Kernel Computation #144

Closed Merlin2333 closed 4 months ago

Merlin2333 commented 5 months ago

Hello,

I was wondering whether it makes sense to replace the gene expression vector x_i with a lower-dimensional representation (such as PCs) when computing the similarity kernel K(x_i, x_j). Essentially, I am worried that using raw gene expression will suffer from the curse of dimensionality. I am not sure whether that concern applies here, but I would appreciate it if you could provide some explanation.

Thanks!

katosh commented 4 months ago

Hello @Merlin2333,

Thank you for the inquiry. The kernel is computed by the palantir.utils.compute_kernel function. By default, this function uses the anndata.obsm["X_pca"] representation to compute the kernel, so it already works in PCA space rather than on raw gene expression, and it will fail if no PCA is present. However, it also takes a pca_key argument that allows you to specify any other representation stored in anndata.obsm, so you can use this to experiment. Note that the function specifically computes diffusion distance based on the Euclidean distances in the given representation, and other representations might change the downstream interpretation of the Palantir result. I would be intrigued to hear back about what you find out!
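To make the dependence on the representation concrete, here is a minimal NumPy sketch. It is not Palantir's implementation: it uses a toy self-tuning Gaussian kernel as a stand-in, and the synthetic data, the helper name toy_gaussian_kernel, and the choice of k and 10 components are assumptions made only for illustration. It only shows that the kernel is a function of Euclidean distances in whatever matrix is passed in, so swapping the representation changes the kernel.

```python
import numpy as np

def toy_gaussian_kernel(rep, k=5):
    """Toy self-tuning Gaussian kernel on Euclidean distances.

    Hypothetical helper for illustration only; NOT Palantir's kernel.
    `rep` is a (cells x features) array, e.g. the contents of an obsm entry.
    """
    # Pairwise squared Euclidean distances between rows (cells).
    sq = np.sum(rep ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * rep @ rep.T, 0.0)
    # Per-cell bandwidth: distance to the k-th nearest neighbor
    # (index 0 of each sorted row is the cell itself).
    sigma = np.sqrt(np.sort(d2, axis=1)[:, k])
    return np.exp(-d2 / (sigma[:, None] * sigma[None, :]))

# Synthetic stand-in for an expression matrix: 50 cells x 200 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))

# PCA via SVD of the centered data; keep the first 10 components.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = U[:, :10] * S[:10]

K_genes = toy_gaussian_kernel(X)    # kernel on all 200 gene dimensions
K_pca = toy_gaussian_kernel(X_pca)  # kernel on the 10-dimensional PCA space

# With Palantir itself, the analogous switch would go through pca_key,
# e.g. palantir.utils.compute_kernel(ad, pca_key="X_custom"), where
# "X_custom" is a hypothetical key you have stored in ad.obsm.
```

Both kernels have the same shape (cells x cells), but their entries differ because the underlying distances differ, which is why a different representation can change the downstream interpretation.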