theislab / destiny

R package for single cell and other data analysis using diffusion maps
https://theislab.github.io/destiny/
GNU General Public License v3.0

Is density normalization applied to the kernel? #25

Open davisidarta opened 4 years ago

davisidarta commented 4 years ago

Hi,

I would like to know whether the parameter density_norm controls whether the Gaussian kernel is normalized. I'm particularly interested in this because that's how Setty et al. (https://www.nature.com/articles/s41587-019-0068-4) define their kernel prior to computing the diffusion operator, which seems an especially robust approach for single-cell data.

If that is not the case, is there any way to make it into destiny? In other words, how hard-coded is this into the implementation?

Edit: Sorry, I have one more question. Is it possible to set the alpha parameter of the diffusion operator?

flying-sheep commented 4 years ago

The default is to normalize the transition probability matrix by density:

https://github.com/theislab/destiny/blob/c262b9e13a09fe2b17e5921b36f9e53522e612c4/R/diffusionmap.r#L231

https://github.com/theislab/destiny/blob/c262b9e13a09fe2b17e5921b36f9e53522e612c4/R/diffusionmap.r#L432-L442
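As a rough illustration of what normalizing by density can look like, the sketch below assumes it means dividing each kernel entry by the product of the row sums of its two points and then row-normalizing the result into transition probabilities. That assumption, the toy values, and the helper names `density_normalize` and `row_normalize` are made up for this example; it is not destiny's code, so check the linked lines for the actual implementation.

```python
def density_normalize(kernel):
    """Divide each entry K[i][j] by d[i] * d[j], where d holds the row sums of K."""
    d = [sum(row) for row in kernel]
    n = len(kernel)
    return [[kernel[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]


def row_normalize(matrix):
    """Scale every row to sum to 1, which yields a Markov transition matrix."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([value / total for value in row])
    return result


# Toy symmetric kernel for three points; the numbers are invented.
kernel = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]

normalized = density_normalize(kernel)
transitions = row_normalize(normalized)

for row in transitions:
    print([round(value, 3) for value in row])
```

Under this assumption the density-normalized matrix stays symmetric, and only the final row normalization makes it asymmetric.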

Is that what you mean?

davisidarta commented 4 years ago

Thanks for the answer, @flying-sheep! That's not exactly what I meant; my question was specifically about the kernel. I'm comparing the performance of the destiny and palantir kernel implementations. The normalization by density seems to correspond to palantir's multispace-scaling.

flying-sheep commented 4 years ago

Hmm, density_norm does what you see in the code quoted above. The kernel is applied here:

https://github.com/theislab/destiny/blob/e952e95fc93d46d2de3555245bd54c4af525e13c/R/diffusionmap.r#L408-L427

The default kernel width sigma is set to 'local', which means that it’s adapted to density:

https://github.com/theislab/destiny/blob/e952e95fc93d46d2de3555245bd54c4af525e13c/R/diffusionmap.r#L303-L304
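To make "adapted to density" concrete, here is a minimal sketch that assumes the local width of a point is simply its distance to the k-th nearest neighbour. The function name `local_sigmas`, the toy data, and the choice of k are hypothetical; destiny's own computation is in the lines linked above and may differ in detail.

```python
import math


def local_sigmas(points, k):
    """Return, for each point, the Euclidean distance to its k-th nearest neighbour."""
    sigmas = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        sigmas.append(dists[k - 1])
    return sigmas


# Toy 1-D data: a dense cluster near 0 and two isolated points.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (2.0,), (4.0,)]
sigmas = local_sigmas(points, k=2)

for point, sigma in zip(points, sigmas):
    print(point, round(sigma, 3))
```

Points in the dense cluster get small widths, while the isolated points get large ones, so the kernel reaches further where the data are sparse.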

davisidarta commented 4 years ago

Thank you for your answer, @flying-sheep.

I'm still having trouble understanding how the package relates to the original Diffusion Maps algorithm (Coifman et al., 2005) and to destiny's associated publications (Haghverdi, 2015, 2016).

I'm especially confused regarding the alpha parameter and the local scaling. According to Coifman, an adaptive kernel ('local normalized' according to Haghverdi 2016) is accompanied by an alpha parameter, which controls how much the sampling distribution is allowed to bias the diffusion operator. However, this should be independent of the kernel itself (an anisotropic kernel can be built with alpha = 0.5 or 1, for instance). Destiny is one of the few diffusion maps packages lacking the choice of this parameter, and it is not clear from the documentation whether this means that alpha is simply not taken into account or is set to a default value. If alpha is not taken into account, how is the Laplace-Beltrami operator approximated?
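For readers following along, the one-parameter family being referred to can be written as below. This is the standard formulation from the diffusion maps literature, written with a generic symmetric kernel k; the symbols q, d, and p are notation chosen for this sketch and say nothing about how destiny implements it.

```latex
\begin{align}
q(x) &= \sum_{y} k(x, y), \\
k^{(\alpha)}(x, y) &= \frac{k(x, y)}{q(x)^{\alpha}\, q(y)^{\alpha}}, \\
d^{(\alpha)}(x) &= \sum_{y} k^{(\alpha)}(x, y), \\
p^{(\alpha)}(x, y) &= \frac{k^{(\alpha)}(x, y)}{d^{(\alpha)}(x)}.
\end{align}
```

With alpha = 0 the density estimate q has no effect and the operator is the usual normalized graph Laplacian; alpha = 1 removes the influence of the sampling density and approximates the Laplace-Beltrami operator; alpha = 0.5 gives the Fokker-Planck case.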

flying-sheep commented 4 years ago

“Destiny is one of the few diffusion maps packages lacking the choice of this parameter”

Could you please list them? I don’t know that many implementations: in R, AFAIK most people are using destiny these days; in Python, people are using scanpy, which has parameters for its kernels; and nobody uses MATLAB anymore.


Regarding your question: destiny allows the choice between two kernels, a Gaussian kernel with a global kernel width and (as in scanpy) one with a kernel width that’s simply the distance to the kth nearest neighbor (as an approximation for local density).
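The two kernel choices can be sketched side by side as follows. The locally scaled form is written the way I understand it from the DPT supplement, so treat the exact constants as an assumption and check the paper; `global_kernel` and `local_kernel` are hypothetical names, not functions exported by destiny or scanpy.

```python
import math


def global_kernel(x, y, sigma):
    """Gaussian kernel with a single width shared by all points."""
    return math.exp(-math.dist(x, y) ** 2 / (2 * sigma ** 2))


def local_kernel(x, y, sigma_x, sigma_y):
    """Gaussian kernel with one width per point (assumed locally scaled form)."""
    width_sq = sigma_x ** 2 + sigma_y ** 2
    # The prefactor equals 1 when both widths agree and shrinks as they diverge.
    prefactor = math.sqrt(2 * sigma_x * sigma_y / width_sq)
    return prefactor * math.exp(-math.dist(x, y) ** 2 / (2 * width_sq))


x, y = (0.0, 0.0), (1.0, 1.0)

print(global_kernel(x, y, sigma=1.0))
print(local_kernel(x, y, sigma_x=0.5, sigma_y=2.0))
```

In this form the local kernel stays symmetric in its two points, and with equal widths it reduces to the global kernel with a width larger by a factor of sqrt(2).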

We started using the locally adaptive kernel after submission of the destiny paper and before the DPT paper.

References

  1. Coifman 2005: Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps
  2. Haghverdi 2015: Diffusion maps for high-dimensional single-cell analysis of differentiation data
  3. Angerer 2016: destiny: diffusion maps for large-scale single-cell data in R
  4. Haghverdi 2016: Diffusion pseudotime (DPT) robustly reconstructs lineage branching

The DPT paper defines the locally adaptive kernel in its supplementary materials, “1.1 Locally scaled transition matrix”. α is only mentioned in Coifman (2005), “One-Parameter Family of Diffusion Maps”.