settylab / Mellon

Non-parametric density inference for single-cell analysis.
https://mellon.readthedocs.io
GNU General Public License v3.0

How to run Mellon on data obtained under different processing conditions #10

Open minghao622 opened 1 week ago

minghao622 commented 1 week ago

Hello, this is a very good tool! However, I have some questions about running the code.

  1. I want to calculate the density changes of a specific cell-type subpopulation under different treatments. Should I merge the data from multiple treatments and then run Mellon, or should I run Mellon separately for each treatment?
  2. If I merge the data from multiple treatments, do I need to integrate the data and then use the integrated PCA when running palantir.utils.run_diffusion_maps(adata, pca_key="integrated_pca", n_components=30)?

Thanks for any advice.

katosh commented 1 week ago

Hi @minghao622!

Thanks for your inquiry! We are currently working on establishing a differential abundance framework that will hopefully make this use case a lot easier. However, to answer your questions:

  1. If you want to compare densities between treatments, train a separate model for each treatment with the .fit method. To get density values that you can compare, evaluate each trained model on the merged dataset with the .predict method.
  2. Yes, palantir.utils.run_diffusion_maps should be run on the PCA of the integrated dataset. Be aware, though, that batch-effect correction can itself introduce confounding effects, so it is advisable to check that any finding is robust to the choice of batch-effect correction method.
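As a rough sketch of point 1, the fit-separately, predict-on-merged pattern could look like the following. The helper name `log_density_per_condition` and its arguments are made up for illustration; `make_estimator` is assumed to be any zero-argument callable that returns an object with `.fit(X)` and `.predict(X)`, which with Mellon would presumably be `mellon.DensityEstimator`. `X` is assumed to be the shared cell-state representation of the merged data (for example, the diffusion components computed from the integrated PCA).

```python
import numpy as np


def log_density_per_condition(X, conditions, make_estimator):
    """Fit one estimator per condition, then evaluate each on all cells.

    X              : (n_cells, n_dims) cell-state representation of the merged data
    conditions     : length-n_cells array of treatment labels
    make_estimator : zero-argument callable returning an object with
                     .fit(X) and .predict(X) (hypothetical interface)
    """
    X = np.asarray(X)
    conditions = np.asarray(conditions)
    log_density = {}
    for cond in np.unique(conditions):
        est = make_estimator()
        # Train only on the cells from this treatment.
        est.fit(X[conditions == cond])
        # Evaluate on every cell of the merged dataset so that the
        # values from different treatments line up cell by cell.
        log_density[cond] = np.asarray(est.predict(X))
    return log_density
```

With Mellon installed, a call might look like `log_density_per_condition(X, labels, mellon.DensityEstimator)`, where `X` and `labels` are placeholders for your own arrays. Each entry of the returned dictionary then holds one value per cell of the merged dataset.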

Please note that we haven’t established the units of the density values produced by Mellon. Differences in log-density values correspond to the predicted log-fold change in cell-state abundance, but the absolute values should not be interpreted at this time.
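To make the log-fold-change reading concrete, here is a minimal sketch. It assumes two arrays of log-density values evaluated on the same cells of the merged dataset, one from the model trained on each treatment. The function name and the toy numbers are purely illustrative.

```python
import numpy as np


def log_fold_change(log_density_treated, log_density_control):
    """Per-cell difference of log-densities (treated minus control).

    Both inputs must be evaluated on the same cells, in the same order.
    """
    treated = np.asarray(log_density_treated, dtype=float)
    control = np.asarray(log_density_control, dtype=float)
    if treated.shape != control.shape:
        raise ValueError("inputs must be evaluated on the same cells")
    return treated - control


# Toy values, not real Mellon output.
lfc = log_fold_change([1.0, 2.0], [0.5, 2.5])
```

Positive entries mark cell states predicted to be more abundant under the treatment, and negative entries mark states predicted to be less abundant.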

Stay tuned for our upcoming work on differential cell-state abundance.