MarioniLab / miloR

R package implementation of Milo for testing for differential abundance in KNN graphs
https://bioconductor.org/packages/release/bioc/html/miloR.html
GNU General Public License v3.0
316 stars 20 forks source link

Can't interpret plotNhoodGraphDA and plotUMAP #320

Closed giuditta2024 closed 2 months ago

giuditta2024 commented 2 months ago

Hi all! First of all, I would like to thank you for producing this useful tool.

I already went through other issues, but I did not find any other that could solve my problem.

I'm working with a dataset of cells with different genotypes for a variant (WT, homozygous, heterozigous, 80bp deletion, 16bp deletion) each captured at different timepoints (24h, 48h, 72h, 96h).

I tested, for instance, each timepoint against all the others, like this, to compare the first timepoint 24 h versus all the others:

model <- model.matrix(~ 0 + timepoint, data=adata_design)

ave.contrast <- c("timepoint24h - (timepoint48h + timepoint72h + timepoint96h)/3")

mod.constrast <- makeContrasts(contrasts=ave.contrast, levels=model)

da_results <- testNhoods(adata_milo, 
                         design = ~ 0 + timepoint, 
                         design.df = adata_design, 
                         model.contrasts = ave.contrast, 
                         fdr.weighting="graph-overlap")

And then visualized like that:

adata_milo <- buildNhoodGraph(adata_milo)

plotUMAP(adata_milo, colour_by="Assigned_GFPgenotype") + 
   plotNhoodGraphDA(adata_milo, da_results, alpha=0.1) +
  plot_layout(guides="collect" )
image

In this first example above I colored the UMAP by genotype and I assumed that since the Nhoods in dark blue (positive logFC) are overlapping with the 16bp deletion genotype, the cells with 16bp del are more abundant in the timepoint 24h. Is that correct or am I assuming something wrong?

Then, once I test for other timepoint, for istance the timepoint 96 h vs all the others, the results are more confused:

model <- model.matrix(~ 0 + timepoint, data=adata_design)
ave.contrast <- c("timepoint96h - (timepoint24h + timepoint48h + timepoint72h)/3")
mod.constrast <- makeContrasts(contrasts=ave.contrast, levels=model)

da_results <- testNhoods(adata_milo, 
                         design = ~ 0 + timepoint, 
                         design.df = adata_design, 
                         model.contrasts = ave.contrast, 
                         fdr.weighting="graph-overlap")
adata_milo <- buildNhoodGraph(adata_milo)

plotUMAP(adata_milo, colour_by="Assigned_GFPgenotype") + plotNhoodGraphDA(adata_milo, da_results, alpha=0.1) +
  plot_layout(guides="collect" )
image

So in this case I cannot understand which genotype is more responsible for the shift here.

I apologize in advance for the very naive question, I hope I did not misintepret completely the task.

Thank you!

MikeDMorgan commented 2 months ago

You can annotate the nhoods based on single-cell level meta-data using annotateNhoods. You can then visualise the distribution of the LFCs using plotDAbeeswarm to give you an intuition about the relationships between your results and genotypes.

giuditta2024 commented 2 months ago

Thank you!