MarioniLab / miloR

R package implementation of Milo for testing for differential abundance in KNN graphs
https://bioconductor.org/packages/release/bioc/html/miloR.html
GNU General Public License v3.0

First level loss when running testNhoods with multiple contrasts and multiple batch #303

Closed Tang-RH closed 5 months ago

Tang-RH commented 5 months ago

Thank you for developing miloR, a powerful tool for performing DA. I'm attempting to conduct DA across multiple conditions using data from different batches, and I've been following the guidance in the using contrasts tutorial. Here's my design dataframe:

> head(design)
       Sample  Transplant  Batch
       <fct>   <fct>       <fct>
BM105  BM105   Diagnosis   B1
BM109  BM109   Transplant  B1
BM113  BM113   Transplant  B1
BM117  BM117   Transplant  B1
BM119  BM119   Diagnosis   B1
BM121  BM121   1st         B1

The Transplant column has 4 levels ('Healthy', 'Diagnosis', '1st', 'Transplant') and the Batch column has 3 levels (B1, B2, B3). I attempted to run testNhoods as follows:

contrast <-  'TransplantDiagnosis - TransplantHealthy'
dares <- testNhoods(milo_merge, reduced.dim="PCA_HARMONY", model.contrasts = contrast,
                     design = ~ 0 + Batch + Transplant, 
                     design.df = design, fdr.weighting = "graph-overlap")

However, I encountered an error:

Error in eval(ej, envir = levelsenv): object 'TransplantHealthy' not found

The error is similar to issue #101, which suggests adding a '0' at the start of the design formula so that every level is present in the model matrix. I implemented this fix, but when I constructed the model matrix manually:

> model.matrix(~ 0 + Batch + Transplant, data=design) %>% head(3)
       BatchB1  BatchB2  BatchB3  TransplantDiagnosis  Transplant1st  TransplantTransplant
BM105  1        0        0        1                    0              0
BM109  1        0        0        0                    0              1
BM113  1        0        0        0                    0              1

I noticed that all three levels of the Batch column are present, but the first level of the Transplant column is missing. How can I perform a comparison between Healthy and Diagnosis with batch correction? Did I make a mistake when using contrasts with batch as a covariate?

MikeDMorgan commented 5 months ago

Hi @Tang-RH try swapping the position of Transplant and Batch around in the model matrix formula.
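
For reference, a minimal sketch of what the swap does, using base R only. The `toy_design` data frame and its sample names are made up for illustration (they are not the poster's data); only the factor levels are copied from the question. It checks that both terms in the contrast string exist as model matrix columns once Transplant comes first.

```r
# Hypothetical toy design with the same factor levels as in the question.
toy_design <- data.frame(
  Sample = paste0("S", 1:12),
  Transplant = factor(
    rep(c("Healthy", "Diagnosis", "1st", "Transplant"), times = 3),
    levels = c("Healthy", "Diagnosis", "1st", "Transplant")
  ),
  Batch = factor(rep(c("B1", "B2", "B3"), each = 4))
)

# Transplant comes first, so all four of its levels get their own column;
# Batch now loses its first level (B1) instead.
mm_swapped <- model.matrix(~ 0 + Transplant + Batch, data = toy_design)
print(colnames(mm_swapped))

# Both terms named in the contrast string should now be model matrix columns.
contrast <- "TransplantDiagnosis - TransplantHealthy"
contrast_terms <- strsplit(contrast, " - ", fixed = TRUE)[[1]]
print(all(contrast_terms %in% colnames(mm_swapped)))
```

With the objects from the question, the corresponding change would be `design = ~ 0 + Transplant + Batch` in the testNhoods call, with the other arguments left as they were.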

Tang-RH commented 5 months ago

Thank you so much for your prompt response! I tried it and it worked. I have a quick follow-up question: does the order of covariates in the model matrix formula matter?

MikeDMorgan commented 5 months ago

Yes, this seems to be a quirk of model.matrix that I've never been able to figure out. In testNhoods (without contrasts) the last variable in the formula is taken as the test variable. Contrasts alter that behaviour, so this should be noted clearly in the documentation. I will add it to the to-do list.
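
A possible explanation, offered as a sketch rather than an authoritative account: when the intercept is removed, R's `model.matrix` expands only the first factor in the formula into one indicator column per level. Later factors still drop their reference level, because keeping it would make the columns linearly dependent. The example below uses a made-up balanced design (`toy` is hypothetical) to compare the two orderings and to show that adding the dropped level back by hand makes the matrix rank deficient.

```r
# Hypothetical balanced design: every Transplant level appears in every batch.
toy <- data.frame(
  Transplant = factor(
    rep(c("Healthy", "Diagnosis", "1st", "Transplant"), times = 3),
    levels = c("Healthy", "Diagnosis", "1st", "Transplant")
  ),
  Batch = factor(rep(c("B1", "B2", "B3"), each = 4))
)

batch_first <- model.matrix(~ 0 + Batch + Transplant, data = toy)
transplant_first <- model.matrix(~ 0 + Transplant + Batch, data = toy)

# Only the first factor in each formula keeps all of its levels.
print(colnames(batch_first))
print(colnames(transplant_first))

# Both orderings give 6 columns and are full rank.
print(c(ncol(batch_first), qr(batch_first)$rank))
print(c(ncol(transplant_first), qr(transplant_first)$rank))

# Re-adding the dropped Healthy indicator yields 7 columns but the rank
# stays at 6: the Batch columns and the four Transplant columns each sum
# to a column of ones, so one column is redundant.
healthy <- as.numeric(toy$Transplant == "Healthy")
over_specified <- cbind(batch_first, TransplantHealthy = healthy)
print(c(ncol(over_specified), qr(over_specified)$rank))
```

So the ordering decides which factor has every level available by name, which is why a contrast that names the reference level only resolves when that factor is listed first.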