MarioniLab / miloR

R package implementation of Milo for testing for differential abundance in KNN graphs
https://bioconductor.org/packages/release/bioc/html/miloR.html
GNU General Public License v3.0
339 stars 22 forks

Design matrix not of full rank #227

Closed brianpenghe closed 2 years ago

brianpenghe commented 2 years ago

I was following this tutorial. Is there something I still missed?

%%R -i design_df -o DA_results
## Define neighbourhoods
milo <- makeNhoods(milo, prop = 0.1, k = 20, d=100, refined = TRUE)

## Count cells in neighbourhoods
milo <- countCells(milo, meta.data = data.frame(colData(milo)), sample="batch")

## Calculate distances between cells in neighbourhoods
## for spatial FDR correction
milo <- calcNhoodDistance(milo, d=100)

## Test for differential abundance
DA_results <- testNhoods(milo, design = ~ dissection + chemistry + stage_group, design.df = design_df)

Error in glmFit.default(sely, design, offset = seloffset, dispersion = 0.05,  : 
  Design matrix not of full rank.  The following coefficients not estimable:
 stage_groupLate

MikeDMorgan commented 2 years ago

The error implies either that variables in your model are collinear or that the design matrix contains more columns than rows. You can inspect the design matrix with model.matrix(~ dissection + chemistry + stage_group, data = design_df). Given that the error message names stage_groupLate, a likely cause is perfect confounding between the Late level of stage_group and one of the other variables in your model.

I would strongly recommend that you check whether your design is valid before running any testing.
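
A minimal sketch of such a check, using base R only. The column names come from the model formula in this thread; the six-row toy design_df and all of its factor levels ("A", "B", "C", "v2", "v3", "Early") are invented for illustration and are not the poster's data. It compares the rank of the design matrix with its number of columns and uses the QR pivot to name the coefficients that cannot be estimated.

```r
# Toy design table (hypothetical values): every "Late" sample comes from
# dissection "C", so stage_groupLate is perfectly confounded with dissectionC.
design_df <- data.frame(
  dissection  = factor(c("A", "A", "B", "B", "C", "C")),
  chemistry   = factor(c("v2", "v3", "v2", "v3", "v2", "v3")),
  stage_group = factor(c("Early", "Early", "Early", "Early", "Late", "Late"))
)

# Build the same design matrix that the formula would produce.
mm <- model.matrix(~ dissection + chemistry + stage_group, data = design_df)

# A full-rank design has rank equal to its number of columns.
qr_mm <- qr(mm)
is_full_rank <- qr_mm$rank == ncol(mm)
print(is_full_rank)

# Columns pivoted past the rank are linear combinations of earlier columns.
not_estimable <- character(0)
if (!is_full_rank) {
  not_estimable <- colnames(mm)[qr_mm$pivot[(qr_mm$rank + 1):ncol(mm)]]
}
print(not_estimable)

# A cross-tabulation makes the confounding visible: each dissection level
# appears in only one stage_group level.
print(table(design_df$dissection, design_df$stage_group))

# Dropping the confounded term restores full rank in this toy example.
mm_reduced <- model.matrix(~ chemistry + stage_group, data = design_df)
print(qr(mm_reduced)$rank == ncol(mm_reduced))
```

On real data, the same rank comparison and cross-tabulations of each pair of covariates should show which term to drop or recode. Whether dropping a term is scientifically appropriate depends on the experiment.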

brianpenghe commented 2 years ago

Indeed, removing dissection changed the error messages. Thanks, Mike.