Problems when fitting a linear-mixed model ('glmer') with sc-RNAseq data

aidarripoll commented 3 years ago

Hi,

we are interested in doing differential expression analysis fitting a linear-mixed model ('glmer') using sc-RNAseq data as our experiment involves a large number of individuals (exp.id) grouped in different batches (lane). Specifically, we aim to test the effect of Age (as a continuous variable) adjusting the model for the following covariates: Gender and cngenes (as fixed effects), and lane and exp.id (as random effects).

We observe the differential expression analysis works fine when fitting a glm adding the same covariates (but all of them as fixed effects), but it seems to raise some issues when fitting a glmer.

Here we attach a reproducible example which uses a testing data set host in: http://molgenis26.gcc.rug.nl/downloads/1m-scbloodn/mast/sca_MASTglm_top3degs.rds

Thank you very much! Aida

######## Install and load required packages ######## 
# Install packages
## suppressPackageStartupMessages
shhh <- suppressPackageStartupMessages

## MAST version 1.12.0
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("MAST",version = "3.10")
#> Bioconductor version 3.10 (BiocManager 1.30.10), R 3.6.1 (2019-07-05)
#> Installing package(s) 'MAST'
#> 
#> The downloaded binary packages are in
#>  /var/folders/d1/qxx__vcs56d7pp4tsq15g51m0000gn/T//Rtmp7ShsAu/downloaded_packages
#> Old packages: 'clue', 'Hmisc', 'Matrix.utils'

## data.table
install.packages("data.table")
#> 
#> The downloaded binary packages are in
#>  /var/folders/d1/qxx__vcs56d7pp4tsq15g51m0000gn/T//Rtmp7ShsAu/downloaded_packages

# Load packages
shhh(library(MAST))
#> Warning: package 'SummarizedExperiment' was built under R version 3.6.2
#> Warning: package 'S4Vectors' was built under R version 3.6.3
#> Warning: package 'IRanges' was built under R version 3.6.2
#> Warning: package 'GenomeInfoDb' was built under R version 3.6.3
#> Warning: package 'DelayedArray' was built under R version 3.6.3
#> Warning: package 'matrixStats' was built under R version 3.6.2
#> Warning: package 'BiocParallel' was built under R version 3.6.2
shhh(library(data.table))
#> Warning: package 'data.table' was built under R version 3.6.2

######## Read testing input file ######## 
# Model on the top 3 DEGs found previously with MAST glm (picking top 3 genes from summaryDt):
# zlmCond <- zlm(~Age + Gender + lane + cngeneson, scam.ss)
# summaryCond <- summary(zlmCond, doLRT='Age')
# summaryDt <- summaryCond$datatable
sca <- readRDS(url('http://molgenis26.gcc.rug.nl/downloads/1m-scbloodn/mast/sca_MASTglm_top3degs.rds'))

######## Cross-table: number of cells per individual (exp.id) and Age ######## 
table(sca$exp.id, sca$Age)
#>     
#>       27  28  29  35  36  38  42  43  44  45  46  48  49  50  51  52  53  56
#>   1    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  49   0   0
#>   11   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   12   0   0   0 110   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   13   0   0   0   0   0   0   0   0   0 145   0   0   0   0   0   0   0   0
#>   14   0   0   0   0   0   0   0   0   0   0   0   0   0  82   0   0   0   0
#>   15   0   0   0   0   0   0   0   0   0   0 250   0   0   0   0   0   0   0
#>   16   0   0   0   0   0   0   0 114   0   0   0   0   0   0   0   0   0   0
#>   17   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   18   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 149   0
#>   2    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   20   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   21 220   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   22   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   23   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   24   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   26   0   0   0   0   0   0   0   0   0   0   0 245   0   0   0   0   0   0
#>   27   0   0   0   0   0   0 201   0   0   0   0   0   0   0   0   0   0   0
#>   29   0   0   0   0   0   0   0   0 173   0   0   0   0   0   0   0   0   0
#>   3    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   30   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   31   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   32   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   34   0   0   0   0 188   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   35   0   0   0   0   0 244   0   0   0   0   0   0   0   0   0   0   0   0
#>   36   0   0   0   0   0   0   0   0   0   0 217   0   0   0   0   0   0   0
#>   37   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   38   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   39   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   4    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   40   0   0   0   0   0   0   0   0   0   0   0   0 119   0   0   0   0   0
#>   41   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 123
#>   42   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  84   0   0
#>   44   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   45   0   0   0   0   0   0   0   0   0   0   0   0 140   0   0   0   0   0
#>   46   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   47   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  59
#>   51   0   0   0   0   0   0   0   0   0   0   0 214   0   0   0   0   0   0
#>   52   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 217
#>   58   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   59   0   0   0   0   0   0   0   0   0   0   0   0   0 159   0   0   0   0
#>   60   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   61   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   62   0   0   0   0   0   0   0   0   0   0   0 136   0   0   0   0   0   0
#>   63   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   64   0   0   0   0 172   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   65   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   66   0   0   0   0   0   0   0   0   0   0   0   0   0   0 148   0   0   0
#>   67   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   68   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   69   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 149   0   0
#>   70   0 196   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   71   0   0   0   0   0   0   0   0   0   0   0   0   0   0  72   0   0   0
#>   72   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   73   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   76   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  94
#>   77   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   78   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   79   0   0   0   0   0 446   0   0   0   0   0   0   0   0   0   0   0   0
#>   8    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   81   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   82   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   83   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   84   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   85   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   86   0   0   0   0   0   0   0   0   0   0   0 214   0   0   0   0   0   0
#>   87   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   88   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   89   0   0 233   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   9    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   90   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   91   0   0   0   0   0   0 250   0   0   0   0   0   0   0   0   0   0   0
#>   92   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>     
#>       57  58  59  61  62  63  64  65  66  67  68  69  70  72  74  75  76  77
#>   1    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   11   0   0   0   0   0   0   0   0 178   0   0   0   0   0   0   0   0   0
#>   12   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   13   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   14   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   15   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   16   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   17   0  80   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   18   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   2    0   0   0   0   0   0   0   0   0   0 232   0   0   0   0   0   0   0
#>   20   0   0   0 224   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   21   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   22 315   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   23   0   0   0   0   0   0 111   0   0   0   0   0   0   0   0   0   0   0
#>   24   0   0   0   0   0 186   0   0   0   0   0   0   0   0   0   0   0   0
#>   26   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   27   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   29   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   3    0   0   0   0   0 275   0   0   0   0   0   0   0   0   0   0   0   0
#>   30   0   0   0 131   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   31   0   0   0   0   0  99   0   0   0   0   0   0   0   0   0   0   0   0
#>   32   0   0   0   0  59   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   34   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   35   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   36   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   37   0   0   0   0   0   0   0   0   0  83   0   0   0   0   0   0   0   0
#>   38   0   0   0   0   0 176   0   0   0   0   0   0   0   0   0   0   0   0
#>   39   0   0   0   0   0 410   0   0   0   0   0   0   0   0   0   0   0   0
#>   4    0   0   0   0   0   0   0   0   0   0  88   0   0   0   0   0   0   0
#>   40   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   41   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   42   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   44   0   0   0   0   0   0   0   0   0   0   0 154   0   0   0   0   0   0
#>   45   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   46   0   0  83   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   47   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   51   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   52   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   58   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 136   0
#>   59   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   60   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 111   0   0
#>   61   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  40   0
#>   62   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   63   0   0   0   0   0  80   0   0   0   0   0   0   0   0   0   0   0   0
#>   64   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   65   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  31   0   0
#>   66   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   67   0   0   0   0   0   0   0   0   0   0   0   0   0 177   0   0   0   0
#>   68   0   0   0   0   0   0   0   0   0   0   0   0   0   0 207   0   0   0
#>   69   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   70   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   71   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   72   0   0   0   0   0   0   0 264   0   0   0   0   0   0   0   0   0   0
#>   73   0   0   0 104   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   76   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   77   0   0   0   0   0   0   0   0   0   0   0   0   0   0 123   0   0   0
#>   78   0   0  72   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   79   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   8    0   0   0   0   0   0   0   0 131   0   0   0   0   0   0   0   0   0
#>   81   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 190   0
#>   82   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 169   0
#>   83   0   0   0   0   0   0   0   0   0   0   0   0  88   0   0   0   0   0
#>   84   0   0   0   0   0   0 123   0   0   0   0   0   0   0   0   0   0   0
#>   85   0   0 176   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   86   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   87   0   0   0 119   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   88   0   0   0   0   0   0   0   0   0 347   0   0   0   0   0   0   0   0
#>   89   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   9  221   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   90   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 353
#>   91   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
#>   92   0   0   0   0   0   0   0   0   0   0   0   0   0   0 377   0   0   0

########  Fitting models and testing with MAST ######## 
######## glm (working) ######## 
# testing contrast 'Age' // adjust by: Gender + lane + cngeneson
## model and testing
zlmCond.glm <- zlm(~Age + Gender + lane + cngeneson, 
                   sca)
#> 
#> Done!
summaryCond.glm <- summary(zlmCond.glm, doLRT='Age')
#> Combining coefficients and standard errors
#> Warning in melt(coefAndCI, as.is = TRUE): The melt generic in data.table has
#> been passed a array and will attempt to redirect to the relevant reshape2
#> method; please note that reshape2 is deprecated, and this redirection is now
#> deprecated as well. To continue using melt methods from reshape2 while both
#> libraries are attached, e.g. melt.list, you can prepend the namespace like
#> reshape2::melt(coefAndCI). In the next version, this warning will become an
#> error.
#> Calculating log-fold changes
#> Warning in melt(lfc): The melt generic in data.table has been passed a list
#> and will attempt to redirect to the relevant reshape2 method; please note that
#> reshape2 is deprecated, and this redirection is now deprecated as well. To
#> continue using melt methods from reshape2 while both libraries are attached,
#> e.g. melt.list, you can prepend the namespace like reshape2::melt(lfc). In the
#> next version, this warning will become an error.
#> Calculating likelihood ratio tests
#> Refitting on reduced model...
#> 
#> Done!
#> Warning in melt(llrt): The melt generic in data.table has been passed a list
#> and will attempt to redirect to the relevant reshape2 method; please note that
#> reshape2 is deprecated, and this redirection is now deprecated as well. To
#> continue using melt methods from reshape2 while both libraries are attached,
#> e.g. melt.list, you can prepend the namespace like reshape2::melt(llrt). In the
#> next version, this warning will become an error.
summaryDt.glm <- summaryCond.glm$datatable
summaryDt.glm
summaryDt.glm
#>      primerid component         contrast   Pr(>Chisq)       ci.hi       ci.lo
#>   1:     ACTB         C              Age 3.383874e-47  0.01161094  0.00884375
#>   2:     ACTB         C      (Intercept)           NA  2.30613716  2.09181424
#>   3:     ACTB         C          GenderM           NA -0.02017573 -0.08684229
#>   4:     ACTB         C        cngeneson           NA  0.07680029  0.04729121
#>   5:     ACTB         C lane180925_lane2           NA  0.13039859 -0.05608810
#>  ---                                                                         
#> 272:      IL8     logFC lane181213_lane1           NA  0.17465113 -0.01304182
#> 273:      IL8     logFC lane181213_lane2           NA -0.10118566 -0.36094883
#> 274:      IL8     logFC lane181213_lane3           NA -0.18641812 -0.54842390
#> 275:      IL8     logFC lane181218_lane1           NA -0.21106696 -0.40865998
#> 276:      IL8     logFC lane181218_lane2           NA -0.25687722 -0.45437608
#>             coef          z
#>   1:  0.01022735 14.4877797
#>   2:  2.19897570 40.2188744
#>   3: -0.05350901 -3.1462770
#>   4:  0.06204575  8.2420352
#>   5:  0.03715525  0.7809988
#>  ---                       
#> 272:  0.08080466  1.6875883
#> 273: -0.23106724 -3.4868953
#> 274: -0.36742101 -3.9785660
#> 275: -0.30986347 -6.1471934
#> 276: -0.35562665 -7.0584251

## format the results and take genes with FDR<=0.05
summaryDt <- summaryDt.glm
fcHurdle <- merge(summaryDt[contrast=='Age' & component=='H',.(primerid, `Pr(>Chisq)`)], #hurdle P values
                  summaryDt[contrast=='Age' & component=='logFC', .(primerid, coef, ci.hi, ci.lo)], #logFC coefficients
                  by = 'primerid')
fcHurdle[,fdr:=p.adjust(`Pr(>Chisq)`, 'fdr')] #all tested genes
fcHurdle
#>    primerid   Pr(>Chisq)        coef       ci.hi       ci.lo          fdr
#> 1:     ACTB 4.315734e-50  0.01346806  0.01661344  0.01032268 4.315734e-50
#> 2:     FTH1 1.214255e-80 -0.01143590 -0.01027217 -0.01259963 3.642764e-80
#> 3:      IL8 8.889364e-60 -0.01211227 -0.01046370 -0.01376084 1.333405e-59
fcHurdleSig <- merge(fcHurdle[fdr<.05],
                     as.data.table(mcols(sca)),
                     by='primerid')
setorder(fcHurdleSig, fdr) #DEGs (FDR<=0.05)
fcHurdleSig
#>    primerid   Pr(>Chisq)        coef       ci.hi       ci.lo          fdr
#> 1:     FTH1 1.214255e-80 -0.01143590 -0.01027217 -0.01259963 3.642764e-80
#> 2:      IL8 8.889364e-60 -0.01211227 -0.01046370 -0.01376084 1.333405e-59
#> 3:     ACTB 4.315734e-50  0.01346806  0.01661344  0.01032268 4.315734e-50

######## glmer (not working) ######## 
# testing contrast 'Age' // adjust by: Gender + cngenes (as fixed) and lane + exp.id (as random) 
### model
zlmCond.glmer <- zlm(~Age + Gender + cngeneson + (1|exp.id) + (1|lane), 
                       sca,
                       method='glmer', ebayes=FALSE)
#> boundary (singular) fit: see ?isSingular
#> boundary (singular) fit: see ?isSingular
#> Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
#> Model failed to converge with max|grad| = 0.00716707 (tol = 0.002, component 1)
#> boundary (singular) fit: see ?isSingular
#> boundary (singular) fit: see ?isSingular
#> 
#> Done!
summaryCond.glmer <- summary(zlmCond.glmer, doLRT='Age')
#> Combining coefficients and standard errors
#> Warning in melt(coefAndCI, as.is = TRUE): The melt generic in data.table has
#> been passed a array and will attempt to redirect to the relevant reshape2
#> method; please note that reshape2 is deprecated, and this redirection is now
#> deprecated as well. To continue using melt methods from reshape2 while both
#> libraries are attached, e.g. melt.list, you can prepend the namespace like
#> reshape2::melt(coefAndCI). In the next version, this warning will become an
#> error.
#> Calculating log-fold changes
#> Warning in melt(lfc): The melt generic in data.table has been passed a list
#> and will attempt to redirect to the relevant reshape2 method; please note that
#> reshape2 is deprecated, and this redirection is now deprecated as well. To
#> continue using melt methods from reshape2 while both libraries are attached,
#> e.g. melt.list, you can prepend the namespace like reshape2::melt(lfc). In the
#> next version, this warning will become an error.
#> Calculating likelihood ratio tests
#> Refitting on reduced model...
#> boundary (singular) fit: see ?isSingular
#> boundary (singular) fit: see ?isSingular
#> boundary (singular) fit: see ?isSingular
#> boundary (singular) fit: see ?isSingular
#> boundary (singular) fit: see ?isSingular
#> boundary (singular) fit: see ?isSingular
#> 
#> Done!
#> Warning in melt(llrt): The melt generic in data.table has been passed a list
#> and will attempt to redirect to the relevant reshape2 method; please note that
#> reshape2 is deprecated, and this redirection is now deprecated as well. To
#> continue using melt methods from reshape2 while both libraries are attached,
#> e.g. melt.list, you can prepend the namespace like reshape2::melt(llrt). In the
#> next version, this warning will become an error.
summaryDt.glmer <- summaryCond.glmer$datatable
summaryDt.glmer
summaryDt.glmer
#>     primerid component    contrast Pr(>Chisq)        ci.hi        ci.lo
#>  1:     ACTB         C         Age         NA  0.009654121 -0.003264819
#>  2:     ACTB         C (Intercept)         NA  3.335019629  2.594030069
#>  3:     ACTB         C     GenderM         NA  0.121293021 -0.205904060
#>  4:     ACTB         C   cngeneson         NA  0.059749211  0.030962311
#>  5:     ACTB         D         Age         NA  0.022399642 -0.018124236
#>  6:     ACTB         D (Intercept)         NA  4.268290655  1.925613688
#>  7:     ACTB         D     GenderM         NA  0.075141965 -0.873675723
#>  8:     ACTB         D   cngeneson         NA  0.878767004  0.706755206
#>  9:     ACTB         H         Age         NA           NA           NA
#> 10:     ACTB         S         Age         NA           NA           NA
#> 11:     ACTB         S (Intercept)         NA           NA           NA
#> 12:     ACTB         S     GenderM         NA           NA           NA
#> 13:     ACTB         S   cngeneson         NA           NA           NA
#> 14:     ACTB     logFC         Age         NA  0.010004350 -0.003366955
#> 15:     ACTB     logFC     GenderM         NA  0.083659450 -0.280781984
#> 16:     ACTB     logFC   cngeneson         NA  0.189549733  0.036849207
#> 17:     FTH1         C         Age         NA  0.003600683 -0.012109808
#> 18:     FTH1         C (Intercept)         NA  5.116852557  4.214984603
#> 19:     FTH1         C     GenderM         NA  0.280209359 -0.117364507
#> 20:     FTH1         C   cngeneson         NA -0.021938257 -0.044314437
#> 21:     FTH1         D         Age          1           NA           NA
#> 22:     FTH1         D (Intercept)         NA           NA           NA
#> 23:     FTH1         D     GenderM         NA           NA           NA
#> 24:     FTH1         D   cngeneson         NA           NA           NA
#> 25:     FTH1         H         Age         NA           NA           NA
#> 26:     FTH1         S         Age         NA           NA           NA
#> 27:     FTH1         S (Intercept)         NA           NA           NA
#> 28:     FTH1         S     GenderM         NA           NA           NA
#> 29:     FTH1         S   cngeneson         NA           NA           NA
#> 30:     FTH1     logFC         Age         NA          NaN          NaN
#> 31:     FTH1     logFC     GenderM         NA          NaN          NaN
#> 32:     FTH1     logFC   cngeneson         NA          NaN          NaN
#> 33:      IL8         C         Age         NA -0.001231106 -0.006889724
#> 34:      IL8         C (Intercept)         NA  2.412498577  2.090758493
#> 35:      IL8         C     GenderM         NA  0.039516331 -0.104811220
#> 36:      IL8         C   cngeneson         NA -0.187905846 -0.210790915
#> 37:      IL8         D         Age         NA -0.008997052 -0.041033647
#> 38:      IL8         D (Intercept)         NA  1.917372673  0.080344770
#> 39:      IL8         D     GenderM         NA  0.292394017 -0.520008133
#> 40:      IL8         D   cngeneson         NA  0.215025287  0.130809068
#> 41:      IL8         H         Age         NA           NA           NA
#> 42:      IL8         S         Age         NA           NA           NA
#> 43:      IL8         S (Intercept)         NA           NA           NA
#> 44:      IL8         S     GenderM         NA           NA           NA
#> 45:      IL8         S   cngeneson         NA           NA           NA
#> 46:      IL8     logFC         Age         NA -0.009479878 -0.018702567
#> 47:      IL8     logFC     GenderM         NA  0.118608845 -0.268251596
#> 48:      IL8     logFC   cngeneson         NA -0.009761774 -0.147680493
#>     primerid component    contrast Pr(>Chisq)        ci.hi        ci.lo
#>             coef           z
#>  1:  0.003194651   0.9693365
#>  2:  2.964524849  15.6827093
#>  3: -0.042305519  -0.5068339
#>  4:  0.045355761   6.1761187
#>  5:  0.002137703   0.2067828
#>  6:  3.096952171   5.1820330
#>  7: -0.399266879  -1.6495239
#>  8:  0.792761105  18.0660075
#>  9:           NA          NA
#> 10:           NA   0.8316419
#> 11:           NA  14.7536008
#> 12:           NA  -1.5247752
#> 13:           NA  17.1417718
#> 14:  0.003318697   0.9729084
#> 15: -0.098561267  -1.0601239
#> 16:  0.113199470   2.9059086
#> 17: -0.004254562  -1.0615567
#> 18:  4.665918580  20.2802025
#> 19:  0.081422426   0.8027943
#> 20: -0.033126347  -5.8031755
#> 21:           NA          NA
#> 22:           NA          NA
#> 23:           NA          NA
#> 24:           NA          NA
#> 25:           NA          NA
#> 26:           NA          NA
#> 27:           NA          NA
#> 28:           NA          NA
#> 29:           NA          NA
#> 30:          NaN         NaN
#> 31:          NaN         NaN
#> 32:          NaN         NaN
#> 33: -0.004060415  -2.8127956
#> 34:  2.251628535  27.4327698
#> 35: -0.032647444  -0.8867027
#> 36: -0.199348381 -34.1458999
#> 37: -0.025015350  -3.0608236
#> 38:  0.998858721   2.1314071
#> 39: -0.113807058  -0.5491313
#> 40:  0.172917178   8.0486027
#> 41:           NA          NA
#> 42:           NA  -4.1532760
#> 43:           NA  20.9050299
#> 44:           NA  -1.0152880
#> 45:           NA -18.4535759
#> 46: -0.014091222  -5.9892049
#> 47: -0.074821375  -0.7581401
#> 48: -0.078721134  -2.2374133
#>             coef           z

### format results
summaryDt <- summaryDt.glmer
fcHurdle <- merge(summaryDt[contrast=='Age' & component=='H',.(primerid, `Pr(>Chisq)`)], #hurdle P values
                  summaryDt[contrast=='Age' & component=='logFC', .(primerid, coef, ci.hi, ci.lo)], #logFC coefficients
                  by = 'primerid')
fcHurdle[,fdr:=p.adjust(`Pr(>Chisq)`, 'fdr')] #all tested genes
fcHurdle
#>    primerid Pr(>Chisq)         coef        ci.hi        ci.lo fdr
#> 1:     ACTB         NA  0.003318697  0.010004350 -0.003366955  NA
#> 2:     FTH1         NA          NaN          NaN          NaN  NA
#> 3:      IL8         NA -0.014091222 -0.009479878 -0.018702567  NA
fcHurdleSig <- merge(fcHurdle[fdr<.05],
                     as.data.table(mcols(sca)),
                     by='primerid')
setorder(fcHurdleSig, fdr) #DEGs (FDR<=0.05)
fcHurdleSig
#> Empty data.table (0 rows and 6 cols): primerid,Pr(>Chisq),coef,ci.hi,ci.lo,fdr

^{Created on 2020-12-04 by the reprex package (v0.3.0)}

gfinak commented 3 years ago

Several things going on:

Update to MAST 1.17.3 there are significant bug fixes that address some of your issues.
Some of the gene models don't converge when you include random effects. This is not a MAST issue. In your case FTH1 is such an example. The standard errors are nonsensical when the model fails to converge, so MAST sets the coefficients to NA, because we don't want to be doing inference using the parameter estimates we don't trust. You will still get LR test p-values as this just uses the likelihood.
You can mitigate 2) by passing nAGQ=0 to the fitting function. see here (https://www.rdocumentation.org/packages/lme4/versions/1.1-25/topics/glmer) for an explanation of what that does.

Below I reproduced what you posted, but using MAST 1.17.3 and show the effects of 3) above.

 devtools::install_github("RGLab/MAST")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'MAST' from a github remote, the SHA1 (bb1e928c) has not changed since last install.
#>   Use `force = TRUE` to force installation
   suppressPackageStartupMessages({
     require(MAST)
     library(data.table)
   })
 packageVersion("MAST")
#> [1] '1.17.3'
 sca <- readRDS(url('http://molgenis26.gcc.rug.nl/downloads/1m-scbloodn/mast/sca_MASTglm_top3degs.rds'))
 zlmCond.glm <- zlm(~Age + Gender + lane + cngeneson, sca)
#> 
#> Done!
 summaryCond.glm <- summary(zlmCond.glm, doLRT='Age')
#> Combining coefficients and standard errors
#> Calculating log-fold changes
#> Calculating likelihood ratio tests
#> Refitting on reduced model...
#> 
#> Done!
 summaryDt.glm <- summaryCond.glm$datatable
 summaryDt.glm
 summaryDt.glm
#>      primerid component         contrast   Pr(>Chisq)       ci.hi       ci.lo
#>   1:     ACTB         C              Age 3.383874e-47  0.01161094  0.00884375
#>   2:     ACTB         C      (Intercept)           NA  2.30613716  2.09181424
#>   3:     ACTB         C          GenderM           NA -0.02017573 -0.08684229
#>   4:     ACTB         C        cngeneson           NA  0.07680029  0.04729121
#>   5:     ACTB         C lane180925_lane2           NA  0.13039859 -0.05608810
#>  ---                                                                         
#> 272:      IL8     logFC lane181213_lane1           NA  0.17465113 -0.01304182
#> 273:      IL8     logFC lane181213_lane2           NA -0.10118566 -0.36094883
#> 274:      IL8     logFC lane181213_lane3           NA -0.18641812 -0.54842390
#> 275:      IL8     logFC lane181218_lane1           NA -0.21106696 -0.40865998
#> 276:      IL8     logFC lane181218_lane2           NA -0.25687722 -0.45437608
#>             coef          z
#>   1:  0.01022735 14.4877797
#>   2:  2.19897570 40.2188744
#>   3: -0.05350901 -3.1462770
#>   4:  0.06204575  8.2420352
#>   5:  0.03715525  0.7809988
#>  ---                       
#> 272:  0.08080466  1.6875883
#> 273: -0.23106724 -3.4868953
#> 274: -0.36742101 -3.9785660
#> 275: -0.30986347 -6.1471934
#> 276: -0.35562665 -7.0584251
 summaryDt <- summaryDt.glm
 fcHurdle <- merge(summaryDt[contrast=='Age' & component=='H',.(primerid, `Pr(>Chisq)`)], #hurdle P values
                   summaryDt[contrast=='Age' & component=='logFC', .(primerid, coef, ci.hi, ci.lo)], #logFC coefficients
                   by = 'primerid')
 fcHurdle[,fdr:=p.adjust(`Pr(>Chisq)`, 'fdr')] #all tested genes
 fcHurdle
#>    primerid   Pr(>Chisq)        coef       ci.hi       ci.lo          fdr
#> 1:     ACTB 4.315734e-50  0.01346806  0.01661344  0.01032268 4.315734e-50
#> 2:     FTH1 1.214255e-80 -0.01143590 -0.01027217 -0.01259963 3.642764e-80
#> 3:      IL8 8.889364e-60 -0.01211227 -0.01046370 -0.01376084 1.333405e-59
 fcHurdleSig <- merge(fcHurdle[fdr<.05],
                      as.data.table(mcols(sca)),
                      by='primerid')
 setorder(fcHurdleSig, fdr) #DEGs (FDR<=0.05)
 fcHurdleSig
#>    primerid   Pr(>Chisq)        coef       ci.hi       ci.lo          fdr
#> 1:     FTH1 1.214255e-80 -0.01143590 -0.01027217 -0.01259963 3.642764e-80
#> 2:      IL8 8.889364e-60 -0.01211227 -0.01046370 -0.01376084 1.333405e-59
#> 3:     ACTB 4.315734e-50  0.01346806  0.01661344  0.01032268 4.315734e-50

 ## The mixed effect model
 zlmCond.glmer <- zlm(~Age + Gender + cngeneson + (1|exp.id) + (1|lane), 
                      sca,
                      method='glmer', ebayes=FALSE)
#> Loading required namespace: lme4
#> Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
#> Model failed to converge with max|grad| = 0.00716656 (tol = 0.002, component 1)
#> 
#> Done!
 # Note the warning above. Some models failed to converge. This indicates a problem with the design when including random effects
 # MAST will set coefficients of models to NA when  the model doesn't converge because the estimates cannot be trusted.

 summaryCond.glmer <- summary(zlmCond.glmer, doLRT='Age')
#> Combining coefficients and standard errors
#> Calculating log-fold changes
#> Calculating likelihood ratio tests
#> Refitting on reduced model...
#> 
#> Done!
 summaryDt.glmer <- summaryCond.glmer$datatable
 summaryDt.glmer
 summaryDt.glmer
#>     primerid component    contrast   Pr(>Chisq)        ci.hi        ci.lo
#>  1:     ACTB         C         Age 0.3343430522  0.009654121 -0.003264819
#>  2:     ACTB         C (Intercept)           NA  3.335019629  2.594030069
#>  3:     ACTB         C     GenderM           NA  0.121293021 -0.205904060
#>  4:     ACTB         C   cngeneson           NA  0.059749211  0.030962311
#>  5:     ACTB         D         Age 0.8373536580  0.022405743 -0.018130340
#>  6:     ACTB         D (Intercept)           NA  4.268693574  1.925210892
#>  7:     ACTB         D     GenderM           NA  0.075176062 -0.873709814
#>  8:     ACTB         D   cngeneson           NA  0.878767130  0.706755080
#>  9:     ACTB         H         Age 0.6144232612           NA           NA
#> 10:     ACTB         S         Age           NA           NA           NA
#> 11:     ACTB         S (Intercept)           NA           NA           NA
#> 12:     ACTB         S     GenderM           NA           NA           NA
#> 13:     ACTB         S   cngeneson           NA           NA           NA
#> 14:     ACTB     logFC         Age           NA  0.010004663 -0.003367269
#> 15:     ACTB     logFC     GenderM           NA  0.083665221 -0.280787748
#> 16:     ACTB     logFC   cngeneson           NA  0.189574581  0.036824351
#> 17:     FTH1         C         Age 0.2905226042  0.003600683 -0.012109808
#> 18:     FTH1         C (Intercept)           NA  5.116852557  4.214984603
#> 19:     FTH1         C     GenderM           NA  0.280209359 -0.117364507
#> 20:     FTH1         C   cngeneson           NA -0.021938257 -0.044314437
#> 21:     FTH1         D         Age 1.0000000000           NA           NA
#> 22:     FTH1         D (Intercept)           NA           NA           NA
#> 23:     FTH1         D     GenderM           NA           NA           NA
#> 24:     FTH1         D   cngeneson           NA           NA           NA
#> 25:     FTH1         H         Age 0.2905226042           NA           NA
#> 26:     FTH1         S         Age           NA           NA           NA
#> 27:     FTH1         S (Intercept)           NA           NA           NA
#> 28:     FTH1         S     GenderM           NA           NA           NA
#> 29:     FTH1         S   cngeneson           NA           NA           NA
#> 30:     FTH1     logFC         Age           NA          NaN          NaN
#> 31:     FTH1     logFC     GenderM           NA          NaN          NaN
#> 32:     FTH1     logFC   cngeneson           NA          NaN          NaN
#> 33:      IL8         C         Age 0.0063992917 -0.001231106 -0.006889724
#> 34:      IL8         C (Intercept)           NA  2.412498577  2.090758493
#> 35:      IL8         C     GenderM           NA  0.039516331 -0.104811220
#> 36:      IL8         C   cngeneson           NA -0.187905846 -0.210790915
#> 37:      IL8         D         Age 0.0030452056 -0.008989084 -0.041038727
#> 38:      IL8         D (Intercept)           NA  1.917599440  0.079885019
#> 39:      IL8         D     GenderM           NA  0.292520036 -0.520018208
#> 40:      IL8         D   cngeneson           NA  0.215021283  0.130805057
#> 41:      IL8         H         Age 0.0003013517           NA           NA
#> 42:      IL8         S         Age           NA           NA           NA
#> 43:      IL8         S (Intercept)           NA           NA           NA
#> 44:      IL8         S     GenderM           NA           NA           NA
#> 45:      IL8         S   cngeneson           NA           NA           NA
#> 46:      IL8     logFC         Age           NA -0.009478095 -0.018704055
#> 47:      IL8     logFC     GenderM           NA  0.118671547 -0.268264806
#> 48:      IL8     logFC   cngeneson           NA -0.009730956 -0.147697432
#>     primerid component    contrast   Pr(>Chisq)        ci.hi        ci.lo
#>             coef           z
#>  1:  0.003194651   0.9693365
#>  2:  2.964524849  15.6827093
#>  3: -0.042305519  -0.5068339
#>  4:  0.045355761   6.1761187
#>  5:  0.002137702   0.2067204
#>  6:  3.096952233   5.1802515
#>  7: -0.399266876  -1.6494053
#>  8:  0.792761105  18.0659810
#>  9:           NA          NA
#> 10:           NA   0.8315978
#> 11:           NA  14.7523411
#> 12:           NA  -1.5246914
#> 13:           NA  17.1417530
#> 14:  0.003318697   0.9728627
#> 15: -0.098561263  -1.0600903
#> 16:  0.113199466   2.9049629
#> 17: -0.004254562  -1.0615567
#> 18:  4.665918580  20.2802025
#> 19:  0.081422426   0.8027943
#> 20: -0.033126347  -5.8031755
#> 21:           NA          NA
#> 22:           NA          NA
#> 23:           NA          NA
#> 24:           NA          NA
#> 25:           NA          NA
#> 26:           NA          NA
#> 27:           NA          NA
#> 28:           NA          NA
#> 29:           NA          NA
#> 30:          NaN         NaN
#> 31:          NaN         NaN
#> 32:          NaN         NaN
#> 33: -0.004060415  -2.8127956
#> 34:  2.251628535  27.4327698
#> 35: -0.032647444  -0.8867027
#> 36: -0.199348381 -34.1458999
#> 37: -0.025013905  -3.0594010
#> 38:  0.998742230   2.1303623
#> 39: -0.113749086  -0.5487597
#> 40:  0.172913170   8.0484154
#> 41:           NA          NA
#> 42:           NA  -4.1522700
#> 43:           NA  20.9042912
#> 44:           NA  -1.0150252
#> 45:           NA -18.4537083
#> 46: -0.014091075  -5.9870192
#> 47: -0.074796629  -0.7577406
#> 48: -0.078714194  -2.2364416
#>             coef           z

 summaryDt <- summaryDt.glmer
 fcHurdle <- merge(summaryDt[contrast=='Age' & component=='H',.(primerid, `Pr(>Chisq)`)], #hurdle P values
                   summaryDt[contrast=='Age' & component=='logFC', .(primerid, coef, ci.hi, ci.lo)], #logFC coefficients
                   by = 'primerid')
 fcHurdle[,fdr:=p.adjust(`Pr(>Chisq)`, 'fdr')] #all tested genes
 fcHurdle
#>    primerid   Pr(>Chisq)         coef        ci.hi        ci.lo          fdr
#> 1:     ACTB 0.6144232612  0.003318697  0.010004663 -0.003367269 0.6144232612
#> 2:     FTH1 0.2905226042          NaN          NaN          NaN 0.4357839063
#> 3:      IL8 0.0003013517 -0.014091075 -0.009478095 -0.018704055 0.0009040552
 #above you can see the model for FTH1 could not be reliably fit. Coefficients are set to NA. LR test still produce s  p-value since it only requires the likelihood.
 fcHurdleSig <- merge(fcHurdle[fdr<.05],
                      as.data.table(mcols(sca)),
                      by='primerid')
 setorder(fcHurdleSig, fdr) #DEGs (FDR<=0.05)
 fcHurdleSig
#>    primerid   Pr(>Chisq)        coef        ci.hi       ci.lo          fdr
#> 1:      IL8 0.0003013517 -0.01409108 -0.009478095 -0.01870406 0.0009040552

# Let's fit the  discrete models by hand.
 y_fth1<-assay(sca)["FTH1",]
 dat<-colData(sca)
 dat<-data.frame(y_fth1,dat)
 dat$y_disc<-dat$y_fth1>0

 #discrete model
 require(lme4)
#> Loading required package: lme4
#> Loading required package: Matrix
#> 
#> Attaching package: 'Matrix'
#> The following object is masked from 'package:S4Vectors':
#> 
#>     expand
 fth1_disc_fit<-glmer(y_disc~Age+Gender+cngeneson+(1|lane)+(1|exp.id),data=dat,family="binomial",control=glmerControl(optimizer="bobyqa"))
#> Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
#> Model failed to converge with max|grad| = 0.0290885 (tol = 0.002, component 1)
 #Note the convergence warning
 summary(fth1_disc_fit) # the standard errors are problematic.
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#>   Approximation) [glmerMod]
#>  Family: binomial  ( logit )
#> Formula: y_disc ~ Age + Gender + cngeneson + (1 | lane) + (1 | exp.id)
#>    Data: dat
#> Control: glmerControl(optimizer = "bobyqa")
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>    130.1    174.6    -59.1    118.1    12129 
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -76.091   0.003   0.007   0.012   0.596 
#> 
#> Random effects:
#>  Groups Name        Variance Std.Dev.
#>  exp.id (Intercept) 1.90090  1.3787  
#>  lane   (Intercept) 0.06145  0.2479  
#> Number of obs: 12135, groups:  exp.id, 72; lane, 20
#> 
#> Fixed effects:
#>              Estimate Std. Error  z value Pr(>|z|)    
#> (Intercept) 13.412572   0.002669 5025.766   <2e-16 ***
#> Age         -0.056962   0.002489  -22.883   <2e-16 ***
#> GenderM     -0.005879   0.002669   -2.203   0.0276 *  
#> cngeneson    2.056350   0.002669  770.508   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Correlation of Fixed Effects:
#>           (Intr) Age    GendrM
#> Age       -0.002              
#> GenderM    0.000 -0.001       
#> cngeneson  0.000  0.004  0.000
#> convergence code: 0
#> Model failed to converge with max|grad| = 0.0290885 (tol = 0.002, component 1)

 # now turn off the laplace approx of the log likelihood
 fth1_disc_fit_nagq0<-glmer(y_disc~Age+Gender+cngeneson+(1|lane)+(1|exp.id),data=dat,family="binomial",nAGQ=0)
 summary(fth1_disc_fit_nagq0) # This makes more sense, but the LR test would be better.
#> Generalized linear mixed model fit by maximum likelihood (Adaptive
#>   Gauss-Hermite Quadrature, nAGQ = 0) [glmerMod]
#>  Family: binomial  ( logit )
#> Formula: y_disc ~ Age + Gender + cngeneson + (1 | lane) + (1 | exp.id)
#>    Data: dat
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>    131.3    175.8    -59.7    119.3    12129 
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -74.635   0.005   0.009   0.017   0.549 
#> 
#> Random effects:
#>  Groups Name        Variance Std.Dev.
#>  exp.id (Intercept) 0.5629   0.7503  
#>  lane   (Intercept) 0.0000   0.0000  
#> Number of obs: 12135, groups:  exp.id, 72; lane, 20
#> 
#> Fixed effects:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) 13.21587    2.36960   5.577 2.44e-08 ***
#> Age         -0.06811    0.03494  -1.949   0.0512 .  
#> GenderM      0.22196    0.70130   0.316   0.7516    
#> cngeneson    2.06976    0.34837   5.941 2.83e-09 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Correlation of Fixed Effects:
#>           (Intr) Age    GendrM
#> Age       -0.930              
#> GenderM   -0.028 -0.151       
#> cngeneson  0.275  0.022 -0.031

 zlmCond.glmer_nagq0 <- zlm(~Age + Gender + cngeneson + (1|exp.id) + (1|lane), 
                      sca,
                      method='glmer', ebayes=FALSE,fitArgsD=list(nAGQ=0))
#> 
#> Done!

 summaryCond.glmer.nagq0 <- summary(zlmCond.glmer_nagq0, doLRT='Age',fitArgsD=list(nAGQ=0))
#> Combining coefficients and standard errors
#> Calculating log-fold changes
#> Calculating likelihood ratio tests
#> Refitting on reduced model...
#> 
#> Done!
 summaryDt.glmer.nagq0 <- summaryCond.glmer.nagq0$datatable
 summaryDt.glmer.nagq0
 summaryDt.glmer.nagq0
#>     primerid component    contrast   Pr(>Chisq)         ci.hi        ci.lo
#>  1:     ACTB         C         Age 0.3343430522  0.0096541205 -0.003264819
#>  2:     ACTB         C (Intercept)           NA  3.3350196289  2.594030069
#>  3:     ACTB         C     GenderM           NA  0.1212930214 -0.205904060
#>  4:     ACTB         C   cngeneson           NA  0.0597492114  0.030962311
#>  5:     ACTB         D         Age 0.8308435739  0.0206189302 -0.016768388
#>  6:     ACTB         D (Intercept)           NA  4.1227277207  1.961005077
#>  7:     ACTB         D     GenderM           NA  0.0972006380 -0.850556992
#>  8:     ACTB         D   cngeneson           NA  0.8736565705  0.700929225
#>  9:     ACTB         H         Age 0.6133508979            NA           NA
#> 10:     ACTB         S         Age           NA            NA           NA
#> 11:     ACTB         S (Intercept)           NA            NA           NA
#> 12:     ACTB         S     GenderM           NA            NA           NA
#> 13:     ACTB         S   cngeneson           NA            NA           NA
#> 14:     ACTB     logFC         Age           NA  0.0099297594 -0.003335041
#> 15:     ACTB     logFC     GenderM           NA  0.0837320703 -0.278390320
#> 16:     ACTB     logFC   cngeneson           NA  0.1901906157  0.042692142
#> 17:     FTH1         C         Age 0.2905226042  0.0036006835 -0.012109808
#> 18:     FTH1         C (Intercept)           NA  5.1168525575  4.214984603
#> 19:     FTH1         C     GenderM           NA  0.2802093590 -0.117364507
#> 20:     FTH1         C   cngeneson           NA -0.0219382575 -0.044314437
#> 21:     FTH1         D         Age 0.0547354872  0.0003660962 -0.136579712
#> 22:     FTH1         D (Intercept)           NA 17.8601916804  8.571545614
#> 23:     FTH1         D     GenderM           NA  1.5964742943 -1.152560744
#> 24:     FTH1         D   cngeneson           NA  2.7525590012  1.386970639
#> 25:     FTH1         H         Age 0.0903863613            NA           NA
#> 26:     FTH1         S         Age           NA            NA           NA
#> 27:     FTH1         S (Intercept)           NA            NA           NA
#> 28:     FTH1         S     GenderM           NA            NA           NA
#> 29:     FTH1         S   cngeneson           NA            NA           NA
#> 30:     FTH1     logFC         Age           NA  0.0036001328 -0.012110439
#> 31:     FTH1     logFC     GenderM           NA  0.2802106340 -0.117362637
#> 32:     FTH1     logFC   cngeneson           NA -0.0219307735 -0.044307053
#> 33:      IL8         C         Age 0.0063992917 -0.0012311064 -0.006889724
#> 34:      IL8         C (Intercept)           NA  2.4124985773  2.090758493
#> 35:      IL8         C     GenderM           NA  0.0395163314 -0.104811220
#> 36:      IL8         C   cngeneson           NA -0.1879058460 -0.210790915
#> 37:      IL8         D         Age 0.0030493352 -0.0087615134 -0.040816884
#> 38:      IL8         D (Intercept)           NA  1.9096521933  0.071443740
#> 39:      IL8         D     GenderM           NA  0.2918682326 -0.520561224
#> 40:      IL8         D   cngeneson           NA  0.2138835720  0.129846704
#> 41:      IL8         H         Age 0.0003017244            NA           NA
#> 42:      IL8         S         Age           NA            NA           NA
#> 43:      IL8         S (Intercept)           NA            NA           NA
#> 44:      IL8         S     GenderM           NA            NA           NA
#> 45:      IL8         S   cngeneson           NA            NA           NA
#> 46:      IL8     logFC         Age           NA -0.0093598280 -0.018690592
#> 47:      IL8     logFC     GenderM           NA  0.1188611685 -0.269264643
#> 48:      IL8     logFC   cngeneson           NA -0.0096196233 -0.147403411
#>     primerid component    contrast   Pr(>Chisq)         ci.hi        ci.lo
#>             coef            z
#>  1:  0.003194651   0.96933652
#>  2:  2.964524849  15.68270932
#>  3: -0.042305519  -0.50683395
#>  4:  0.045355761   6.17611866
#>  5:  0.001925271   0.20185784
#>  6:  3.041866399   5.51592371
#>  7: -0.376678177  -1.55794190
#>  8:  0.787292898  17.86706933
#>  9:           NA           NA
#> 10:           NA   0.82815947
#> 11:           NA  14.98969717
#> 12:           NA  -1.46001700
#> 13:           NA  17.00110127
#> 14:  0.003297359   0.97441435
#> 15: -0.097329125  -1.05357517
#> 16:  0.116441379   3.09455281
#> 17: -0.004254562  -1.06155672
#> 18:  4.665918580  20.28020251
#> 19:  0.081422426   0.80279433
#> 20: -0.033126347  -5.80317548
#> 21: -0.068106808  -1.94948487
#> 22: 13.215868647   5.57726635
#> 23:  0.221956775   0.31649454
#> 24:  2.069764820   5.94126988
#> 25:           NA           NA
#> 26:           NA  -2.12912793
#> 27:           NA  18.28399157
#> 28:           NA   0.79145675
#> 29:           NA   0.09764749
#> 30: -0.004255153  -1.06169869
#> 31:  0.081423999   0.80281104
#> 32: -0.033118913  -5.80184713
#> 33: -0.004060415  -2.81279562
#> 34:  2.251628535  27.43276981
#> 35: -0.032647444  -0.88670270
#> 36: -0.199348381 -34.14589993
#> 37: -0.024789199  -3.03137574
#> 38:  0.990547967   2.11231576
#> 39: -0.114346495  -0.55171563
#> 40:  0.171865138   8.01670719
#> 41:           NA           NA
#> 42:           NA  -4.13245320
#> 43:           NA  20.89153036
#> 44:           NA  -1.01711535
#> 45:           NA -18.47612937
#> 46: -0.014025210  -5.89210212
#> 47: -0.075201737  -0.75950989
#> 48: -0.078511517  -2.23364081
#>             coef            z

 summaryDt <- summaryDt.glmer.nagq0
 fcHurdle <- merge(summaryDt[contrast=='Age' & component=='H',.(primerid, `Pr(>Chisq)`)], #hurdle P values
                   summaryDt[contrast=='Age' & component=='logFC', .(primerid, coef, ci.hi, ci.lo)], #logFC coefficients
                   by = 'primerid')
 fcHurdle[,fdr:=p.adjust(`Pr(>Chisq)`, 'fdr')] #all tested genes
 fcHurdle
#>    primerid   Pr(>Chisq)         coef        ci.hi        ci.lo          fdr
#> 1:     ACTB 0.6133508979  0.003297359  0.009929759 -0.003335041 0.6133508979
#> 2:     FTH1 0.0903863613 -0.004255153  0.003600133 -0.012110439 0.1355795419
#> 3:      IL8 0.0003017244 -0.014025210 -0.009359828 -0.018690592 0.0009051733

 fcHurdleSig <- merge(fcHurdle[fdr<.05],
                      as.data.table(mcols(sca)),
                      by='primerid')
 setorder(fcHurdleSig, fdr) #DEGs (FDR<=0.05)
 fcHurdleSig
#>    primerid   Pr(>Chisq)        coef        ci.hi       ci.lo          fdr
#> 1:      IL8 0.0003017244 -0.01402521 -0.009359828 -0.01869059 0.0009051733

^{Created on 2020-12-04 by the reprex package (v0.3.0)}

gfinak commented 3 years ago

More on this in #98 And a good explanation of what nAGQ=0 means for the model fit and interpretation: https://stats.stackexchange.com/questions/77313/why-cant-i-match-glmer-family-binomial-output-with-manual-implementation-of-g Key being that the difference between integrating out the random effects, and not, is typically pretty small, but care is warranted.

aidarripoll commented 3 years ago

Thank you for your clear explanation, it was really helpful!

However, I run into problems when trying to install MAST from Github in order to have the correct version (MAST 1.17.3) as it seems to be one of the main issues related to the glmer problems.

Here attached you the code for: the installation (from your previous comments), Bioconductor version and sessionInfo.

# Install MAST (1.17.3)
devtools::install_github("RGLab/MAST")
#> Using github PAT from envvar GITHUB_PAT
#> Downloading GitHub repo RGLab/MAST@HEAD
#> 
#>      checking for file ‘/private/var/folders/d1/qxx__vcs56d7pp4tsq15g51m0000gn/T/RtmpRdtz1Z/remotes12dc39e6f691/RGLab-MAST-bb1e928/DESCRIPTION’ ...  ✓  checking for file ‘/private/var/folders/d1/qxx__vcs56d7pp4tsq15g51m0000gn/T/RtmpRdtz1Z/remotes12dc39e6f691/RGLab-MAST-bb1e928/DESCRIPTION’ (341ms)
#>   ─  preparing ‘MAST’:
#>      checking DESCRIPTION meta-information ...  ✓  checking DESCRIPTION meta-information
#>   ─  checking for LF line-endings in source and make files and shell scripts
#>   ─  checking for empty or unneeded directories
#>      Removed empty directory ‘MAST/docs/articles/MAITAnalysis_cache’
#>    Removed empty directory ‘MAST/docs/articles/MAITAnalysis_files’
#>   ─  looking to see if a ‘data/datalist’ file should be added
#>   ─  building ‘MAST_1.17.3.tar.gz’ (6.6s)
#>      
#> 
#> Error: Failed to install 'MAST' from GitHub:
#>   (converted from warning) installation of package '/var/folders/d1/qxx__vcs56d7pp4tsq15g51m0000gn/T//RtmpRdtz1Z/file12dc6a45c749/MAST_1.17.3.tar.gz' had non-zero exit status
suppressPackageStartupMessages({
 require(MAST)
 library(data.table)
})
#> Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
#> logical.return = TRUE, : there is no package called 'MAST'
#> Warning: package 'data.table' was built under R version 3.6.2
packageVersion("MAST")
#> Error in packageVersion("MAST"): there is no package called 'MAST'

# Bioconductor version
tools:::.BioC_version_associated_with_R_version()
#> [1] '3.9'

# Session info
sessionInfo()
#> R version 3.6.1 (2019-07-05)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS Mojave 10.14.6
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] data.table_1.13.2
#> 
#> loaded via a namespace (and not attached):
#>  [1] knitr_1.30        magrittr_2.0.1    usethis_1.6.3     devtools_2.3.2   
#>  [5] pkgload_1.1.0     R6_2.5.0          rlang_0.4.9       fansi_0.4.1      
#>  [9] stringr_1.4.0     highr_0.8         tools_3.6.1       pkgbuild_1.1.0   
#> [13] xfun_0.19         sessioninfo_1.1.1 cli_2.2.0         withr_2.3.0      
#> [17] remotes_2.2.0     htmltools_0.5.0   ellipsis_0.3.1    rprojroot_2.0.2  
#> [21] yaml_2.2.1        assertthat_0.2.1  digest_0.6.27     crayon_1.3.4     
#> [25] processx_3.4.5    callr_3.5.1       fs_1.5.0          ps_1.4.0         
#> [29] curl_4.3          testthat_3.0.0    memoise_1.1.0     glue_1.4.2       
#> [33] evaluate_0.14     rmarkdown_2.5     stringi_1.5.3     compiler_3.6.1   
#> [37] desc_1.2.0        prettyunits_1.1.1

^{Created on 2020-12-05 by the reprex package (v0.3.0)}

gfinak commented 3 years ago

You need to upgrade your BioConductor and r version s. The current mast version requires at least BioC 3.11 and R 4. The reason is due to its dependencies in BioConductor, the latest of which are dependent on R 4 and 3.11. As BioC moves up in version, they've done a lot of improvements and changes to underlying data structures, we have to keep up. And so should you.

RGLab / MAST

Problems when fitting a linear-mixed model ('glmer') with sc-RNAseq data #148