FrederickHuangLin / ANCOMBC

Differential abundance (DA) and correlation analyses for microbial absolute abundance data
https://www.nature.com/articles/s41467-020-17041-7
104 stars 28 forks source link

Adjusting contrast matrix for 5 variables #138

Closed Abonacolta closed 1 year ago

Abonacolta commented 1 year ago

Thank you very much for the detailed tutorial you put together for ANCOMBC2. I have a quick question regarding the trend test with 5 ordered categories (A,B,C,D,E). I would like to test for a monotonically increasing trend from A -> E (i.e. which microbes are increasing in abundance along this trend). I however, do not know how to adjust the contrast matrix to account for these 5 variables.

 trend_control = list(contrast = list(matrix(c(1, 0, -1, 1),
                                                              nrow = 2, 
                                                              byrow = TRUE),
                                                       matrix(c(-1, 0, 1, -1),
                                                              nrow = 2, 
                                                              byrow = TRUE)),
                                       node = list(2, 2),
                                       solver = "ECOS",
                                       B = 100))

Also, I'm assuming I will use node = 4 for this (judging based off the example in the tutorial). Any advice on how to build this?

An example of the starting script I'm using is below:

output = ancombc2(data = PS_OBJECT, assay_name = "counts", tax_level = "Family",
                  fix_formula = "VAR1 + VAR2 + VAR3", rand_formula = NULL,
                  p_adj_method = "holm", pseudo = 0, pseudo_sens = TRUE,
                  prv_cut = 0.01, lib_cut = 0, s0_perc = 0.05,
                  group = "VAR1", struc_zero = TRUE, neg_lb = FALSE,
                  alpha = 0.05, n_cl = 2, verbose = TRUE,
                  global = TRUE, pairwise = TRUE, dunnet = TRUE, trend = TRUE,
                  iter_control = list(tol = 1e-2, max_iter = 20, 
                                      verbose = TRUE),
                  em_control = list(tol = 1e-5, max_iter = 100),
                  lme_control = lme4::lmerControl(),
                  mdfdr_control = list(fwer_ctrl_method = "holm", B = 100),
                  trend_control = list(contrast = list(matrix(c(1, 0, -1, 1),
                                                              nrow = 2, 
                                                              byrow = TRUE),
                                                       matrix(c(-1, 0, 1, -1),
                                                              nrow = 2, 
                                                              byrow = TRUE)),
                                       node = list(2, 2),
                                       solver = "ECOS",
                                       B = 100))

res_trend = output$res_trend
res_trend

Thank you for any help!

FrederickHuangLin commented 1 year ago

Hi @Abonacolta,

For a categorical variable of 5 levels (A,B,C,D,E), we are actually estimating 4 contrasts, which are (B-A, C-A, D-A, E-A), in R.

So for testing the trend of A < B < C < D < E, it is equivalent to test 0 < B - A < C - A < D - A < E - A, so we can specify the contrast matrix as follows:

# B-A    C-A     D-A    E-A
    1      0      0     0
    -1     1      0     0
    0     -1     1      0
    0     0     -1      1

In R, it should be

matrix(c(1, 0, 0, 0,
  -1, 1, 0, 0,
   0, -1, 1, 0,
   0, 0, -1, 1),
   nrow = 4, 
   byrow = TRUE)

Hope that helps!

Best, Huang

Abonacolta commented 1 year ago

Yes, that's exactly what I needed. Thanks!!