Open lizgehret opened 1 month ago
Update: this also fails when using the ':' interaction operator.
I tested this with the following command (using the PD mice dataset):
qiime composition ancombc \
--i-data table.qza \
--m-metadata-file metadata.tsv \
--p-formula 'donor:genotype + donor + genotype' \
--p-reference-levels 'donor::hc_1' 'genotype::wild type' \
--o-differentials diff.qza \
--verbose
This is the error message:
The following variables specified are not in the meta data: donor:genotype
Another update: this appears to be due to the following code change in ancombc: https://github.com/FrederickHuangLin/ANCOMBC/blob/d402833f7d5ca5033132a0abba63e06674c7b6b1/R/ancombc_prep.R#L132
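This change only permits purely additive formulas: the formula string is split on '+' and each resulting piece is checked against the metadata columns, so 'donor:genotype' (or 'donor * genotype') is treated as a single variable name that doesn't exist.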
# what should happen:
vars = rownames(attr(terms(formula), 'factors'))
# what currently happens:
vars = unlist(strsplit(formula, split = "\\s*\\+\\s*"))
It appears to be intentional: https://github.com/FrederickHuangLin/ANCOMBC/issues/141
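To make the difference between the two approaches concrete, here is a minimal R sketch that runs both on the formulas from this issue. It assumes the formula arrives as a plain string without a leading "~" (as it does via --p-formula), so the terms-based version converts it with as.formula first. The helper names and the meta_cols vector are hypothetical and only for illustration; this is not the ANCOMBC code itself.

```r
# Formula strings as used in this issue (no leading "~").
formulas <- c(
  "donor + genotype",
  "donor:genotype + donor + genotype",
  "donor * genotype"
)

# Hypothetical metadata column names, for illustration only.
meta_cols <- c("donor", "genotype")

# String-based approach: split the raw formula text on "+".
vars_by_split <- function(f) {
  unlist(strsplit(f, split = "\\s*\\+\\s*"))
}

# Formula-based approach: let R parse the formula and read the
# variable names from the "factors" attribute of the terms object.
# Assumption: the string is first turned into a one-sided formula.
vars_by_terms <- function(f) {
  rownames(attr(terms(as.formula(paste("~", f))), "factors"))
}

for (f in formulas) {
  split_vars <- vars_by_split(f)
  terms_vars <- vars_by_terms(f)
  cat("formula:", f, "\n")
  cat("  split on '+':", paste(split_vars, collapse = " | "), "\n")
  cat("    not in metadata:",
      paste(setdiff(split_vars, meta_cols), collapse = " | "), "\n")
  cat("  via terms():", paste(terms_vars, collapse = " | "), "\n")
  cat("    not in metadata:",
      paste(setdiff(terms_vars, meta_cols), collapse = " | "), "\n")
}
```

With these inputs, the split-based version reports 'donor:genotype' and 'donor * genotype' as names missing from the metadata, while the terms-based version resolves all three formulas to donor and genotype.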
If I remember correctly, we don't use group because it's naive to the contrast for a given factor and the resulting p-value mask seemed... strange.
I initially discovered this while building the 2024.5 docs, but also replicated it locally. Within a 2024.5 amplicon environment (on macOS), the PD mice command that runs ancombc with the formula 'donor * genotype' fails with the following error message:
This doesn't occur in 2024.2. I need to investigate further, but something seems to have changed in the input handling for ancombc. The error doesn't occur when swapping '*' for '+' in this particular example.