Closed: paulyashna closed this issue 4 years ago
A few things:
If you have a daFrame object, that means your versions of CATALYST (and probably diffcyt) are a bit old. To better capitalize on Bioconductor infrastructure, daFrame objects were removed as of the BioC 3.10 release (October 2019).
> My "condition" field of daframe object has 4 conditions; out of which 3 conditions are several kinds of tumors and the last condition is control. I built a contrast (0,0,0,1); for differential analysis between all tumor combined and the control.

To me, it doesn't seem like that contrast would test all tumors versus control. Wouldn't it need to be something like (-1/3,-1/3,-1/3,1) to test that?
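To make the arithmetic concrete, here is a small base-R sketch. The condition names and group means are made up for illustration, and it assumes a no-intercept parameterization in which each coefficient is simply one group's mean.

```r
# Hypothetical per-condition means under a no-intercept design (~ 0 + condition),
# where each coefficient is one group's mean. All names and values are made up.
group_means <- c(TumorA = 10, TumorB = 20, TumorC = 30, Control = 40)

# A contrast is a weighted sum of the coefficients.
estimate <- function(contrast) sum(contrast * group_means)

only_control  <- c(0, 0, 0, 1)           # picks out the Control coefficient alone
pooled_tumors <- c(-1/3, -1/3, -1/3, 1)  # Control minus the average of the three tumors
one_tumor     <- c(1, 0, 0, -1)          # TumorA minus Control

estimate(only_control)
estimate(pooled_tumors)
estimate(one_tumor)
```

Under this parameterization, (0,0,0,1) only asks whether the Control coefficient itself is zero, which is not a tumor-versus-control comparison. If the design instead has an intercept (~condition), the fourth coefficient is the difference between the fourth level and the reference level, so the same vector would compare just those two groups.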
If it were me, I would make a design matrix with model.matrix(~ 0 + condition) and then call testDA_voom() or testDA_edgeR() multiple times, once for each contrast of interest.
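A rough sketch of that approach follows. It assumes `d_counts` is the cluster-by-sample count object returned by diffcyt's calcCounts(), that the design rows are in the same sample order as its columns, and that the condition levels are named as below. The level names and the helper functions (`build_design`, `build_contrasts`, `run_da_tests`) are hypothetical and only for illustration.

```r
# Sketch only: level names and helper names are hypothetical.
build_design <- function(condition) {
  design <- model.matrix(~ 0 + condition)  # one column per condition, no intercept
  colnames(design) <- levels(condition)
  design
}

# One contrast vector per tumor type, ordered like the design columns:
# +1 for the tumor of interest, -1 for Control, 0 elsewhere.
build_contrasts <- function(design) {
  tumors <- setdiff(colnames(design), "Control")
  lapply(setNames(tumors, paste0(tumors, "_vs_Control")), function(tumor) {
    as.numeric(colnames(design) == tumor) - as.numeric(colnames(design) == "Control")
  })
}

# Run one differential abundance test per contrast (requires diffcyt).
run_da_tests <- function(d_counts, design, contrast_list) {
  lapply(contrast_list, function(v) {
    diffcyt::testDA_edgeR(d_counts, design, diffcyt::createContrast(v))
  })
}

condition <- factor(
  c("TumorA", "TumorA", "TumorB", "TumorB", "TumorC", "TumorC", "Control", "Control"),
  levels = c("TumorA", "TumorB", "TumorC", "Control")
)
design <- build_design(condition)
contrast_list <- build_contrasts(design)
# results <- run_da_tests(d_counts, design, contrast_list)  # needs a real d_counts
```

Swapping in testDA_voom() should follow the same pattern, since it takes the same first three arguments.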
If you are doing only differential abundance analysis, then it may be easiest to output the count table and put it directly into edgeR/limma (or whatever tool you prefer), where running multiple contrasts simultaneously can be easily done.
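As a hedged illustration of the direct route, the sketch below fits one edgeR model and then tests several named contrasts built with limma's makeContrasts(). It assumes `count_table` is a clusters-by-samples matrix of cell counts and `condition` is a factor with one entry per column. The level names and the two helper functions are hypothetical.

```r
# Sketch only: level names (TumorA, TumorB, TumorC, Control) are hypothetical.
fit_abundance_model <- function(count_table, condition) {
  design <- model.matrix(~ 0 + condition)  # no intercept: one column per condition
  colnames(design) <- levels(condition)
  # lib.size is taken as the total cell count per sample here; adjust it if
  # your clusters overlap and the column sums do not equal the cells collected.
  dge <- edgeR::DGEList(count_table, lib.size = colSums(count_table))
  dge <- edgeR::estimateDisp(dge, design)
  list(fit = edgeR::glmQLFit(dge, design, robust = TRUE), design = design)
}

test_contrasts <- function(model) {
  # Each column of `cm` is one contrast over the design columns.
  cm <- limma::makeContrasts(
    TumorA_vs_Control = TumorA - Control,
    Tumors_vs_Control = (TumorA + TumorB + TumorC) / 3 - Control,
    levels = model$design
  )
  # One quasi-likelihood F-test per contrast, returning a full results table each.
  lapply(setNames(colnames(cm), colnames(cm)), function(nm) {
    edgeR::topTags(edgeR::glmQLFTest(model$fit, contrast = cm[, nm]), n = Inf)
  })
}

# Usage, given a real count_table and condition factor:
# results <- test_contrasts(fit_abundance_model(count_table, condition))
```

Fitting once and reusing the fit for every contrast avoids re-estimating dispersions for each comparison.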
I have the following question about diffcyt() contrasts.
I have a daFrame object from CATALYST, which I now want to use as input for differential abundance analysis. The "condition" field of my daFrame object has 4 conditions: 3 are different kinds of tumors and the last is the control. I built a contrast (0,0,0,1) for differential analysis between all tumors combined and the control. However, now I want a contrast between a particular tumor type and the controls. I am aware that each coefficient in the contrast should correspond to one of the conditions, but I am not sure how to specify the particular contrast I want.
Since diffcyt internally runs edgeR and limma steps, I could do the usual limma steps individually for each cell population:
    design <- model.matrix(~condition)
    fit <- lmFit(eset, design)
    fit2 <- eBayes(fit)
    de_genes <- topTable(fit2)

edgeR set of commands:

    exprDesign <- model.matrix(~condition)
    # BE VERY CAREFUL ABOUT YOUR lib.size parameter: this is the number of cells you collected.
    # If you perform a hierarchical clustering (e.g. Citrus), your lib.size is not the sum of your columns.
    propData <- DGEList(count_table, lib.size = colSums(count_table))
    # This is the first deviation from limma: estimating dispersion based on your design
    # helps account for the variability in cell counting.
    fit <- estimateDisp(propData, exprDesign)
    # This is analogous to the lmFit step above...
    fit <- glmQLFit(fit, exprDesign, robust = TRUE)
But is there a way I could do this with diffcyt?
Thanks for your time. I will be grateful if you could comment on this.