Closed pchiang5 closed 3 years ago
Hi pchiang5,
sorry for my delayed response.
If I would like to see the marker genes of the cluster2, how shall I get reasonable inputs into the following steps?
It depends what kind of markers we are taking about here.
sample1
and sample2
for cluster 2?In case 1), I would usually recommend to do pseudobulk, but you don't have any replicates for the treatment and control condition, so it won't work. You could run:
fit <- glm_gp(pbmcs[, pbmcs$cell_type == "cluster 2"], design = ~ treatment)
test_de(fit, contrast = `treatmentsample 1`)
However, the p-values will be unreliable because the cells are not independent replicates and the analysis is confounded by the sample effect.
In case 2), I would include the treatment
information in the initial fit, to correct the batch effect due to the sample:
pbmcs$is_cluster_2 <- pbmcs$cell_type == "cluster 2"
fit <- glm_gp(pbmcs, design = ~ treatment + is_cluster_2)
test_de(fit, contrast = is_cluster_2TRUE)
Your idea to set contrasts = cluster2- !cluster2
is nice, but unfortunately does something different! To be honest, I am surprised that the syntax is valid at all. Your contrast is equivalent to contrasts = cluster2- (1 - cluster2)
, which would be testing if cluster2
is signficantly different from the sum of all other coefficient, which is not what you mean.
Best, Constantin
Hi Constantin,
Thank you for the detailed explanation. Hope you do not mind with a more complex problem.
treatment | cell_type
A+B | cluster 1 A | cluster 1 B | cluster 1 0(None) | cluster 1 A+B | cluster 2 A | cluster 2 B | cluster 2 0 | cluster 2
In the scenario above, am I right to use the following way to identify the effect of treatment A on cluster 2?
fit <- glm_gp(pbmcs[, pbmcs$cell_type == "cluster 2"], design = ~ treatment) #Do I have to include the A+B and B groups here (probably this will reveal the A effect in the background of B)?
test_de(fit, contrast = "treatmentA")
Further, if I would like to see the synergistic effects of A+B on cluster 2, instead of A or B alone, the following input seems to regard A+B as a single unique group, instead of the combination of A and B. Does the tool inherently compare A+B with A alone and B alone to reveal the interaction? Or shall I split the treatment as treatment1 (A and 0) and treatment2 (B and 0)?
fit <- glm_gp(pbmcs[, pbmcs$cell_type == "cluster 2"], design = ~ treatment)
test_de(fit, contrast = "treatmentA+B")
Thanks again!
Po
shall I split the treatment as treatment1 (A and 0) and treatment2 (B and 0)?
Yes, I think you should code the treatment as two separate columns:
treatmentA | treatmentB | cell_type |
---|---|---|
1 | 1 | cluster_1 |
1 | 0 | cluster_1 |
0 | 1 | cluster_1 |
0 | 0 | cluster_1 |
1 | 1 | cluster_2 |
1 | 0 | cluster_2 |
0 | 1 | cluster_2 |
0 | 0 | cluster_2 |
You can then fit the model including an interaction term:
fit <- glm_gp(pbmcs[, pbmcs$cell_type == "cluster 2"],
design = ~ treatmentA + treatmentB + treatmentA:treatmentB)
# The column names tell you the identifiers that you can as contrasts
colnames(fit$Beta)
test_de(fit, contrast = `treatmentA:treatmentB`)
Thank you, Constantin.
Hello,
I have a single-cell dataset with the structure below. If I would like to see the marker genes of the cluster2, how shall I get reasonable inputs into the following steps?
fit <- glm_gp(pbmcs, design = XXX) #shall I include the treatment, cell_type, or treatment+cell_type into the 'design' paramenter?
res <- test_de(fit, contrast = cluster2- !cluster2, pseudobulk_by = treatment ) # If I have incorporated the 'treatment' in the glm_gp() above, do I still have to add the treatment into the 'pseudobulk_by' parameter?
Thank you.