Closed littleju777 closed 1 year ago
Wrong coefficient. With your design matrix, the perturbed_genegScramble
coefficient represents the average log-expression in the scramble group. So, if you set it as coef=
, you're testing the null hypothesis that the average log-expression for scramble is... equal to zero, which is not a very interesting hypothesis.
The null hypothesis that you want is, "the scramble and knockdown groups are the same". You can achieve this by providing a contrast=
IIRC, or you can just use a design matrix like ~ perturbed_gene
where the non-intercept coefficient directly represents the difference in log-expression (i.e., the log-fold change) between groups.
Thank you so much for the help!! I tried your two suggestions, it works perfectly. Thank you!
Best, Ming
Hi all,
I would like to compute, for each cell type("Cluster" column), which genes are differentially expressed between gZscan20 and gSramble("perturbed_gene" column) pseudosamples.
Here is my code:
it comes to only have down regugulated genes -1 0 NA Cellcycle 2682 0 29603 Eff 2547 0 29738 Exh 2313 0 29972 Int 3918 135 28232 Pro 4698 0 27587
Then I change the coef to another group:
the results remain the same: -1 0 NA Cellcycle 2682 0 29603 Eff 2547 0 29738 Exh 2313 0 29972 Int 3918 135 28232 Pro 4698 0 27587
Then I change it to :
the results change to this: -1 0 1 NA Cellcycle 0 2681 1 29603 Eff 1 2546 0 29738 Exh 0 2313 0 29972 Int 3 4045 5 28232 Pro 1 4697 0 27587
I'm a bit confused about it. Basically, I have 3 questions:
Thank you so much!!!
Ming