Open faniafeby opened 3 years ago
Hi @faniafeby, could you post the unique rows of your sample description, i.e. adata_lcpm_1.obs[["time_point", "sample"]].drop_duplicates()? Likely there is confounding between time and sample.
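A minimal sketch of the check suggested above, using a small made-up table in place of the real adata_lcpm_1.obs (the values and the variable names unique_rows, time_points_per_sample and nested are hypothetical). It lists the unique (time_point, sample) pairs and tests whether every sample occurs in only one time point, which is one way such confounding shows up:

```python
import pandas as pd

# Hypothetical stand-in for adata_lcpm_1.obs: one row per cell.
obs = pd.DataFrame(
    {
        "time_point": ["p16", "p16", "p16", "adult", "adult", "p16"],
        "sample": ["S1", "S2", "S3", "S4", "S4", "S1"],
    }
)

# Unique (time_point, sample) combinations, as requested in the thread.
unique_rows = obs[["time_point", "sample"]].drop_duplicates()
print(unique_rows)

# If every sample is seen in exactly one time point, sample is nested in
# time_point and the two factors cannot be separated by a main-effects model.
time_points_per_sample = unique_rows.groupby("sample")["time_point"].nunique()
nested = bool((time_points_per_sample == 1).all())
print("sample nested in time_point:", nested)
```

With real data, obs would simply be adata_lcpm_1.obs; the rest of the check stays the same.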
Hi @davidsebfischer, below are the unique rows of my dataset:
After removing the duplicates, this table shows only 4 rows out of my n_obs × n_vars = 19330 × 16709. I agree that there may be confounding between time and sample. Does that mean I can't use both factors together in one run? Thanks!
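One way to see why both factors may fail together is to look at the rank of the design matrix. The sketch below is an illustration only: it builds a dummy-coded matrix for an intercept plus time_point plus sample by hand with pandas and NumPy on a made-up layout (four samples, each belonging to a single time point). It does not reproduce how diffxpy builds its design matrix internally, and the names design, X and rank are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical layout: every sample belongs to exactly one time point.
design = pd.DataFrame(
    {
        "time_point": ["p16", "p16", "p16", "adult"] * 2,
        "sample": ["S1", "S2", "S3", "S4"] * 2,
    }
)

# Treatment (dummy) coding with the first level of each factor as reference.
time_d = pd.get_dummies(design["time_point"], drop_first=True, dtype=float)
sample_d = pd.get_dummies(design["sample"], drop_first=True, dtype=float)

# Intercept + time_point + sample columns.
X = np.column_stack(
    [np.ones(len(design)), time_d.to_numpy(), sample_d.to_numpy()]
)

rank = np.linalg.matrix_rank(X)
print("columns:", X.shape[1], "rank:", rank)
# A rank below the number of columns means some coefficients are not
# identifiable, i.e. the factors are confounded.
```

Here the time point column can be written as a combination of the intercept and the sample columns, so the model has more coefficients than it can estimate.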
Yes, you have to decide what you want to model: the time effect alone, or the time effect while reducing the between-sample variance. If you want the latter, a trick for running GLMs is to change your setup to

time point, sample, rep
p16, S1, R1
p16, S2, R2
p16, S3, R3
adult, S4, R1

and fit ~1+time+rep+rep:time, which regresses out the variation between R1, R2 and R3.
Since my purpose is the latter, should I create a new obs column to represent the sample and time point combination, and then run diffxpy as you described?
Thank you for the help!
You can just add the rep column to .obs, you don't have to recreate it!
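A possible way to add such a rep column, sketched with pandas on a made-up table (the helper add_rep and the example values are hypothetical, not part of diffxpy). It numbers the samples within each time point in order of appearance, so the first sample of every time point becomes R1, the second R2, and so on:

```python
import pandas as pd


def add_rep(obs, time_col="time_point", sample_col="sample"):
    """Return a copy of obs with a 'rep' column numbering samples per time point."""
    out = obs.copy()
    rep = pd.Series(index=out.index, dtype=object)
    for _, idx in out.groupby(time_col).groups.items():
        # factorize assigns 0, 1, 2, ... in order of first appearance
        codes, _ = pd.factorize(out.loc[idx, sample_col])
        rep.loc[idx] = ["R%d" % (code + 1) for code in codes]
    out["rep"] = rep
    return out


# Hypothetical stand-in for adata.obs: one row per cell.
obs = pd.DataFrame(
    {
        "time_point": ["p16", "p16", "p16", "adult", "p16", "adult"],
        "sample": ["S1", "S2", "S3", "S4", "S1", "S4"],
    }
)

obs_with_rep = add_rep(obs)
print(obs_with_rep.drop_duplicates())
```

On an AnnData object the result could be assigned back with something like adata.obs["rep"] = add_rep(adata.obs)["rep"]. The numbering works for any number of samples per time point, but which sample receives which label is arbitrary.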
Same issue. Is there a way to generalize this trick if I have 8 samples for young and 8 samples for old groups (16 unique groups in total)? I think it more resembles the case with embedded effects.
Hello,
I am currently using diffxpy for my differential analysis and tried using two factors, "time_point" and "sample", in my formula_loc. My data consist of 2 time points (juvenile & adult) and 7 samples for those two time points. But when I run the code, I get the following error:

I have found a similar issue here, but it was resolved by using the as_numeric parameter. However, the 'sample' factor is categorical, so it can't be resolved by that method. Could you help me resolve this problem? Thank you!

I originally posted this in the tutorial GitHub repository, but it belongs here.