haowulab / DSS

Corroborating DSS DMRs #4

Open jckhearn opened 5 years ago

jckhearn commented 5 years ago

Hi,

I have analysed a dataset with three factors of interest and their interactions using the GLM approach in DSS, and I am interested in the significance of the resulting DMRs. Would it be valid to apply the comb-p method (https://github.com/brentp/combined-pvalues), which aggregates per-CpG p-values, post hoc? I was thinking of running it on the unadjusted p-values from each DMLtest.multiFactor call and assessing the overlap with the DMRs identified by DSS. Do you have any concerns about this approach?
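For concreteness, here is a minimal Python sketch of the two bookkeeping steps this comparison would need: turning per-CpG p-values into sorted BED rows (comb-p reads BED-style input) and listing which DSS DMRs overlap a comb-p region. The function names are hypothetical, and the sketch assumes the per-CpG results have been exported from R as (chromosome, 1-based position, p-value) tuples and that both region sets use half-open coordinates. It does not call DSS or comb-p itself.

```python
def to_bed_rows(sites):
    """Convert (chrom, pos, pvalue) tuples to sorted BED rows.

    Assumption: pos is 1-based, so the BED interval is [pos - 1, pos).
    """
    rows = [(chrom, pos - 1, pos, p) for chrom, pos, p in sites]
    rows.sort(key=lambda r: (r[0], r[1]))  # sort by chromosome, then start
    return rows


def format_bed(rows):
    """Render rows as tab-separated lines: chrom, start, end, p-value."""
    return ["%s\t%d\t%d\t%.6g" % r for r in rows]


def overlaps(a, b):
    """True if two half-open regions (chrom, start, end) share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def corroborated(dss_dmrs, combp_regions):
    """Return the DSS DMRs that overlap at least one comb-p region.

    Quadratic in the number of regions; fine for a sketch, but an
    interval index would be preferable for genome-scale inputs.
    """
    return [d for d in dss_dmrs
            if any(overlaps(d, r) for r in combp_regions)]


# Toy data, invented purely for illustration.
sites = [("chr1", 200, 0.01), ("chr1", 100, 0.5)]
bed_lines = format_bed(to_bed_rows(sites))

dss = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
combp = [("chr1", 150, 250), ("chr2", 300, 400)]
shared = corroborated(dss, combp)
```

Overlap alone says nothing about whether either set of regions is well calibrated; it only shows where the two methods agree.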

The experimental design is:

Food Age Strain
H O C32
H O C32
H O C32
H O C32
H O C32
H O C32
L O C32
L O C32
L O C32
L O C32
L O C32
L O C32
H Y C32
H Y C32
H Y C32
H Y C32
H Y C32
H Y C32
L Y C32
L Y C32
L Y C32
L Y C32
L Y C32
L Y C32
H O KA53
H O KA53
H O KA53
H O KA53
H O KA53
H O KA53
L O KA53
L O KA53
L O KA53
L O KA53
L O KA53
L O KA53
H Y KA53
H Y KA53
H Y KA53
H Y KA53
H Y KA53
H Y KA53
L Y KA53
L Y KA53
L Y KA53
L Y KA53
L Y KA53
L Y KA53

And model specification in DSS:

DMLfit = DMLfit.multiFactor(BSobj.filtered, design=design, formula=~Strain+Food+Age+Strain:Food+Strain:Age+Age:Food+Strain:Food:Age)

Secondly, as I understand it, the GLM approach does not use smoothing, so I can use a lower p-value threshold than the default when joining CpGs into DMRs.
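To illustrate why the threshold matters for region calling, here is a deliberately simplified Python sketch that joins neighbouring CpGs whose p-values fall below a cutoff. It is not DSS's callDMR algorithm: callDMR has a p.threshold argument but also applies further criteria (for example minlen, dis.merge and pct.sig) that are ignored here. The function name, the parameters max_gap and min_sites, and the toy data are all hypothetical.

```python
def join_sites(sites, p_threshold, max_gap=100, min_sites=3):
    """Join runs of significant CpGs on one chromosome into regions.

    sites: list of (pos, pvalue), sorted by position.
    A run ends at a non-significant CpG or at a gap larger than
    max_gap; runs with fewer than min_sites CpGs are discarded.
    Returns a list of (start, end, n_sites).
    """
    regions, run = [], []

    def flush():
        # Keep the current run only if it has enough CpGs.
        if len(run) >= min_sites:
            regions.append((run[0], run[-1], len(run)))
        run.clear()

    for pos, p in sites:
        significant = p < p_threshold
        if run and (not significant or pos - run[-1] > max_gap):
            flush()
        if significant:
            run.append(pos)
    flush()
    return regions


# Toy p-values, invented for illustration only.
toy = [(10, 1e-6), (20, 1e-4), (30, 1e-6), (40, 1e-6),
       (50, 1e-6), (300, 1e-6)]

strict = join_sites(toy, p_threshold=1e-5)
lenient = join_sites(toy, p_threshold=1e-3)
```

In this toy case the stricter cutoff splits the run at the CpG with p = 1e-4 and keeps only the three CpGs after it, while the more lenient cutoff joins all five neighbouring CpGs into one region. The isolated CpG at position 300 is dropped under both settings.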

Best wishes, Jack

haowulab commented 5 years ago

Jack,

I have never used this package and don't know the method, so I really can't comment on it. I briefly looked at their paper, and it sounds reasonable. However, without careful investigation (such as simulation), I cannot make a definite recommendation. You can certainly try it. I will also look into it more carefully and run some simulations to evaluate it.

Hao

jckhearn commented 5 years ago

Great! I look forward to hearing the simulation results. Thanks, Jack

haowulab commented 5 years ago

I won't have time to evaluate it in the next few weeks; I'm too busy with other things. I'll find some time in the summer. In the meantime, you can try it on your data to see whether the results make sense.