haowulab / DSS


DSS DMLtest with smoothing not consistent between DSS versions. #16

Open Christian-Heyer opened 3 years ago

Christian-Heyer commented 3 years ago

I have been using DSS for my WGBS data and am getting results that are inconsistent with those from an older DSS version that had previously been used on the same data. While investigating, I found that the DMLtest results with smoothing differ substantially between some DSS versions, which I was able to reproduce using the test data from the vignette.

If I run the following code, once with R 3.4.2/DSS 2.26.0 and once with R 4.0.3/DSS 2.38.0:

library(DSS)
library(bsseq)

# Load the example count tables shipped with DSS
path = file.path(system.file(package="DSS"), "extdata")
dat1.1 = read.table(file.path(path, "cond1_1.txt"), header=TRUE)
dat1.2 = read.table(file.path(path, "cond1_2.txt"), header=TRUE)
dat2.1 = read.table(file.path(path, "cond2_1.txt"), header=TRUE)
dat2.2 = read.table(file.path(path, "cond2_2.txt"), header=TRUE)

# Build the BSseq object: two replicates per condition
BSobj = makeBSseqData( list(dat1.1, dat1.2, dat2.1, dat2.2),
     c("C1","C2", "N1", "N2") )

# DML test with smoothing enabled
dmlTest.sm = DMLtest(BSobj, group1=c("C1", "C2"), group2=c("N1", "N2"), 
                     smoothing=TRUE, equal.disp = FALSE, smoothing.span = 500)

# Distribution of the resulting p-values
hist(dmlTest.sm$pval)

[Two images: p-value histograms, one from each R/DSS combination]

I get two very different p-value distributions for the DMLs. A similar difference in the p-value distribution occurred with my own data, and the results change significantly between versions when run with identical parameters.

When running DSS without smoothing, the p-value distributions are identical between versions, so that seems to be fine.
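
For anyone who wants to quantify the difference rather than compare histograms by eye, here is a minimal sketch in base R. `compare_pvals` is a hypothetical helper, not part of DSS, and it assumes the two result tables list the same CpG sites in the same row order (worth checking against the `chr` and `pos` columns first). The file names in the comments are illustrative only.

```r
# Hypothetical helper (not part of DSS): summarise how two p-value vectors
# for the same loci differ between two runs.
compare_pvals <- function(p_a, p_b, alpha = 0.05) {
  stopifnot(length(p_a) == length(p_b))
  ok <- !is.na(p_a) & !is.na(p_b)  # skip loci with a missing p-value in either run
  list(
    n_compared   = sum(ok),
    identical    = isTRUE(all.equal(p_a[ok], p_b[ok])),
    max_abs_diff = if (any(ok)) max(abs(p_a[ok] - p_b[ok])) else NA_real_,
    n_sig_a      = sum(p_a[ok] < alpha),  # loci below alpha in run A
    n_sig_b      = sum(p_b[ok] < alpha)   # loci below alpha in run B
  )
}

# In each R installation, after running DMLtest as above:
#   saveRDS(dmlTest.sm,
#           paste0("dmlTest_sm_DSS_", as.character(packageVersion("DSS")), ".rds"))
# Then, in a single session:
#   old <- readRDS("dmlTest_sm_DSS_2.26.0.rds")
#   new <- readRDS("dmlTest_sm_DSS_2.38.0.rds")
#   compare_pvals(old$pval, new$pval)
```

Running the same comparison on the output with `smoothing=FALSE` should report `identical = TRUE` if the unsmoothed results really do match between versions.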

Have there been any major changes to smoothing in DSS between these versions? Judging by the p-value distributions, the older version seems to be more robust. I would appreciate any input on what may be going on with DSS here.

PengNi commented 2 years ago

Hi authors,

I ran into the same issue. In my tests with smoothing, DSS 2.28.0, 2.30.0, and 2.42.0 give the same results as each other, while DSS 2.34.0, 2.36.0, 2.38.0, and 2.40.0 give the same results as each other. So I wonder which version should be used when smoothing must be set to TRUE.
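
To check which of these two groups an installed version falls into, a small lookup like the following could be used. This is only a sketch: `reported_groups` and `smoothing_group` are hypothetical names, the group labels "A" and "B" are arbitrary, and the table contains only the versions listed in this comment. Anything not listed is returned as `NA` rather than guessed.

```r
# Version groups as reported in this thread for smoothing = TRUE.
# Labels "A" and "B" are arbitrary; unlisted versions are deliberately absent.
reported_groups <- list(
  A = c("2.28.0", "2.30.0", "2.42.0"),
  B = c("2.34.0", "2.36.0", "2.38.0", "2.40.0")
)

# Hypothetical helper: return the reported group for one version, or NA.
smoothing_group <- function(version) {
  v <- as.character(version)
  in_group <- vapply(reported_groups, function(g) v %in% g, logical(1))
  hit <- names(reported_groups)[in_group]
  if (length(hit) == 1) hit else NA_character_
}

# Example, in a session where DSS is installed:
#   smoothing_group(packageVersion("DSS"))
```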

haowulab commented 2 years ago

Please use the latest version. I changed how the variance is computed when smoothing is used.