hansenlab / bsseq

Devel repository for bsseq
36 stars 26 forks source link

Error in dmrFinder: 'stat' needs to be a column of 'bstat@stats' #103

Open RRebo opened 3 years ago

RRebo commented 3 years ago

Hi, I'm struggling to understand what I did wrong so I am sorry for these naïve recurrent questions! I am getting the error 'stat' needs to be a column of 'bstat@stats' when searching for DMRs. Here is the break down of what I have done : I have 9 bismark cytosine reports, from 3 triplicates. Below an example of my workflow using two triplicates.

#import bismark cytosine report, all samples together
bsseq=read.bismark(files = c("St-1-calyx-57.txt.gz","St-1-calyx-60.txt.gz","St-1-calyx-63.txt.gz","St-3-calyx-37.txt.gz","St-3-calyx-58.txt.gz","St-3-calyx-60.txt.gz"),
             loci = NULL,
             colData = NULL,
             rmZeroCov = FALSE,
             strandCollapse = TRUE,
             BACKEND = "HDF5Array",
             dir=tempfile("BSseq"),
             replace = TRUE,
             chunkdim = NULL,
             level = NULL,
             nThread = 1L,
             verbose = getOption("verbose")
          )

bsseq

An object of type 'BSseq' with 12709820 methylation loci 6 samples has not been smoothed Some assays are HDF5Array-backed

pData(bsseq)

DataFrame with 6 rows and 0 columns

bsseq.smooth = BSmooth(bsseq,
               h=200,
               ns=3,
               verbose = getOption("verbose"))

An object of type 'BSseq' with 12709820 methylation loci 6 samples has been smoothed with BSmooth (ns = 3, h = 200, maxGap = 100000000) Some assays are HDF5Array-backed

#get coverage
BS.cov = getCoverage(bsseq.smooth)

#keep only CpGs covered at least once in all samples
keepLoci = (rowSums(BS.cov >= 1) >= 6)
bsseq.smooth.subset01 = bsseq.smooth[ keepLoci, ]

bsseq.smooth.subset01

An object of type 'BSseq' with 3228018 methylation loci 6 samples has been smoothed with BSmooth (ns = 3, h = 200, maxGap = 100000000) Some assays are HDF5Array-backed

bsseq.calyx.tstat <- BSmooth.tstat(bsseq.smooth.subset01, 
                                    group1 = c("St-1-calyx-57.txt.gz", "St-1-calyx-60.txt.gz", "St-1-calyx-63.txt.gz"),
                                    group2 = c("St-3-calyx-37.txt.gz", "St-3-calyx-58.txt.gz", "St-3-calyx-60.txt.gz"), 
                                    estimate.var = "group2",
                                    local.correct = FALSE,
                                    verbose = TRUE)

[BSmooth.tstat] preprocessing ... done in 5127.4 sec [BSmooth.tstat] computing stats within groups ... done in 4.8 sec [BSmooth.tstat] computing stats across groups ... done in 14.8 sec

bsseq.calyx.tstat

An object of type 'BSseqTstat' with 3228018 methylation loci based on smoothed data: BSmooth (ns = 3, h = 200, maxGap = 100000000) with parameters BSmooth.tstat (local.correct = FALSE, maxGap = 100000000)

plot(bsseq.calyx.tstat)

First error I get:

Error in density.default(tstat) : 'x' contains missing values

dmrs0 <- dmrFinder(bsseq.calyx.tstat, cutoff = c(-4.6, 4.6))

And the dmr error:

Error in dmrFinder(bsseq.calyx.tstat, cutoff = c(-4.6, 4.6)) : 'stat' needs to be a column of 'bstat@stats'

So, what am I doing wrong??

Thank you

Rita

PeteHaitch commented 3 years ago

Hi Rita,

What are you trying to plot with plot(bsseq.calyx.tstat)? bsseq doesn't have a plot() function but does have plotting capabilities such as plotRegion(); perhaps you meant to use one of these?

As for the second error; are you able to share the bsseq.calyx.tstat object?

Cheers, Pete