Closed bazyliszek closed 4 years ago
Hi Marcin, We currently do not support any other datatype than obtained from RnBeads. Thanks for pointing this out, we can work on also supporting this. We'll let you know if we have some further development on that. As a workaround, you could transform an dmrseq object to an RnBiseqSet object, which then can be used with our package. Best, Michael
Thank you Michael, I will try transformation and will give it try. I will keep this open for few more days, in case there are problems with a structure. I am sure others would like to try this approach as well. Cheer, Marcin
Good point, thanks. I would then not forget about our new package methrix:
So trying transformation:
pheno <- bs.filtered@colData sites <- bs.filtered@rowRanges meth <- bs.filtered@assays@data@listData$M/bs.filtered@assays@data@listData$Cov coverage <- bs.filtered@assays@data@listData$Cov
transobject <- RnBiseqSet(pheno, sites, meth, covg = coverage, assembly = "hg38")
I converted all objects from data frames to matrix and getting this error:
Warning: All sites have been removed, returning NULL
What is the exact structure that pheno and sites suppose to be?
Both pheno and sites should be data frames, so I guess that's not the problem. It might be that our internal structure does not match the annotation that you provided. For instance, we use "chr1" rather than "1" for the chromosomes and CpGs should have length 1 (i.e. start 10496, end 10497) and strands are typically merged. Could you post the head of each of the matrices than it might be easier for me to figure out the problem?
Great. Thank you. Here it comes:
RnBiseqSet(pheno, sites, meth, covg = coverage, assembly = "hg38") Warning: All sites have been removed, returning NULL
class(sites) [1] "data.frame" class(pheno) [1] "data.frame" class(meth) [1] "matrix" class(coverage) [1] "matrix" head(pheno,2) condition BDg sample ADF1 A GrpB ADF1 ADF2 A GrpA ADF2 head(sites,2) seqnames start end width strand index 1 chr1 16243 16243 1 1 2 chr1 16555 16555 1 2 meth[1:2, 1:2] [,1] [,2] [1,] 0.75 0.7058824 [2,] 1.00 0.8666667 coverage[1:2, 1:2] [,1] [,2] [1,] 8 17 [2,] 8 15
The sites matrix should have three columns with the names chromosome, position and strand as described in ?RnBiseqSeq of the RnBeads documentation:
sites
CpG site definition, as a data.frame with 3 variables: chromosome (of type character), position (integer) and strand (character, one of "+", "-" or "*"
Hope that solves the issue.
Looks like it did help indeed for small object (I was testing on 100000). Thank you. However when I run all 22 millions, I am getting error. Maybe I just have a problem now with server, I do not expect to have any NA in any of the matrix.
NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal, NAreal)) 3: update.ff(ret, from = initdata, delete = FALSE, bydim = bydim, BATCHSIZE = BATCHSIZE, BATCHBYTES = BATCHBYTES, VERBOSE = VERBOSE) 4: ff(rep(na.prototype, row.n), ...) 5: FUN(X[[i]], ...) 6: lapply(1:col.n, FUN = function(j) { res <- ff(rep(na.prototype, row.n), ...) if (active.closing) close.ff(res) res}) 7: lapply(1:col.n, FUN = function(j) { res <- ff(rep(na.prototype, row.n), ...) if (active.closing) close.ff(res) res}) 8: BigFfMat(row.n = nrow(region.indices), col.n = nSamples, col.names = samples(object), finalizer = bff.finalizer) 9: .local(object, ...) 10: summarize.regions(object, "strands", aggregation = aggr) 11: summarize.regions(object, "strands", aggregation = aggr) 12: RnBiseqSet(pheno, sites, meth, covg = coverage, assembly = "hg38")
Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace Selection:
It looks as if you don't have sufficient main memory for this analysis. It might be possible to solve this by setting the option: rnb.options("disk.dump.bigff"=FALSE)
great. I needed to put these two to work and now I have the object ready rnb.option("disk.dump.bigff"=FALSE) rnb.option("disk.dump.big.matrices"=FALSE) Thank you!
Nice development. I notice that WGBS data can be also used here for deconvolution method. rnb.set <- load.rnb.set(system.file("extdata/small_rnbSet.zip",package="DecompPipeline"))
Can other object like bs objects from dmrseq (https://github.com/kdkorthauer/dmrseq) be imported at this step? Would be nice to test performance of this method and compare to methylCC.
Cheers, Marcin