kdkorthauer / dmrseq

R package for Inference of differentially methylated regions (DMRs) from bisulfite sequencing
MIT License

Error in names(df2) <- paste0("beta_", levels(as.factor(dat$g.fac))[-1]) #26

Closed: MohamedRefaat92 closed this issue 5 years ago

MohamedRefaat92 commented 5 years ago

Dear Developers,

I am running the development version of dmrseq on a sparse WGBS dataset, and I get the following error at the start of the permutation step:

Beginning permutation 1 ...Chromosome chr1: Smoothed (0.99 min). 302 CpG(s) excluded due to zero coverage.
Error in names(df2) <- paste0("beta_", levels(as.factor(dat$g.fac))[-1]) :
  'names' attribute [4] must be the same length as the vector [1]
Calls: dmrseq ... regionScanner -> do.call -> lapply -> lapply -> FUN -> asin.gls.cov

I would appreciate it if you could explain the cause of this error.

Best, Mohamed

kdkorthauer commented 5 years ago

Hi @MohamedRefaat92,

Thanks for reaching out.

I'm not able to reproduce that error. Would it be possible for you to send me a small subset of your bsseq object as an .rds file? Just chromosome 1 should do. You can send via email to keegan@jimmy.harvard.edu. This will help me track down the problem.

In addition, could you also send me the code used to generate the error?

Best, Keegan

kdkorthauer commented 5 years ago

Hi @MohamedRefaat92,

After thinking some more about what may cause this, I think I may have tracked down the problem - there was a bug in how regions were being filtered if they have constant methylation values across all samples when using a continuous covariate.
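As a generic illustration of the R behaviour behind the original message (a minimal sketch with made-up values, not dmrseq's internal code): as.factor() applied to a continuous covariate produces one level per distinct value, so names built from those levels can outnumber the columns they are assigned to. The objects g.fac, df2 and new_names below are hypothetical stand-ins.

```r
# Hypothetical continuous covariate with five distinct values
# (illustrative only; not taken from the reported dataset).
g.fac <- c(0.1, 0.4, 0.5, 0.7, 0.9)

# A one-column data frame standing in for a single fitted coefficient.
df2 <- data.frame(est = 0.25)

# Five levels minus the reference level gives four candidate names.
new_names <- paste0("beta_", levels(as.factor(g.fac))[-1])

# Assigning four names to a one-column data frame fails;
# capture the error message instead of stopping the script.
msg <- tryCatch({
  names(df2) <- new_names
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

The failure depends only on the length mismatch between the names and the columns, which is why it can surface far from the code that built the data frame.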

I've just pushed a fix (available on GitHub immediately, or through Bioconductor Release and Devel in the next day or two). Can you reinstall the new version and see if that fixes the problem? If not, then it would be helpful to send me the example object and code as mentioned in my previous message.

Best, Keegan

MohamedRefaat92 commented 5 years ago

Hi @kdkorthauer

I followed your instructions: I first installed the new version with devtools::install_github("kdkorthauer/dmrseq") and then re-ran the analysis. Unfortunately, the problem persists, but with a slightly different error:

Error in asin.gls.cov(ix = ind[Index], design = design, coeff = coeff) :
  promise already under evaluation: recursive default argument reference or earlier problems?
Calls: dmrseq ... regionScanner -> do.call -> lapply -> lapply -> FUN -> asin.gls.cov
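For context, R typically raises this message when an argument's default value refers to the argument itself, so evaluating the default re-enters its own promise. The sketch below reproduces the message with a hypothetical function, scan_region; it is not the dmrseq code path, only an illustration of the general mechanism.

```r
# Hypothetical function whose default argument refers to itself.
# Evaluating `coeff` forces its default, which is again `coeff`.
scan_region <- function(coeff = coeff) {
  coeff + 1
}

# Calling without the argument triggers the recursive default;
# capture the error message instead of stopping the script.
msg <- tryCatch(scan_region(), error = function(e) conditionMessage(e))
print(msg)

# Supplying the argument explicitly avoids the self-reference.
ok <- scan_region(coeff = 2)
print(ok)
```

The usual remedy in such cases is to give the default a different name from the argument, or to pass the value explicitly at every call site.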

The example object was sent to your email. Thanks for your support.

Best regards, Mohamed

kdkorthauer commented 5 years ago

Hi Mohamed,

Thank you for uploading the chromosome 14 object to the Dropbox link I provided. I was able to track down the issue and just pushed a fix (the bug was introduced by a recent commit). Using the latest version, I am able to run dmrseq successfully on the object. The latest versions are available on GitHub immediately and through Bioc Devel and Release in the next couple of days. Please let me know if you continue to have any troubles.

As a side note, it seems that the object you sent might have been created using an earlier version of bsseq since the file size is so large compared to the number of loci in the object (see this related issue: https://github.com/hansenlab/bsseq/issues/75). I was able to rebuild/resave the object by building it with the BSseq function like so:

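library(bsseq)
# Load the original (oversized) object
bs <- readRDS("Mohamed Shoeb - chr14.rds")
# Rebuild it from its methylated counts, coverage, positions and chromosomes
bs_new <- BSseq(M = getCoverage(bs, type = "M"),
                Cov = getCoverage(bs, type = "Cov"),
                pos = start(bs),
                chr = as.character(seqnames(bs)))
# Carry over the sample metadata
pData(bs_new) <- pData(bs)
# Save the rebuilt object
saveRDS(bs_new, file = "exData_MohamedShoeb.rds")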

The resulting .rds file is only 56 MB (more than 300 times smaller than the original 19 GB).

Best, Keegan

MohamedRefaat92 commented 5 years ago

Hi Keegan,

Thanks for your effort and overall input. After installing the GitHub version of dmrseq, that error seems to be resolved. However, I now get a new error:

Error in match.names(clabs, names(xi)) : names do not match previous names

After searching for the error, I found that it was mentioned in previous issues (kdkorthauer/dmrseq#25 and kdkorthauer/dmrseq#22).

Do you have any idea what might be causing this error?

Best, Mohamed

kdkorthauer commented 5 years ago

Hi Mohamed,

Thanks for catching this. Yes, you are correct that this is related to issues #25 and #22, and your data presented yet another case that I hadn't anticipated due to the sparsity of the dataset. I have just pushed a fix for this, and hopefully things will go smoothly for you now.
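For readers who hit the same message elsewhere: in base R it comes from rbind() on data frames whose column names differ. The sketch below uses two hypothetical one-row result tables (first and second, with invented column names) to show the failure and one way around it; it is an illustration of the general mechanism, not the change made in dmrseq.

```r
# Two hypothetical per-region result tables whose coefficient
# columns were named after different factor levels.
first  <- data.frame(beta_groupB = 0.2, stat = 1.5)
second <- data.frame(beta_groupC = 0.1, stat = 0.7)

# Row-binding data frames with mismatched column names fails;
# capture the error message instead of stopping the script.
msg <- tryCatch(rbind(first, second), error = function(e) conditionMessage(e))
print(msg)

# Giving both tables the same column names lets rbind() succeed.
names(first)  <- c("beta", "stat")
names(second) <- c("beta", "stat")
combined <- rbind(first, second)
print(combined)
```

Whether renaming is appropriate depends on what the columns mean; here it only shows that rbind() needs identical names across its inputs.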

Sorry for the trouble and thanks again!

Best, Keegan

MohamedRefaat92 commented 5 years ago

Hi Keegan,

I believe it's working now! Thank you for your effort in maintaining the package.

Best, Mohamed Shoeb