kdkorthauer / dmrseq

R package for Inference of differentially methylated regions (DMRs) from bisulfite sequencing
MIT License

names do not match previous names #25

Closed fhalbritter closed 5 years ago

fhalbritter commented 5 years ago

Dear developers,

I am trying to run dmrseq on one of my low-coverage WGBS datasets (many NAs; I don't know whether this is relevant here). In the test case I'm discussing here, there is one tested covariate with three levels and 2 samples per level.

I keep on getting this error:

Detecting candidate regions with coefficient larger than 0.1 in magnitude.
...Chromosome chr1: Smoothed (0.52 min). 8250 regions scored (4.8 min).
...Chromosome chr10: Smoothed (0.36 min). 6866 regions scored (4.75 min).
...Chromosome chr11: Smoothed (0.36 min). 9300 regions scored (6.63 min).
...Chromosome chr12: Smoothed (0.3 min). 5670 regions scored (4.01 min).
...Chromosome chr13: Smoothed (0.35 min). 5860 regions scored (3.34 min).
...Chromosome chr14: Smoothed (0.3 min). Error in match.names(clabs, names(xi)) :
  names do not match previous names

When I exclude chr14 (and a few others that run into the same error), I still encounter the same error in a later permutation, e.g. (abridged):

Beginning permutation 6
...Chromosome chr1: Smoothed (2.26 min). 8053 regions scored (2.77 min).
...Chromosome chr10: Smoothed (2.29 min). 6702 regions scored (2.92 min).
...Chromosome chr11: Smoothed (2.27 min). 9087 regions scored (3.57 min).
...Chromosome chr12: Smoothed (2.25 min). 5533 regions scored (3.24 min).
...Chromosome chr13: Smoothed (2.26 min). 5738 regions scored (2.59 min).
...Chromosome chr15: Smoothed (2.24 min). 5286 regions scored (2.73 min).
...Chromosome chr16: Smoothed (2.22 min). 4417 regions scored (2.53 min).
...Chromosome chr17: Smoothed (2.19 min). Error in match.names(clabs, names(xi)) :
  names do not match previous names
Calls: run ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>

I'm using the dmrseq(bs, groupVar) function with all default parameters and have previously excluded CpGs that have zero coverage in at least one sample, as suggested in the tutorial (which(DelayedMatrixStats::rowSums2(getCoverage(bs, type="Cov")==0) == 0)).
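For readers unfamiliar with the filter quoted above, here is a minimal base-R sketch of the same logic on a made-up coverage matrix. The matrix `cov` and its values are hypothetical; on a real BSseq object the coverage would come from getCoverage(bs, type="Cov") as in the tutorial expression.

```r
# Toy coverage matrix (hypothetical values): rows are CpG loci, columns are samples.
cov <- matrix(
  c(3, 0, 5,
    2, 4, 1,
    0, 0, 0,
    7, 1, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cpg", 1:4), paste0("s", 1:3))
)

# For each locus, count the samples with zero coverage,
# then keep only loci where that count is 0 (covered in every sample).
keep <- which(rowSums(cov == 0) == 0)
cov_filtered <- cov[keep, , drop = FALSE]

print(rownames(cov_filtered))
```

Only loci covered in every sample survive this filter, so a low-coverage dataset can lose a large share of its CpGs at this step.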

Can you advise?

Thanks and regards, Florian

kdkorthauer commented 5 years ago

Hi Florian,

Can you verify which version of dmrseq you are using, and check that all other packages are up to date (using BiocManager::valid())?

If you are using a current version with updated dependencies and the error still persists, would you be able to send along a subset of the BSseq object you are using as an .rds file (either linked here or via email to keegan at jimmy.harvard.edu)? That will help me diagnose the error.

Best, Keegan

fhalbritter commented 5 years ago

Dear Keegan,

Many thanks for your reply. As suggested, I first made sure to update all out-of-date dependencies (there were a few). I had been using dmrseq version 1.4.2, and I've now updated to the latest GitHub release, 1.5.3 (devtools::install_github("kdkorthauer/dmrseq")).

> BiocManager::valid()
[1] TRUE
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

[...]

dmrseq_1.5.3
bsseq_1.20.0
SummarizedExperiment_1.14.0
DelayedArray_0.10.0
BiocParallel_1.18.0
matrixStats_0.54.0
annotatr_1.10.0
TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.7
GenomicFeatures_1.36.1
AnnotationDbi_1.46.0
Biobase_2.44.0
GenomicRanges_1.36.0
GenomeInfoDb_1.20.0
IRanges_2.18.0
S4Vectors_0.22.0
BiocGenerics_0.30.0
data.table_1.12.2

Unfortunately, I'm still running into the same error. As described, this happens reproducibly for chr14. I will send you an excerpt of my dataset for this chromosome by email.

Thank you so much for your help!

Best, Florian

kdkorthauer commented 5 years ago

Hi Florian,

Thank you for sending the additional details and example dataset.

I was able to reproduce the error and have tracked down the issue. The reference category of the test covariate was not being assigned consistently in the rare case of regions that are handled specially and filtered out because they have constant methylation across all sites and samples. I've just pushed a fix, which is available immediately on GitHub and should be available within a day or two from Bioc devel (3.10) and release (3.9).
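To illustrate the mechanism in general terms, here is a minimal base-R sketch. It is not the dmrseq code; it only assumes that per-region results are at some point combined with rbind(), and the helpers `coef_names` and `make_row` and all numeric values are hypothetical. It shows that changing the reference level of a factor changes the coefficient names, and that row-binding data frames with mismatched names raises this error.

```r
# Hypothetical three-level covariate with two samples per level.
group <- factor(rep(c("A", "B", "C"), each = 2))

# Coefficient names of a design matrix depend on the reference level.
coef_names <- function(g) colnames(model.matrix(~ g))

# Build a one-row data frame of made-up estimates, named by the
# non-intercept coefficients (illustrative helper, not from dmrseq).
make_row <- function(g, est) {
  nm <- coef_names(g)[-1]  # drop "(Intercept)"
  as.data.frame(as.list(setNames(est, nm)))
}

row_default   <- make_row(group, c(0.2, 0.3))                       # reference "A"
row_releveled <- make_row(relevel(group, ref = "B"), c(-0.2, 0.1))  # reference "B"

print(names(row_default))
print(names(row_releveled))

# Row-binding results whose column names differ fails in base R.
err <- tryCatch(
  rbind(row_default, row_releveled),
  error = function(e) conditionMessage(e)
)
print(err)
```

Under these assumptions, a single region fitted with a different reference level is enough to make the combined result fail, which would be consistent with the error appearing only on some chromosomes or permutations.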

Thanks again for reporting this issue, and let me know if you run into any more trouble.

Best, Keegan