hansenlab / minfi

Devel repository for minfi

Error in convertArray: all(probes1$ProbeSeqA == probes2$ProbeSeqB) is not TRUE #85

Closed ekarlins closed 7 years ago

ekarlins commented 7 years ago

When I try to convert an EPIC RGSet to a 450k RGSet using minfi 1.20.0 I get the following error:

rgset450k <- convertArray(RGsetEpic, "IlluminaHumanMethylation450k")
[convertArray] Casting as IlluminaHumanMethylation450k
Error: all(probes1$ProbeSeqA == probes2$ProbeSeqB) is not TRUE

It looks like the problem is in lines 303 and 304 of https://github.com/kasperdanielhansen/minfi/blob/master/R/combineArrays.R

The code reads:
stopifnot(all(probes1$ProbeSeqA == probes2$ProbeSeqB))
stopifnot(all(probes1$ProbeSeqB == probes2$ProbeSeqA))

So currently it's:
A == B
B == A

If you change it to:
A == A
B == B

The code runs as expected. Is this a typo?

Thanks! Eric

kasperdanielhansen commented 7 years ago

This code is being tested on the test machines. It also works on my machine:

library(minfiDataEPIC)
rgset450k <- convertArray(RGsetEPIC, "IlluminaHumanMethylation450k")
biocValid()

I suspect your package versions are not in sync. What is the output of

biocValid()
sessionInfo()

My sessionInfo() is

> sessionInfo()
R version 3.3.1 Patched (2016-08-19 r71122)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra (10.12.1)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] IlluminaHumanMethylation450kmanifest_0.4.0
 [2] minfiDataEPIC_1.0.0
 [3] IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
 [4] IlluminaHumanMethylationEPICmanifest_0.3.0
 [5] minfi_1.20.0
 [6] bumphunter_1.14.0
 [7] locfit_1.5-9.1
 [8] iterators_1.0.8
 [9] foreach_1.4.3
[10] Biostrings_2.42.0
[11] XVector_0.14.0
[12] SummarizedExperiment_1.4.0
[13] GenomicRanges_1.26.1
[14] GenomeInfoDb_1.10.1
[15] IRanges_2.8.1
[16] S4Vectors_0.12.0
[17] Biobase_2.34.0
[18] BiocGenerics_0.20.0
[19] BiocInstaller_1.24.0

loaded via a namespace (and not attached):
 [1] mclust_5.2               base64_2.0               Rcpp_0.12.7
 [4] lattice_0.20-33          Rsamtools_1.26.1         digest_0.6.10
 [7] R6_2.2.0                 plyr_1.8.4               chron_2.3-47
[10] RSQLite_1.0.0            httr_1.2.1               zlibbioc_1.20.0
[13] GenomicFeatures_1.26.0   data.table_1.9.6         annotate_1.52.0
[16] Matrix_1.2-6             preprocessCore_1.36.0    splines_3.3.1
[19] BiocParallel_1.8.1       stringr_1.1.0            RCurl_1.95-4.8
[22] biomaRt_2.30.0           rtracklayer_1.34.1       multtest_2.30.0
[25] pkgmaker_0.22            openssl_0.9.5            GEOquery_2.40.0
[28] quadprog_1.5-5           codetools_0.2-14         matrixStats_0.51.0
[31] XML_3.98-1.5             reshape_0.8.6            GenomicAlignments_1.10.0
[34] MASS_7.3-45              bitops_1.0-6             grid_3.3.1
[37] nlme_3.1-128             xtable_1.8-2             registry_0.3
[40] DBI_0.5-1                magrittr_1.5             stringi_1.1.2
[43] genefilter_1.56.0        doRNG_1.6                limma_3.30.2
[46] nor1mix_1.2-2            RColorBrewer_1.1-2       siggenes_1.48.0
[49] tools_3.3.1              illuminaio_0.16.0        rngtools_1.2.4
[52] survival_2.40-1          AnnotationDbi_1.36.0     beanplot_1.2

ekarlins commented 7 years ago

Kasper, I think you are right. The error was from running on our cluster. The function works fine running on my computer.

Below is sessionInfo on our cluster. Thanks!

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.8 (Final)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0
 [2] CGRmeth_1.0
 [3] FlowSorted.Blood.450k_1.12.0
 [4] IlluminaHumanMethylationEPICanno.ilmn10.hg19_0.1.1
 [5] IlluminaHumanMethylationEPICmanifest_0.1.1
 [6] devtools_1.12.0
 [7] IlluminaHumanMethylation450kmanifest_0.4.0
 [8] minfi_1.20.0
 [9] bumphunter_1.14.0
[10] locfit_1.5-9.1
[11] iterators_1.0.8
[12] foreach_1.4.3
[13] Biostrings_2.42.0
[14] XVector_0.14.0
[15] SummarizedExperiment_1.4.0
[16] GenomicRanges_1.26.1
[17] GenomeInfoDb_1.10.1
[18] IRanges_2.8.1
[19] S4Vectors_0.12.0
[20] Biobase_2.34.0
[21] BiocGenerics_0.20.0

loaded via a namespace (and not attached):
 [1] httr_1.2.1               nor1mix_1.2-2            splines_3.3.0
 [4] doRNG_1.6                Rsamtools_1.26.1         RSQLite_1.0.0
 [7] lattice_0.20-34          limma_3.30.2             quadprog_1.5-5
[10] chron_2.3-47             digest_0.6.10            RColorBrewer_1.1-2
[13] preprocessCore_1.36.0    Matrix_1.2-7.1           plyr_1.8.4
[16] GEOquery_2.40.0          siggenes_1.48.0          XML_3.98-1.5
[19] biomaRt_2.30.0           genefilter_1.56.0        zlibbioc_1.20.0
[22] xtable_1.8-2             BiocParallel_1.8.1       openssl_0.9.5
[25] annotate_1.52.0          beanplot_1.2             pkgmaker_0.22
[28] withr_1.0.2              GenomicFeatures_1.26.0   survival_2.40-1
[31] magrittr_1.5             mclust_5.2               memoise_1.0.0
[34] nlme_3.1-128             MASS_7.3-45              tools_3.3.0
[37] registry_0.3             data.table_1.9.6         matrixStats_0.51.0
[40] stringr_1.1.0            rngtools_1.2.4           AnnotationDbi_1.36.0
[43] base64_2.0               grid_3.3.0               RCurl_1.95-4.8
[46] bitops_1.0-6             codetools_0.2-15         multtest_2.30.0
[49] DBI_0.5-1                reshape_0.8.6            roxygen2_5.0.1
[52] R6_2.2.0                 illuminaio_0.16.0        GenomicAlignments_1.10.0
[55] rtracklayer_1.34.1       stringi_1.1.2            Rcpp_0.12.7
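
For anyone comparing two environments in a similar situation, here is a minimal base R sketch that lists packages whose versions differ between two machines. The helper names pkg_versions() and version_mismatches() are hypothetical (they are not part of minfi or Bioconductor), and the toy vectors only copy a few versions from the two sessionInfo() listings in this thread.

```r
# Hypothetical helper: name -> version for every package installed here.
# Run it on each machine and save the result (e.g. with saveRDS()).
pkg_versions <- function() {
  ip <- installed.packages()
  setNames(ip[, "Version"], ip[, "Package"])
}

# Hypothetical helper: packages present in both named character vectors
# whose version strings differ.
version_mismatches <- function(a, b) {
  shared <- intersect(names(a), names(b))
  differ <- shared[a[shared] != b[shared]]
  data.frame(package = differ,
             version_a = unname(a[differ]),
             version_b = unname(b[differ]),
             stringsAsFactors = FALSE)
}

# Toy input: a few versions copied from the two listings in this thread.
laptop  <- c(minfi = "1.20.0",
             IlluminaHumanMethylationEPICmanifest = "0.3.0",
             Matrix = "1.2-6")
cluster <- c(minfi = "1.20.0",
             IlluminaHumanMethylationEPICmanifest = "0.1.1",
             Matrix = "1.2-7.1")

print(version_mismatches(laptop, cluster))
```

Packages that exist on only one machine (such as the two differently named EPIC annotation packages here) are not reported by this comparison; setdiff(names(a), names(b)) would list those.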

kasperdanielhansen commented 7 years ago

Your annotation packages on your cluster are out of date. Importantly, for reasons I don't understand, you have IlluminaHumanMethylationEPICanno.ilmn10.hg19_0.1.1 loaded. You should remove this package from your system. It points to an old, error-filled annotation file released by Illumina. You want annotation(RGsetEPIC) to point to "ilm10b2.hg19" (see the "b2"?).

Best, Kasper
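
A rough way to check a system for the outdated package is sketched below in base R. The two package names come from the sessionInfo() listings in this thread; check_annotation_pkgs() is a hypothetical helper, not a minfi function, and the removal step is left as a printed suggestion rather than run automatically.

```r
# Package names as they appear in the sessionInfo() listings above.
old_pkg <- "IlluminaHumanMethylationEPICanno.ilmn10.hg19"
new_pkg <- "IlluminaHumanMethylationEPICanno.ilm10b2.hg19"

# Hypothetical helper: report whether the old and the current EPIC
# annotation packages are among the installed package names.
check_annotation_pkgs <- function(installed = rownames(installed.packages())) {
  list(old_present = old_pkg %in% installed,
       new_present = new_pkg %in% installed)
}

status <- check_annotation_pkgs()
if (status$old_present) {
  message("Outdated annotation package found: ", old_pkg,
          " (consider remove.packages(\"", old_pkg, "\"))")
}
if (!status$new_present) {
  message("Current annotation package not installed: ", new_pkg)
}

# With minfi loaded, the object itself can also be inspected:
# annotation(RGsetEPIC)
```

Note that removing the package does not change an RGSet that was already built with the old annotation; such an object would need to be recreated or have its annotation updated.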

ekarlins commented 7 years ago

Kasper, Thanks so much for your help! I've found the old code that was adding the old annotation to RGsetEPIC.

To be clear, in case others read this: the minfi function "convertArray" works well. The issue was that I made my RGSet using an old annotation file, which caused the function to fail.

Thank you Kasper for updating minfi so quickly to work with EPIC arrays! It's really impressive!!

best, Eric