hansenlab / minfi

Devel repository for minfi

Error in `preprocessFunnorm` #201

Open kdkorthauer opened 4 years ago

kdkorthauer commented 4 years ago

I am getting the following error in preprocessFunnorm:

Error in preprocessCore::normalize.quantiles.use.target(matrix(crtColumn.reduced), : Supplied target argument should be a numeric vector

A possibly related problem is discussed in this thread: https://groups.google.com/forum/#!msg/epigenomicsforum/7cALK2Ueyg0/V6O_3FGDDgAJ

Additional information:

  1. I'm running on two groups of samples that I expect will have global methylation differences.
  2. If I run preprocessFunnorm on each group separately, I do not get the error
  3. If I run preprocessQuantile on the combined sample set, I do not get an error
  4. I get the same result whether I input the known sex information or leave it as NULL.
  5. I get the same result using release and devel versions of minfi
  6. The specific idat files I'm using are: (1) GEO set GSE67393, and (2) TCGA-KIRC Primary Tumor samples obtained via TCGAbiolinks package. Please let me know if you'd like more info and/or code I'm using to fetch the files.

Thank you!

kasperdanielhansen commented 4 years ago

My guess is that there are NAs in the target argument to normalize.quantiles. Why, I don't know. The best way is probably to share the IDATs with us, perhaps as a Dropbox folder. Or as a very easy to run script.

kdkorthauer commented 4 years ago

Thanks, Kasper. Here are the IDAT files:

https://www.dropbox.com/sh/znw9fq5sc5narb6/AAA2eQOcLtkZc5F06awE8NQGa?dl=0

I haven't tried a smaller subset of IDATs to see if it still produces the error, but I'll investigate that today. Let me know if there's anything else I can provide that would be helpful.

kasperdanielhansen commented 4 years ago

Did you manage to make a smaller example? I have gotten a lot of timeouts on this dropbox link and unfortunately I have to restart the download every time.

In the meantime, you can do two things:

  1. Run preprocessFunnorm with verbose=10 and tell me whether it stops during autosomal or sex chromosome processing.
  2. Run preprocessFunnorm with options(error = browser). This drops you into a browser session when R encounters an error and lets you inspect the value of the target argument to normalize.quantiles.

I suspect that you have either NaN or NA values, but it would be useful to know whether this happens for the autosomes or for a sex chromosome, and if so which one.
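Once inside the browser session, a quick way to tell NA from NaN (and to see whether the object is a numeric vector at all) is a small summary function. The sketch below is base R only; `describe_target` is a hypothetical helper, not part of minfi or preprocessCore, and it assumes the object of interest is available in the browser frame under some name such as `target`.

```r
# Hypothetical helper (not part of minfi or preprocessCore): summarise an
# object that is about to be passed as the quantile-normalization target.
describe_target <- function(target) {
  if (!is.numeric(target)) {
    # A list or character vector would already explain the error message.
    return(list(is_numeric = FALSE, type = typeof(target),
                n_nan = NA_integer_, n_na = NA_integer_))
  }
  list(
    is_numeric = TRUE,
    type       = typeof(target),
    n_nan      = sum(is.nan(target)),                  # NaN only
    n_na       = sum(is.na(target) & !is.nan(target))  # NA but not NaN
  )
}

# Toy input standing in for the real target vector.
toy <- c(0.1, NA, NaN, 0.4)
res <- describe_target(toy)
str(res)

# A list is not a numeric vector, even if every element is numeric.
res_list <- describe_target(list(1, c(2, 3)))
str(res_list)
```

In the browser one would call the helper on whatever object is being passed as the target; the separate NA and NaN counts matter because `is.na()` is TRUE for both.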

kdkorthauer commented 4 years ago

Sorry for the Dropbox trouble. I did try to make a smaller example, but (unfortunately) preprocessFunnorm ran successfully on the smaller subsets.

Another odd success I came across is that when I do not add a DataFrame to the pData slot, preprocessFunnorm completes without error on the full set. So could the error be in the way I'm adding phenotype data to the RGChannelSet object?

Here is the output on the full set (with pData added) with verbose=10:

> grSet <- preprocessFunnorm(rgSet, verbose=10)
[preprocessFunnorm] Background and dye bias correction with noob
[preprocessFunnorm] Mapping to genome
[preprocessFunnorm] Quantile extraction
[preprocessFunnorm] Normalization
[normalizeFunnorm450k] Normalization of the IGrn probes
[normalizeFunnorm450k] Normalization of the IRed probes
[normalizeFunnorm450k] Normalization of the II probes
[normalizeFunnorm450k] Normalization of the X-chromosome
Error in preprocessCore::normalize.quantiles.use.target(matrix(crtColumn.reduced),  : 
  Supplied target argument should be a numeric vector
In addition: Warning message:
In .getSex(CN = CN, xIndex = xIndex, yIndex = yIndex, cutoff = cutoff) :
  An inconsistency was encountered while determining sex. One possibility is that only one sex is present. We recommend further checks, for example with the plotSex function.

alvaannett commented 1 year ago

Hi! Since this is still an open issue I'll comment here. I'm getting this error as well when running preprocessFunnorm(rgSet, verbose=10), both with and without sex specified. As mentioned above, preprocessFunnorm() runs without error if I don't have anything in the pData slot. Not a huge issue to run it without metadata, but kind of strange?

> mSetSq = preprocessFunnorm(rgSet, verbose=10)
[preprocessFunnorm] Background and dye bias correction with noob
[preprocessFunnorm] Mapping to genome
[preprocessFunnorm] Quantile extraction
[preprocessFunnorm] Normalization
[normalizeFunnorm450k] Normalization of the IGrn probes
[normalizeFunnorm450k] Normalization of the IRed probes
[normalizeFunnorm450k] Normalization of the II probes
[normalizeFunnorm450k] Normalization of the X-chromosome
Error in preprocessCore::normalize.quantiles.use.target(matrix(crtColumn.reduced),  : 
  Supplied target argument should be a numeric vector
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Rocky Linux 8.5 (Green Obsidian)

Matrix products: default
BLAS/LAPACK: /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/lib64/libflexiblas.so.3.0

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C               LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8     LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0 IlluminaHumanMethylation450kmanifest_0.4.0         doParallel_1.0.17                                 
 [4] limma_3.50.3                                       RColorBrewer_1.1-3                                 tibble_3.1.7                                      
 [7] dplyr_1.0.9                                        data.table_1.14.2                                  sva_3.40.0                                        
[10] BiocParallel_1.28.3                                genefilter_1.76.0                                  mgcv_1.8-35                                       
[13] nlme_3.1-152                                       minfi_1.38.0                                       bumphunter_1.34.0                                 
[16] locfit_1.5-9.5                                     iterators_1.0.14                                   foreach_1.5.2                                     
[19] Biostrings_2.62.0                                  XVector_0.34.0                                     SummarizedExperiment_1.24.0                       
[22] Biobase_2.54.0                                     MatrixGenerics_1.6.0                               matrixStats_0.62.0                                
[25] GenomicRanges_1.46.1                               GenomeInfoDb_1.30.1                                IRanges_2.28.0                                    
[28] S4Vectors_0.32.4                                   BiocGenerics_0.40.0                               

loaded via a namespace (and not attached):
  [1] rjson_0.2.21              ellipsis_0.3.2            siggenes_1.66.0           rprojroot_2.0.3           mclust_5.4.10             base64_2.0               
  [7] rstudioapi_0.13           bit64_4.0.5               AnnotationDbi_1.56.2      fansi_1.0.3               xml2_1.3.3                codetools_0.2-18         
 [13] splines_4.1.2             sparseMatrixStats_1.4.0   cachem_1.0.6              scrime_1.3.5              knitr_1.39                pkgload_1.2.4            
 [19] Rsamtools_2.10.0          annotate_1.72.0           dbplyr_2.2.1              png_0.1-7                 HDF5Array_1.20.0          readr_2.1.2              
 [25] compiler_4.1.2            httr_1.4.3                assertthat_0.2.1          Matrix_1.3-3              fastmap_1.1.0             cli_3.3.0                
 [31] htmltools_0.5.2           prettyunits_1.1.1         tools_4.1.2               glue_1.6.2                GenomeInfoDbData_1.2.7    rappdirs_0.3.3           
 [37] doRNG_1.8.2               Rcpp_1.0.8.3              vctrs_0.4.1               rhdf5filters_1.6.0        multtest_2.48.0           preprocessCore_1.54.0    
 [43] rtracklayer_1.54.0        DelayedMatrixStats_1.14.0 xfun_0.31                 stringr_1.4.0             brio_1.1.3                testthat_3.1.4           
 [49] lifecycle_1.0.1           restfulr_0.0.15           rngtools_1.5              XML_3.99-0.10             beanplot_1.3.1            edgeR_3.36.0             
 [55] zlibbioc_1.40.0           MASS_7.3-54               hms_1.1.1                 rhdf5_2.38.1              GEOquery_2.60.0           yaml_2.3.5               
 [61] curl_4.3.2                memoise_2.0.1             biomaRt_2.48.0            reshape_0.8.9             stringi_1.7.6             RSQLite_2.2.14           
 [67] desc_1.4.1                BiocIO_1.4.0              GenomicFeatures_1.44.0    filelock_1.0.2            rlang_1.0.4               pkgconfig_2.0.3          
 [73] bitops_1.0-7              nor1mix_1.3-0             evaluate_0.15             lattice_0.20-44           purrr_0.3.4               Rhdf5lib_1.16.0          
 [79] GenomicAlignments_1.30.0  bit_4.0.4                 tidyselect_1.1.2          plyr_1.8.7                magrittr_2.0.3            R6_2.5.1                 
 [85] generics_0.1.3            DelayedArray_0.20.0       DBI_1.1.3                 withr_2.5.0               pillar_1.7.0              survival_3.2-11          
 [91] KEGGREST_1.34.0           RCurl_1.98-1.7            crayon_1.5.1              utf8_1.2.2                BiocFileCache_2.0.0       tzdb_0.3.0               
 [97] rmarkdown_2.14            progress_1.2.2            grid_4.1.2                blob_1.2.3                digest_0.6.29             xtable_1.8-4             
[103] tidyr_1.2.0               illuminaio_0.34.0         openssl_2.0.2             askpass_1.1               quadprog_1.5-8

antgiord commented 1 year ago

I am experiencing the same error when using minfi 1.44.0 on R 4.2.2. Does anyone have any clue about this error?

xyhua2000 commented 7 months ago

Hi! I am experiencing the same error when using minfi 1.48.0 on R 4.3.2. If only four samples are normalized, the run succeeds:

> preprocessFunnorm(RGset[,1:4])
[preprocessFunnorm] Background and dye bias correction with noob
[preprocessFunnorm] Mapping to genome
[preprocessFunnorm] Quantile extraction
[preprocessFunnorm] Normalization
class: GenomicRatioSet
dim: 485512 4
metadata(0):
assays(2): Beta CN
rownames(485512): cg13869341 cg14008030 ... cg08265308 cg14273923
rowData names(0):
colnames: NULL
colData names(3): xMed yMed predictedSex
Annotation
  array: IlluminaHumanMethylation450k
  annotation: ilmn12.hg19
Preprocessing
  Method: NA
  minfi version: NA
  Manifest version: NA
Warning message:
In .getSex(CN = CN, xIndex = xIndex, yIndex = yIndex, cutoff = cutoff) :
  An inconsistency was encountered while determining sex. One possibility is that only one sex is present. We recommend further checks, for example with the plotSex function.

If the number of samples is greater than four, the error occurs:

> preprocessFunnorm(RGset[,1:5])
[preprocessFunnorm] Background and dye bias correction with noob
[preprocessFunnorm] Mapping to genome
[preprocessFunnorm] Quantile extraction
[preprocessFunnorm] Normalization
Error in preprocessCore::normalize.quantiles.use.target(matrix(crtColumn.reduced),  :
  Supplied target argument should be a numeric vector
In addition: Warning message:
In .getSex(CN = CN, xIndex = xIndex, yIndex = yIndex, cutoff = cutoff) :
  An inconsistency was encountered while determining sex. One possibility is that only one sex is present. We recommend further checks, for example with the plotSex function.
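The prefix comparison above (four samples succeed, five fail) can be automated to locate the first sample whose inclusion triggers the error. The following is only a sketch in base R: `first_failing_prefix` and `toy_run` are hypothetical names, and the scan assumes that once a prefix fails, the cause is the sample just added, which may not hold if the failure depends on the combination of samples.

```r
# Hypothetical helper: returns the smallest k for which run(1:k) raises an
# error, or NA if every prefix succeeds.
first_failing_prefix <- function(n_samples, run) {
  for (k in seq_len(n_samples)) {
    ok <- tryCatch({
      run(seq_len(k))
      TRUE
    }, error = function(e) FALSE)
    if (!ok) return(k)
  }
  NA_integer_
}

# Toy stand-in for the real call, failing once more than four samples are
# included. With minfi loaded, one might instead pass something like
#   function(idx) preprocessFunnorm(RGset[, idx])
# (not run here; each call on real data is slow).
toy_run <- function(idx) {
  if (length(idx) > 4) {
    stop("Supplied target argument should be a numeric vector")
  }
  invisible(TRUE)
}

k <- first_failing_prefix(10, toy_run)
print(k)
```

A linear scan is used instead of bisection because it makes no assumption about which prefixes fail beyond the first one; on large sample sets a bisection would need far fewer runs.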

> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /home/xyhua/miniforge3/envs/R4.3.2/lib/libopenblasp-r0.3.21.so; LAPACK version 3.9.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: Asia/Shanghai
tzcode source: system (glibc)

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] methyPre_0.1.0 dplyr_1.1.3 data.table_1.14.8
[4] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1 IlluminaHumanMethylation450kmanifest_0.4.0 minfi_1.48.0
[7] bumphunter_1.44.0 locfit_1.5-9.8 iterators_1.0.14
[10] foreach_1.5.2 Biostrings_2.70.1 XVector_0.42.0
[13] SummarizedExperiment_1.32.0 Biobase_2.62.0 MatrixGenerics_1.14.0
[16] matrixStats_1.1.0 GenomicRanges_1.54.1 GenomeInfoDb_1.38.1
[19] IRanges_2.36.0 S4Vectors_0.40.1 BiocGenerics_0.48.1

loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 rstudioapi_0.15.0 magrittr_2.0.3 GenomicFeatures_1.54.1 BiocIO_1.12.0 zlibbioc_1.48.0
[7] vctrs_0.6.4 multtest_2.58.0 memoise_2.0.1 Rsamtools_2.18.0 DelayedMatrixStats_1.24.0 RCurl_1.98-1.13
[13] askpass_1.2.0 S4Arrays_1.2.0 progress_1.2.2 curl_5.1.0 Rhdf5lib_1.24.0 SparseArray_1.2.2
[19] rhdf5_2.46.0 RPMM_1.25 nor1mix_1.3-2 desc_1.4.2 plyr_1.8.9 cachem_1.0.8
[25] GenomicAlignments_1.38.0 lifecycle_1.0.4 pkgconfig_2.0.3 Matrix_1.6-3 R6_2.5.1 fastmap_1.1.1
[31] GenomeInfoDbData_1.2.11 digest_0.6.33 siggenes_1.76.0 reshape_0.8.9 ps_1.7.5 AnnotationDbi_1.64.1
[37] rprojroot_2.0.4 pkgload_1.3.3 RSQLite_2.3.3 base64_2.0.1 filelock_1.0.2 fansi_1.0.5
[43] httr_1.4.7 abind_1.4-5 compiler_4.3.2 beanplot_1.3.1 remotes_2.4.2.1 rngtools_1.5.2
[49] bit64_4.0.5 withr_2.5.2 BiocParallel_1.36.0 DBI_1.1.3 pkgbuild_1.4.2 HDF5Array_1.30.0
[55] biomaRt_2.58.0 MASS_7.3-60 openssl_2.1.1 rappdirs_0.3.3 DelayedArray_0.28.0 rjson_0.2.21
[61] tools_4.3.2 glue_1.6.2 quadprog_1.5-8 callr_3.7.3 restfulr_0.0.15 nlme_3.1-163
[67] rhdf5filters_1.14.1 grid_4.3.2 cluster_2.1.4 generics_0.1.3 tzdb_0.4.0 preprocessCore_1.64.0
[73] tidyr_1.3.0 hms_1.1.3 xml2_1.3.5 utf8_1.2.4 pillar_1.9.0 stringr_1.5.1
[79] limma_3.58.1 genefilter_1.84.0 splines_4.3.2 BiocFileCache_2.10.1 lattice_0.22-5 survival_3.5-7
[85] rtracklayer_1.62.0 bit_4.0.5 GEOquery_2.70.0 annotate_1.80.0 tidyselect_1.2.0 scrime_1.3.5
[91] statmod_1.5.0 stringi_1.8.1 yaml_2.3.7 codetools_0.2-19 tibble_3.2.1 cli_3.6.1
[97] xtable_1.8-4 processx_3.8.2 Rcpp_1.0.11 dbplyr_2.4.0 png_0.1-8 XML_3.99-0.15
[103] readr_2.1.4 blob_1.2.4 prettyunits_1.2.0 mclust_6.0.1 doRNG_1.8.6 sparseMatrixStats_1.14.0
[109] bitops_1.0-7 illuminaio_0.44.0 purrr_1.0.2 crayon_1.5.2 rlang_1.1.2 KEGGREST_1.42.0

TeoSakel commented 2 months ago

I was also experiencing the same error in minfi_1.49.1 with R version 4.3.2. In my case, the error occurred because the difference between some of the newQuantiles in .normalizeMatrix was below machine precision. Adding check.attributes = FALSE to the equality test fixed the issue for me.
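To illustrate how such a near-zero difference could end up producing a non-numeric target, here is a base R sketch. It is a reconstruction under stated assumptions, not minfi's actual code: it assumes the equality test is an `all.equal` call (the function that takes `check.attributes`), that the two quantile values carry different names, and that the pieces are combined with `sapply`. The names `start`, `end`, `q1` and `q2` are made up for the example.

```r
n <- 500

# Two adjacent quantile values that differ by a few units of machine
# precision and carry different names (an assumption for illustration).
start <- c(q1 = 1)
end   <- c(q2 = 1 + 4 * .Machine$double.eps)

# By default all.equal() also compares attributes, so the differing names
# make it return a message instead of TRUE, although the values agree.
strict  <- isTRUE(all.equal(start, end))
relaxed <- isTRUE(all.equal(start, end, check.attributes = FALSE))

# If the strict test fails, an interpolation with seq() is attempted.
# For endpoints this close, seq() returns only the starting value.
short_piece <- seq(start, end, (end - start) / n)[-(n + 1)]

# Combining pieces of unequal length with sapply() yields a list, and
# as.vector() leaves a list as a list, so the result is not numeric.
target <- sapply(1:2, function(j) if (j == 1) rep(1, n) else short_piece)
target <- as.vector(target)

print(c(strict = strict, relaxed = relaxed))
print(length(short_piece))
print(is.numeric(target))
```

Under these assumptions, relaxing the attribute check makes the near-equal pair take the constant branch, so every piece has the same length and the combined target stays a plain numeric vector.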