gustaveroussy / EaCoN

Easy Copy Number !
MIT License

R 4.0.3 error in CS.Process #23

Closed: ChristopherEeles closed this issue 3 years ago

ChristopherEeles commented 3 years ago

Hello package maintainers,

I have a persistent error in CS.Process. I tried to debug it in RStudio, but there I get a different error in oschp.load, stating that the HDF5 header is not valid in the call to rhdf5::h5read. I have tried running the function on different .CEL files in case it was a data corruption issue, but I get the same error each time.

> CS.Process(list.files('rawdata', '.*CEL', full.names=TRUE)[1], samplename='LMS90')
 [bhklab:664503] Decompressing rawdata/LMS_90-422_92-543.CEL ...
 [bhklab:664503] OS is reported as linux
 [bhklab:664503] Running APT ...

 [bhklab:664503] Renaming OSCHP ...
 [bhklab:664503] Removing temporary files ...
 [bhklab:664503] Done.
 [bhklab:664503] Loading BSgenome.Hsapiens.UCSC.hg19 ...
 [bhklab:664503] Normalizing SNP data (using rcnorm) ...
 [bhklab:664503] Reading CEL ...
 [bhklab:664503] Loading annotations ...
 [bhklab:664503] Loading chromosomal information ...
 [bhklab:664503] Loading BSgenome.Hsapiens.UCSC.hg19 ...
 [bhklab:664503] Building data structure ...
 [bhklab:664503] Computing raw BAF ...
 [bhklab:664503] Normalizing BAF ...
 [bhklab:664503] Wave re-normalization ...
 [bhklab:664503] Init (100.748933581353)
 [bhklab:664503]  Positive fit with GSE54504 (99.5298298938364)
 [bhklab:664503]  Positive fit with GSE53799 (99.4230744279665)
 [bhklab:664503] GC renormalization ...
 [bhklab:664503] Init (99.4138290154934)
 [bhklab:664503]  Positive fit with GC6400 (99.3833250417165)
 [bhklab:664503]  Positive fit with GC200 (99.1533512501324)
 [bhklab:664503]  Positive fit with GC50 (98.9689844776831)
Error in ao.df$germ[ao.df$germ %in% c(8, 11)] <- 0 : 
  incompatible types (from double to raw) in subassignment type fix
In addition: Warning messages:
1:   Using providerVersion() on a BSgenome object is deprecated. Please use
  'metadata(x)$genome' instead. 
2:   Using providerVersion() on a BSgenome object is deprecated. Please use
  'metadata(x)$genome' instead. 
3: `as.tbl()` is deprecated as of dplyr 1.0.0.
Please use `tibble::as_tibble()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
4:   Using providerVersion() on a BSgenome object is deprecated. Please use
  'metadata(x)$genome' instead. 
5: In log(rcmat[, 1] + 1/6) : NaNs produced
6: In log(rcmat[, 2] + 1/6) : NaNs produced

sessionInfo:

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS/LAPACK: /home/bioinf/miniconda3/envs/affymetrix_cnv/lib/libopenblasp-r0.3.12.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices datasets  utils    
[8] methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.3 BSgenome_1.58.0                  
 [3] rtracklayer_1.50.0                Biostrings_2.58.0                
 [5] XVector_0.30.0                    GenomicRanges_1.42.0             
 [7] GenomeInfoDb_1.26.2               IRanges_2.24.0                   
 [9] S4Vectors_0.28.0                  BiocGenerics_0.36.0              
[11] CytoScanHD.Array.na33.r4_0.1.0    apt.cytoscan.2.4.0_0.1.6         
[13] EaCoN_0.3.5                      

loaded via a namespace (and not attached):
 [1] mclust_5.4.7                ASCAT_2.5.2                
 [3] lattice_0.20-41             prettyunits_1.1.1          
 [5] Rsamtools_2.6.0             zoo_1.8-8                  
 [7] digest_0.6.27               foreach_1.5.1              
 [9] R6_2.5.0                    aroma.light_3.20.0         
[11] pillar_1.4.7                progress_1.2.2             
[13] zlibbioc_1.36.0             rlang_0.4.9                
[15] data.table_1.13.4           R.utils_2.10.1             
[17] R.oo_1.24.0                 Matrix_1.2-18              
[19] DT_0.16                     BiocParallel_1.24.1        
[21] htmlwidgets_1.5.2           RCurl_1.98-1.2             
[23] DelayedArray_0.16.0         compiler_4.0.3             
[25] pkgconfig_2.0.3             htmltools_0.5.0            
[27] tidyselect_1.1.0            SummarizedExperiment_1.20.0
[29] tibble_3.0.4                GenomeInfoDbData_1.2.4     
[31] codetools_0.2-18            matrixStats_0.57.0         
[33] XML_3.99-0.5                changepoint_2.2.2          
[35] crayon_1.3.4                dplyr_1.0.2                
[37] MASS_7.3-53                 GenomicAlignments_1.26.0   
[39] bitops_1.0-6                rhdf5filters_1.2.0         
[41] R.methodsS3_1.8.1           grid_4.0.3                 
[43] lifecycle_0.2.0             magrittr_2.0.1             
[45] renv_0.12.3                 doParallel_1.0.16          
[47] rcnorm_0.1.5                limma_3.46.0               
[49] seqinr_4.2-4                ellipsis_0.3.1             
[51] generics_0.1.0              vctrs_0.3.5                
[53] Rhdf5lib_1.12.0             RColorBrewer_1.1-2         
[55] iterators_1.0.13            tools_4.0.3                
[57] iotools_0.3-1               ade4_1.7-16                
[59] Biobase_2.50.0              glue_1.4.2                 
[61] purrr_0.3.4                 hms_0.5.3                  
[63] MatrixGenerics_1.2.0        rhdf5_2.34.0               
[65] BiocManager_1.30.10         affxparser_1.62.0   

I have also attached the log file from the function run. I am going to try again with R 3.6 and see if the error persists.

Best,
Christopher Eeles
Software Developer
Benjamin Haibe-Kains Lab
Princess Margaret Cancer Centre

Attachment: EaCoN_CS.Process_log_LMS90.txt

ChristopherEeles commented 3 years ago

I have installed the package in R 3.6.3 and the error goes away. It seems the package needs to be updated to work with R >= 4.0.

aoumess commented 3 years ago

Hi Christopher ! Thanks a lot for your investigation. I'm short of time for the coming days, but I'll do my best to make EaCoN compliant with R4 during the winter holidays.

Cheers

ShenWei-wei commented 3 years ago

Hi

I have the same problem with Affymetrix OncoScan / OncoScan_CNV data when using OS.Process with R 4.0.3. I changed the original script apt_oncoscan_process.R as follows:

  #germ[germ %in% c(8,11)] <- 0
  #germ[germ !=0 ] <- 1
  germ[germ %in% as.raw(c(8, 11))] <- as.raw(0)
  germ[germ !=0 ] <- as.raw(1)

After reinstalling EaCoN, it works without error. However, I don't know what c(8, 11) means, so although the program runs, I am not sure the change preserves the intent of the original code. Did I revise it properly?
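
For readers following along, the type problem and the workaround above can be reproduced in isolation. This is only a minimal sketch, assuming `germ` is a raw vector (as the "from double to raw" error message suggests); the sample values are made up for illustration, and what the codes 8 and 11 mean is not covered here.

```r
# Hypothetical stand-in for the `germ` column; real values come from the array data.
germ <- as.raw(c(0, 8, 11, 3))

# Original form: the right-hand side is a double. On R 4.0.3 the thread reports
# that this assignment fails with "incompatible types (from double to raw)".
# tryCatch captures the error message instead of stopping the script.
original_result <- tryCatch({
  bad <- germ
  bad[bad %in% c(8, 11)] <- 0
  "no error"
}, error = function(e) conditionMessage(e))

# Patched form: every constant is raw, so no type conversion is needed.
fixed <- germ
fixed[fixed %in% as.raw(c(8, 11))] <- as.raw(0)
fixed[fixed != as.raw(0)] <- as.raw(1)

print(original_result)
print(fixed)
```

In this sketch the patched form recodes 8 and 11 to 0 and every other non-zero code to 1, while keeping the vector's raw type throughout.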

aoumess commented 3 years ago

Hi,

Thanks for your workaround. I'm still a bit short of time right now (and actually just very recently switched from R3 to R4), but in the next 10 days I'll correct this problem thanks to the help of both of you.

Kind regards

aoumess commented 3 years ago

Hi,

I added ShenWei-wei's suggestion as a patch in release 0.3.6 (https://github.com/gustaveroussy/EaCoN/releases/tag/0.3.6). Thanks again :)

Shamik23 commented 3 years ago

Hi @aoumess, the latest fix for R 4.0 does not work in R 3.6. I had to install the previous version to get CS.Process working. Also, the link to the wave normalization file for hg38 is broken. Could you please re-upload the file? Thanks in advance!

aoumess commented 3 years ago

Hi ! Thanks for bringing this problem up. I pushed a fix that evaluates the R major version before applying @ShenWei-wei's fix. I hope this does the trick. Could you confirm, please ? Just use the latest master version (not the latest release tag). As a side note, I'm currently checking and renewing the broken dependency links. Cheers
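
A version-gated recoding of the kind described here might look roughly like the sketch below. It is not the code from the EaCoN repository: the function name `recode_germ` and its `r_major` argument are hypothetical, and the sketch assumes that `germ` is raw under R 4 and numeric under R 3, which is what the reports in this thread suggest.

```r
# Hypothetical helper: choose the recoding path from the R major version.
# `r_major` defaults to the running R version; it is exposed as an argument
# only so that both branches can be exercised from a single session.
recode_germ <- function(germ, r_major = as.integer(R.version$major)) {
  if (r_major >= 4L) {
    # R 4 path: raw constants, as in the workaround posted earlier
    germ[germ %in% as.raw(c(8, 11))] <- as.raw(0)
    germ[germ != as.raw(0)] <- as.raw(1)
  } else {
    # R 3 path: the original numeric recoding
    germ[germ %in% c(8, 11)] <- 0
    germ[germ != 0] <- 1
  }
  germ
}

# Usage with made-up values, one call per branch
r4_result <- recode_germ(as.raw(c(0, 8, 11, 3)), r_major = 4L)
r3_result <- recode_germ(c(0, 8, 11, 3), r_major = 3L)
```

Branching on `is.raw(germ)` instead of the R version would be an alternative design, since it reacts to the actual type of the data rather than to the interpreter.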

ChristopherEeles commented 3 years ago

Hi @aoumess,

Tried to update my current Snakemake pipeline to use R 4.0.5 today, but the installation of the Cytoscan HD design failed, claiming it is unavailable for my R version.

I tried to install using:

install.packages("https://nextcloud.gustaveroussy.fr/s/FHRnT99A2kLJk6p/download", repos = NULL, type = "source")

Thanks for your assistance.

Best, Chris

aoumess commented 3 years ago

Hi,

All non-canonical dependencies (i.e., those not on CRAN, Bioconductor, or GitHub) have been moved to Zenodo.org, which should be reliable enough !