hansenlab / bsseq

Devel repository for bsseq
BSmooth with HDF5 realization backend error: Stop worker failed with the error: wrong args for environment subassignment #118

Open Apompetti-Cori opened 1 year ago

Apompetti-Cori commented 1 year ago

Hello,

I'm trying to smooth a bsseq object that I constructed with an HDF5 realization backend, and I am getting the error "Stop worker failed with the error: wrong args for environment subassignment."

The code I ran to smooth the bsseq object is:

bsseq_obj_smooth <- bsseq::BSmooth(
  BSseq = bsseq_obj,
  BPPARAM = BiocParallel::MulticoreParam(workers = 16, progressbar = TRUE),
  verbose = TRUE
)

I read that some combinations of realization backend and parallelization backend are known to cause problems, but as far as I know this combination is not one of them.

PeteHaitch commented 1 year ago

Hi,

I'm on leave until April 26 and will not be dealing with GitHub issues during this time. @kasperdanielhansen may be able to help in the meantime.

Please provide a reproducible example and sessionInfo() to help us help you.

Thanks, Pete

Apompetti-Cori commented 1 year ago

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] rhdf5_2.42.0                here_1.0.1                 
 [3] plotly_4.10.1               PCAtools_2.10.0            
 [5] ggrepel_0.9.3               lubridate_1.9.2            
 [7] forcats_1.0.0               stringr_1.5.0              
 [9] dplyr_1.1.1                 purrr_1.0.1                
[11] readr_2.1.4                 tidyr_1.3.0                
[13] tibble_3.2.1                ggplot2_3.4.1              
[15] tidyverse_2.0.0             MethylResolver_0.1.0       
[17] methylSig_1.10.0            bsseq_1.34.0               
[19] SummarizedExperiment_1.28.0 Biobase_2.58.0             
[21] MatrixGenerics_1.10.0       matrixStats_0.63.0         
[23] GenomicRanges_1.50.2        GenomeInfoDb_1.34.9        
[25] IRanges_2.32.0              S4Vectors_0.36.2           
[27] BiocGenerics_0.44.0        

loaded via a namespace (and not attached):
 [1] bitops_1.0-7              DSS_2.46.0               
 [3] httr_1.4.5                doParallel_1.0.17        
 [5] rprojroot_2.0.3           tools_4.2.1              
 [7] job_0.3.0                 irlba_2.3.5.1            
 [9] utf8_1.2.3                R6_2.5.1                 
[11] HDF5Array_1.26.0          lazyeval_0.2.2           
[13] colorspace_2.1-0          permute_0.9-7            
[15] rhdf5filters_1.10.0       withr_2.5.0              
[17] tidyselect_1.2.0          compiler_4.2.1           
[19] cli_3.6.1                 DelayedArray_0.24.0      
[21] rtracklayer_1.58.0        scales_1.2.1             
[23] DEoptimR_1.0-11           robustbase_0.95-1        
[25] randomForest_4.7-1.1      digest_0.6.31            
[27] Rsamtools_2.14.0          R.utils_2.12.2           
[29] XVector_0.38.0            htmltools_0.5.5          
[31] pkgconfig_2.0.3           sparseMatrixStats_1.10.0 
[33] fastmap_1.1.1             limma_3.54.1             
[35] BSgenome_1.66.3           htmlwidgets_1.6.2        
[37] rlang_1.1.0               rstudioapi_0.14          
[39] DelayedMatrixStats_1.20.0 BiocIO_1.8.0             
[41] generics_0.1.3            jsonlite_1.8.4           
[43] BiocParallel_1.32.5       gtools_3.9.4             
[45] R.oo_1.25.0               BiocSingular_1.14.0      
[47] RCurl_1.98-1.12           magrittr_2.0.3           
[49] GenomeInfoDbData_1.2.9    Matrix_1.5-3             
[51] Rcpp_1.0.10               munsell_0.5.0            
[53] Rhdf5lib_1.20.0           fansi_1.0.4              
[55] lifecycle_1.0.3           R.methodsS3_1.8.2        
[57] stringi_1.7.12            Metrics_0.1.4            
[59] yaml_2.3.7                zlibbioc_1.44.0          
[61] plyr_1.8.8                grid_4.2.1               
[63] dqrng_0.3.0               parallel_4.2.1           
[65] crayon_1.5.2              doSNOW_1.0.20            
[67] lattice_0.20-45           beachmat_2.14.0          
[69] cowplot_1.1.1             Biostrings_2.66.0        
[71] splines_4.2.1             hms_1.1.3                
[73] locfit_1.5-9.7            pillar_1.9.0             
[75] varhandle_2.0.5           rjson_0.2.21             
[77] reshape2_1.4.4            ScaledMatrix_1.6.0       
[79] codetools_0.2-19          XML_3.99-0.14            
[81] glue_1.6.2                data.table_1.14.8        
[83] vctrs_0.6.1               tzdb_0.3.0               
[85] foreach_1.5.2             gtable_0.3.3             
[87] rsvd_1.0.5                restfulr_0.0.15          
[89] viridisLite_0.4.1         snow_0.4-4               
[91] iterators_1.0.14          GenomicAlignments_1.34.0 
[93] timechange_0.2.0

PeteHaitch commented 1 year ago

And a reproducible example?

Apompetti-Cori commented 1 year ago

Hmm, I guess that'd be tough without sending you my .cov files. I did end up getting it to work using SerialParam. Should I figure out a way to send my files?

PeteHaitch commented 1 year ago

If possible, that'd be helpful. You'd only need to share the BSseq object (or a subset of it that is sufficient to reproduce the issue) rather than the .cov files themselves.
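
One way to prepare such a subset might look like the sketch below. It assumes `bsseq_obj` is the HDF5-backed object from the original post; `head_loci` and the directory name `bsseq_small_h5` are hypothetical names chosen for illustration, and the number of loci needed to trigger the error is unknown.

```r
# Hypothetical helper: keep only the first `n` loci (rows) of any object
# that supports two-dimensional `[` subsetting, as BSseq objects do.
head_loci <- function(x, n = 10000L) {
  x[seq_len(min(n, nrow(x))), ]
}

# Assumption: `bsseq_obj` exists in the session and HDF5Array is installed.
if (exists("bsseq_obj") && requireNamespace("HDF5Array", quietly = TRUE)) {
  bsseq_small <- head_loci(bsseq_obj)
  # Writes the assay data and the object itself into one directory
  # that can be zipped and shared.
  HDF5Array::saveHDF5SummarizedExperiment(
    bsseq_small,
    dir = "bsseq_small_h5",
    replace = TRUE
  )
}
```

The saved directory should be reloadable with `HDF5Array::loadHDF5SummarizedExperiment("bsseq_small_h5")`. It is worth confirming that the subset still reproduces the error before sharing it.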

How many cores does your machine have? Can you try running it with BiocParallel::MulticoreParam(workers = k, progressbar = TRUE) while varying k (e.g., k=1, k=2, k=4, k=8, k=16) to see whether it works for smaller k?
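
A rough sketch of that experiment follows. It assumes `bsseq_obj` is the object from the original post (ideally a small subset, since each run repeats the full smoothing); `try_worker_counts` is a hypothetical helper, not part of bsseq or BiocParallel. It records the outcome for each k instead of stopping at the first failure.

```r
# Hypothetical helper: call `smooth_fun(k)` for each worker count and record
# whether it succeeded, capturing the error message instead of stopping.
try_worker_counts <- function(ks, smooth_fun) {
  results <- lapply(ks, function(k) {
    msg <- tryCatch({
      smooth_fun(k)
      NA_character_              # no error raised
    }, error = function(e) conditionMessage(e))
    data.frame(workers = k, ok = is.na(msg), error = msg,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, results)
}

# Assumption: `bsseq_obj` exists in the session and bsseq is installed.
if (exists("bsseq_obj") && requireNamespace("bsseq", quietly = TRUE)) {
  cat("Cores detected:", parallel::detectCores(), "\n")
  outcome <- try_worker_counts(c(1, 2, 4, 8, 16), function(k) {
    bsseq::BSmooth(
      BSseq = bsseq_obj,
      BPPARAM = BiocParallel::MulticoreParam(workers = k, progressbar = TRUE),
      verbose = TRUE
    )
  })
  print(outcome)
}
```

The resulting table shows which worker counts fail and with what message, which helps narrow down whether the problem depends on the number of workers.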