Bioconductor / BiocParallel

Bioconductor facilities for parallel evaluation
https://bioconductor.org/packages/BiocParallel

BiocParallel first remote error: cannot open the connection #177

Begotalavera closed this issue 2 years ago

Begotalavera commented 2 years ago

Hi, I am using patRoon and therefore Bioconductor packages to analyze my LC-MS data. I am currently getting this error:

fGroups <- groupFeatures(fList, "xcms3", rtalign = TRUE,
                      groupParam = xcms::PeakDensityParam(sampleGroups = 
                      analysisInfo(fList)$group, bw = 10, minFraction = 0.5, 
                      minSamples = 1, binSize = 0.01, maxFeatures = 50), 
                      retAlignParam = xcms::ObiwarpParam(binSize = 1,
                      centerSample = 10, response = 100, distFun = "cor_opt",
                      gapInit = 0.3, gapExtend = 2.4, factorDiag = 2, 
                      factorGap = 1, localAlignment = FALSE, 
                      initPenalty = 0))
Grouping features with XCMS...

Performing retention time alignment...
Sample number 10 used as center sample.
Aligning QC_14_POS.mzML against 908_POS.mzML ... OK
Aligning QC_15_POS.mzML against 908_POS.mzML ... OK
Aligning QC_16_POS.mzML against 908_POS.mzML ... 
Error: BiocParallel errors
  3 remote errors, element index: 4, 9, 12
  8 unevaluated and other errors
  first remote error: cannot open the connection

Any ideas??

Thank you!!

Begotalavera commented 2 years ago

sessionInfo()

R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
system code page: 65001

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] xcms_3.16.1 MSnbase_2.20.1 ProtGenerics_1.26.0 S4Vectors_0.32.3 mzR_2.28.0 Biobase_2.54.0 BiocGenerics_0.40.0 BiocParallel_1.28.2
[9] patRoon_1.2.0 Rcpp_1.0.7

loaded via a namespace (and not attached):
[1] bitops_1.0-7 matrixStats_0.61.0 bit64_4.0.5 doParallel_1.0.16 RColorBrewer_1.1-2
[6] httr_1.4.2 GenomeInfoDb_1.30.0 backports_1.4.0 tools_4.1.2 utf8_1.2.2
[11] R6_2.5.1 affyio_1.64.0 DBI_1.1.1 colorspace_2.0-2 dat_0.5.0
[16] withr_2.4.3 tidyselect_1.1.1 bit_4.0.4 compiler_4.1.2 MassSpecWavelet_1.60.0
[21] preprocessCore_1.56.0 graph_1.72.0 DelayedArray_0.20.0 checkmate_2.0.0 scales_1.1.1
[26] DEoptimR_1.0-9 robustbase_0.93-9 affy_1.72.0 digest_0.6.29 XVector_0.34.0
[31] htmltools_0.5.2 pkgconfig_2.0.3 fst_0.9.4 MatrixGenerics_1.6.0 fastmap_1.1.0
[36] itertools_0.1-3 limma_3.50.0 rlang_0.4.12 RSQLite_2.2.9 impute_1.68.0
[41] shiny_1.7.1 generics_0.1.1 mzID_1.32.0 aoos_0.5.0 dplyr_1.0.7
[46] RCurl_1.98-1.5 magrittr_2.0.1 GenomeInfoDbData_1.2.7 Formula_1.2-4 MALDIquant_1.20
[51] Matrix_1.3-4 munsell_0.5.0 fansi_0.5.0 MsCoreUtils_1.6.0 logger_0.2.2
[56] lifecycle_1.0.1 vsn_3.62.0 yaml_2.2.1 MASS_7.3-54 SummarizedExperiment_1.24.0
[61] zlibbioc_1.40.0 plyr_1.8.6 blob_1.2.2 grid_4.1.2 promises_1.2.0.1
[66] parallel_4.1.2 crayon_1.4.2 lattice_0.20-45 rcdklibs_2.3 MsFeatures_1.2.0
[71] pillar_1.6.4 GenomicRanges_1.46.1 rjson_0.2.20 codetools_0.2-18 rcdk_3.6.0
[76] XML_3.99-0.8 glue_1.5.1 pcaMethods_1.86.0 data.table_1.14.2 BiocManager_1.30.16
[81] Rdpack_2.1.3 httpuv_1.6.3 png_0.1-7 vctrs_0.3.8 RMassBank_3.4.0
[86] foreach_1.5.1 gtable_0.3.0 RANN_2.6.1 purrr_0.3.4 clue_0.3-60
[91] assertthat_0.2.1 cachem_1.0.6 ggplot2_3.3.5 rbibutils_2.2.5 mime_0.12
[96] xtable_1.8-4 later_1.3.0 ncdf4_1.18 snow_0.4-4 tibble_3.1.6
[101] rJava_1.0-5 iterators_1.0.13 fingerprint_3.5.7 memoise_2.0.1 IRanges_2.28.0
[106] cluster_2.1.2 ellipsis_0.3.2

Jiefei-Wang commented 2 years ago

Hi, This looks like a missing file issue. Please submit it to patRoon as we can only solve the problem of BiocParallel itself.

Best, Jiefei
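
To illustrate the missing-file diagnosis, here is a minimal sketch (not patRoon or xcms code) of how a file that cannot be opened on a worker surfaces as a BiocParallel error, and how to find out which elements failed. The file paths and the helper `read_first()` are hypothetical, and `SerialParam(stop.on.error = FALSE)` is used only so that every element is attempted.

```r
library(BiocParallel)

## Hypothetical inputs: three paths, the second of which is never created.
paths <- c(tempfile(), tempfile(), tempfile())
writeLines("ok", paths[1])
writeLines("ok", paths[3])

## Hypothetical helper: open the file, read one line, close the connection.
read_first <- function(p) {
    con <- file(p, open = "r")   # fails if the file does not exist
    on.exit(close(con))
    readLines(con, n = 1)
}

## bptry() returns the partial results instead of stopping with
## "Error: BiocParallel errors"; failed elements are condition objects.
## A warning about the missing file is expected here.
res <- bptry(bplapply(paths, read_first,
                      BPPARAM = SerialParam(stop.on.error = FALSE)))

failed <- which(!bpok(res))      # the "element index" from the error summary
failed
conditionMessage(res[[failed[1]]])
```

The indices in `failed` correspond to the "element index" list in the error summary, so they can be mapped back to the sample files that could not be opened.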

mtmorgan commented 2 years ago

The developers might need to register(SnowParam()) or use a SnowParam() as default, to reproduce this error on non-windows.
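
A minimal sketch of what that could look like on Linux or macOS, assuming BiocParallel is installed. The worker count of 2 is an arbitrary choice, and the failing call itself is only indicated by a comment.

```r
library(BiocParallel)

## Remember the current default back-end so it can be restored afterwards.
old <- bpparam()

## SnowParam starts separate R processes that communicate over sockets,
## which is the kind of back-end used by default on Windows.
register(SnowParam(workers = 2), default = TRUE)
is(bpparam(), "SnowParam")

## ... run the failing groupFeatures() / featureSpectra() call here ...

## Restore the previous default.
register(old, default = TRUE)
```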

Begotalavera commented 2 years ago

Thank you for the answer, the issue has already been filed with "patRoon".

MuyaoXi9271 commented 2 years ago

Hi all experts,

I also get the same error sometimes when I extract MS2 data using "featureSpectra". Sometimes the error goes away when I restart R or reboot the computer, but now it keeps coming back no matter whether I reboot the computer or restart RStudio. I also tried the "register(SnowParam())" solution. I have 20 samples in total.

Error: BiocParallel errors
  2 remote errors, element index: 2, 6
  3 unevaluated and other errors
  first remote error: cannot open the connection
In addition: Warning messages:
1: In serialize(data, node$con) :
  'package:stats' may not be available when loading
2: In serialize(data, node$con) :
  'package:stats' may not be available when loading
3: In serialize(data, node$con) :
  'package:stats' may not be available when loading
4: In serialize(data, node$con) :
  'package:stats' may not be available when loading
5: In serialize(data, node$con) :
  'package:stats' may not be available when loading
6: In serialize(data, node$con) :
  'package:stats' may not be available when loading
7: stop worker failed:
  wrong args for environment subassignment

Bests, Muyao

Begotalavera commented 2 years ago

Hi, I tried with parallelization disabled and I didn't get the error anymore.

Maybe you could try..

BiocParallel::register(BiocParallel::SerialParam(), default = TRUE)

Best Bego

MuyaoXi9271 commented 2 years ago

Hi Bego,

Thanks, it works. But I seem to get the wrong result. [image]

The second result was obtained from the last one with the normal settings, and the first result was obtained from the third one using the code you suggested. The result is about 10 times lower.

Bests, Muyao

mtmorgan commented 2 years ago

This really sounds like a bug in the package you are using -- the key is for whatever the 'connection' is to be opened on the worker, rather than passed from the manager to the worker. Have you opened an issue with whatever package featureSpectra() / groupFeatures() are defined in, perhaps referencing this issue?

MuyaoXi9271 commented 2 years ago

I will do that now. Thanks for taking care of this issue. By the way, may I ask a very basic question? Why was it designed so that the connection is passed from the manager to the worker, instead of being opened on the worker directly?

Bests, Muyao

mtmorgan commented 2 years ago

I don't know anything about the implementation of the package you are using; I am guessing that it is a consequence of the package author's decisions that you see this error. The difference is roughly

library(BiocParallel)
## 'fl' is the path to a text file
## WRONG! connection opened on the manager, then passed to the worker
con = file(fl); open(con)
result1 = bplapply(1:10, function(i, con) readLines(con, n = 1), con)
## Better! file path passed to the worker & opened (and closed) there
result2 = bplapply(1:10, function(i, fl) {
    con = file(fl); open(con); on.exit(close(con))
    readLines(con, n = 1)
}, fl)
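
For anyone who wants to try the second pattern end to end, here is a self-contained sketch. It assumes BiocParallel is installed; the temporary file, the helper name `read_nth()` and the choice of two SnowParam workers are illustrative only and do not come from patRoon or xcms.

```r
library(BiocParallel)

## Hypothetical input: a small temporary text file standing in for real data.
fl <- tempfile(fileext = ".txt")
writeLines(paste("line", 1:10), fl)

## Worker-side function: receives only the path, opens its own connection,
## and closes it again when it is done.
read_nth <- function(i, fl) {
    con <- file(fl, open = "r")
    on.exit(close(con))
    readLines(con, n = i)[i]     # i-th line of the file
}

## Separate R processes, similar to the default back-end on Windows.
param <- SnowParam(workers = 2)

## bptry() keeps partial results if some tasks fail.
res <- bptry(bplapply(1:10, read_nth, fl = fl, BPPARAM = param))

ok <- bpok(res)                  # TRUE for every task that succeeded
all(ok)
unlist(res[ok])
```

Only a character string (the path) is sent to the workers, so nothing depends on state that exists solely in the manager process. The path itself still has to be readable from every worker.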