futureverse / future

:rocket: R package: future: Unified Parallel and Distributed Processing in R for Everyone
https://future.futureverse.org

‘error reading from connection’. Post-mortem diagnostic: No process exists with this PID, i.e. the localhost worker is no longer alive. #704

Closed: anjelinejeline closed this issue 11 months ago

anjelinejeline commented 11 months ago

Hello, I have trouble parallelizing my code. This is the error I got:

Error in unserialize(node$con) :
MultisessionFuture (doFuture2-1) failed to receive message results from cluster RichSOCKnode #1 (PID 161140 on localhost ‘localhost’). The reason reported was ‘error reading from connection’. Post-mortem diagnostic: No process exists with this PID, i.e. the localhost worker is no longer alive.

It seems that this has already occurred before, see https://github.com/HenrikBengtsson/future/issues/474. I tried both multicore and multisession, but neither of them works.
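A stripped-down sketch of the kind of setup I mean (the real workload is a much larger spatial analysis; the loop body and worker count below are only placeholders):

library(doFuture)    # attaches future and foreach as well
library(progressr)

plan(multisession, workers = 2)   # background R sessions on the local machine

res <- foreach(i = 1:10) %dofuture% {
  sqrt(i)    # placeholder for the real per-iteration work
}

plan(sequential)   # shut the workers down again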

Can you help me to sort this out?

Thank you Angela

My session info

sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_IE.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_IE.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C

time zone: Europe/Rome
tzcode source: system (glibc)

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] progressr_0.14.0 doFuture_1.0.0 future_1.33.0 foreach_1.5.2 glue_1.6.2 sfheaders_0.4.3
[7] maptools_1.1-8 sp_1.6-0 spatstat_3.0-7 spatstat.linnet_3.1-3 spatstat.model_3.2-8 rpart_4.1.21
[13] spatstat.explore_3.2-5 nlme_3.1-163 spatstat.random_3.2-2 spatstat.geom_3.2-7 spatstat.data_3.0-3 sf_1.0-14
[19] ggplot2_3.4.4 tmap_3.3-4 dplyr_1.1.4 purrr_1.0.2

loaded via a namespace (and not attached): [1] tidyselect_1.2.0 viridisLite_0.4.2 farver_2.1.1 fastmap_1.1.1 leaflet_2.2.1 XML_3.99-0.15 digest_0.6.33
[8] lifecycle_1.0.4 terra_1.7-55 magrittr_2.0.3 compiler_4.3.2 progress_1.2.2 rlang_1.1.2 tools_4.3.2
[15] utf8_1.2.4 prettyunits_1.2.0 htmlwidgets_1.6.2 classInt_0.4-10 RColorBrewer_1.1-3 abind_1.4-5 KernSmooth_2.23-22
[22] withr_2.5.2 foreign_0.8-85 leafsync_0.1.0 grid_4.3.2 polyclip_1.10-6 fansi_1.0.5 e1071_1.7-13
[29] leafem_0.2.3 colorspace_2.1-0 globals_0.16.2 scales_1.2.1 iterators_1.0.14 spatstat.utils_3.0-4 dichromat_2.0-0.1
[36] cli_3.6.1 crayon_1.5.2 generics_0.1.3 rstudioapi_0.15.0 future.apply_1.11.0 tmaptools_3.1-1 DBI_1.1.3
[43] proxy_0.4-27 splines_4.3.2 stars_0.6-4 parallel_4.3.2 base64enc_0.1-3 vctrs_0.6.4 Matrix_1.6-3
[50] hms_1.1.3 tensor_1.5 listenv_0.8.0 crosstalk_1.2.0 units_0.8-4 goftest_1.2-3 parallelly_1.36.0
[57] lwgeom_0.2-13 codetools_0.2-19 gtable_0.3.4 deldir_1.0-9 raster_3.6-26 munsell_0.5.0 tibble_3.2.1
[64] pillar_1.9.0 htmltools_0.5.7 R6_2.5.1 lattice_0.22-5 png_0.1-7 class_7.3-22 Rcpp_1.0.11
[71] spatstat.sparse_3.0-3 mgcv_1.9-0 pkgconfig_2.0.3 …

anjelinejeline commented 11 months ago

I also found this answer: https://stackoverflow.com/questions/72942390/error-receiving-results-from-r-future-running-on-rstudio-server-on-slurm

I have run the code from the terminal, so I guess the RStudio version is not the problem, but for information the version is RStudio 2023.06.2 Build 561.

With regard to the size of the exported globals: the error message says "The total size of the 9 globals exported is 1.01 MiB." Is this size a problem?
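For reference, 1.01 MiB is far below future's default cap on the total size of exported globals (roughly 500 MiB), so the size by itself should not be what kills the worker. The limit can be inspected and, if ever needed, raised via the future.globals.maxSize option; a minimal sketch (the 1e9 value is just an example):

getOption("future.globals.maxSize")    # NULL means future's built-in default (about 500 MiB) applies
options(future.globals.maxSize = 1e9)  # example: allow up to ~1 GB of globals per future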

The last part of the error output was this warning, followed by a segfault:

In addition: Warning message:
In mccollect(jobs = jobs, wait = TRUE) : 1 parallel job did not deliver a result
Execution halted

caught segfault
address 0x562a84ec784a, cause 'memory not mapped'
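One check that might help narrow down the segfault (assuming a loop like the sketch above) is to run the exact same code with plan(sequential): everything then runs in the main R session, so if the crash comes from compiled code used by the workload (e.g. one of the spatial packages), it shows up directly there rather than as a dead worker:

plan(sequential)   # no background workers; run the workload in the current session
res <- foreach(i = 1:10) %dofuture% {
  sqrt(i)          # placeholder for the real per-iteration work
}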

HenrikBengtsson commented 11 months ago

Hello, I won't be able to respond to this right now, but I'll migrate this issue to GitHub Discussions, because this is most likely not a bug.