run_nl_all() freezes without an error message

To whom it may concern,

Thanks for the awesome package. Would you please help me with an issue of hanging runs?

1 problem description

I developed a NetLogo model to describe water and chemical flow in a river network. While using nlrx for a global sensitivity analysis, I ran into an issue: run_nl_all() works fine at first, but after a random number of iterations the run hangs and freezes. Screenshots of the Java and R processes are shown below.

[Two screenshots of the Java and R processes; images not preserved]

2 what I have tried

I have updated R, Java and nlrx to their latest versions.
I have re-run the model in NetLogo and it ran fine without an error.
I have tried the scripts on my MacBook Pro and got the same issue.
I have dug into the temp files and run the bash file with the model and setup file directly. I can run the bash file manually and get the runs going.
I have tried the generated XML (with the seed and variables from nlrx) in NetLogo, that is, I copied the content of the .xml file and pasted it into the .nlogo file. It worked fine there too.

3 my suspicions

After all the digging and debugging, I suspect the issue might be caused by the communication among R, Java, and NetLogo.

4 Inquiries

This "hanging" issue is too difficult for me to solve on my own, so any thoughts, ideas, or comments will be appreciated. Thank you in advance.

Best, Wenlong

Appendix A: my nlrx script:

nl <- nl(nlversion = "6.2.0",
         nlpath = netlogopath,
         modelpath = modelpath,
         jvmmem = 8024)

nl@experiment <- experiment(expname="ditch_pond",
                            outpath=outpath,
                            # no repetition.
                            repetition=1,
                            tickmetrics="false",
                            idsetup="setup",
                            idgo="go",
                            runtime=10000, # maximum ticks for each run
                            #evalticks=seq(40,50),
                            # stop condition to exit the simulation.
                            # stop condition update to 99.
                            stopcond = "removal-efficiency > 99",
                            metrics=c("removal-efficiency"),
                            variables = list('removalRate' = list(min=0.0024, max=0.4, qfun="qunif"),
                                             'initialConcentration' = list(min=1, max=100, qfun="qunif"),
                                             'Ditch_depth_ratio' = list(min=-99, max=100, qfun="qunif"),
                                             'Ditch_area_ratio' = list(min=-99, max=100, qfun="qunif"),
                                             'Water_flow_ratio' = list(min=-99, max=100, qfun="qunif"))
)

nl@simdesign <- simdesign_sobol2007(nl=nl,
                                      samples=1000,
                                      sobolnboot=200,
                                      sobolconf=0.95,
                                      # random seeds will multiply the simulations.
                                      nseeds=3,
                                      precision=1)

plan(multisession)
results <- run_nl_all(nl, split=8)

Appendix B: print(nl)

supported nlversion: ✗
nlpath exists on local system: ✓
modelpath exists on local system: ✓
valid jvm memory: ✓
valid experiment name: ✓
outpath exists on local system: ✓
setup and go defined: ✓
variables defined: ✓
variables present in model: ✓
constants defined: ✗
constants present in model: ✗
metrics defined: ✓
spatial Metrics defined: ✓
simdesign attached: ✓
siminput parameter matrix: ✓
number of siminputrows: 7000
number of random seeds: 3
estimated number of runs: 21000
simoutput results attached: ✗
number of runs calculated: ✗

Appendix C: system information

java -version
openjdk version "1.8.0_332"
OpenJDK Runtime Environment (build 1.8.0_332-b09)
OpenJDK Server VM (build 25.332-b09, mixed mode)

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

NetLogo 6.2.0

Windows 10 and macOS 10.14.6

Hi there,

thanks for your detailed report and sorry for the late response. Due to time constraints I was not able to dig into nlrx-related issues earlier.

There can be many reasons for such behaviour. From your output I see that you are running 21000 simulations in total. With that many simulations, it can happen at random that some hiccup in R, NetLogo or Java leads to a stalled session. Another possibility, which in my experience happens quite often, is a parameterization that leads to a never-ending simulation in NetLogo or triggers an out-of-memory state. Large parameter spaces are often explored for the first time in exactly this kind of analysis, so it is not unexpected that some edge cases come up which blow up the model.
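As a quick sanity check on that number, here is a small sketch of the arithmetic. It assumes the usual (p + 2) × n cost of the sobol2007 estimator, which matches the 7000 siminput rows in your print(nl) output; the variable names are only for illustration.

```r
# Values taken from the script and print(nl) output in the report.
samples     <- 1000  # samples argument of simdesign_sobol2007
n_variables <- 5     # entries in the experiment's variables list
n_seeds     <- 3     # nseeds argument

# Assumption: the sobol2007 design needs (p + 2) * n parameter rows.
siminput_rows <- samples * (n_variables + 2)

# Every parameter row is run once per random seed.
total_runs <- siminput_rows * n_seeds

cat("siminput rows:", siminput_rows, "\n")
cat("total runs:   ", total_runs, "\n")
```

Reducing samples or nseeds for a first exploratory pass shrinks the job proportionally, which makes it easier to find problematic parameter combinations.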

The best strategy here, in my opinion, is to split the job into many smaller chunks. For example, you could split the parameter matrix (nl@simdesign@siminput) into chunks of 100 rows, store the intermediate results as rds files and combine everything at the end.
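A minimal sketch of this chunking idea could look as follows. The helpers chunk_rows() and run_in_chunks() are hypothetical names, not part of nlrx, and the runner is passed in as a function so the sketch can be tried without NetLogo. The nlrx usage in the trailing comments is an untested assumption about how one might plug it in.

```r
# Split row indices 1..n_rows into consecutive chunks of at most chunk_size.
chunk_rows <- function(n_rows, chunk_size = 100) {
  idx <- seq_len(n_rows)
  split(idx, ceiling(idx / chunk_size))
}

# Run each chunk and save its result as an .rds file. Chunks whose file
# already exists are skipped, so a stalled job can be restarted and resumed.
run_in_chunks <- function(siminput, run_chunk, out_dir, chunk_size = 100) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  chunks <- chunk_rows(nrow(siminput), chunk_size)
  for (i in seq_along(chunks)) {
    out_file <- file.path(out_dir, sprintf("chunk_%04d.rds", i))
    if (file.exists(out_file)) next
    res <- run_chunk(siminput[chunks[[i]], , drop = FALSE])
    saveRDS(res, out_file)
  }
  # Combine all stored chunk results in chunk order.
  files <- sort(list.files(out_dir, pattern = "^chunk_[0-9]+\\.rds$",
                           full.names = TRUE))
  do.call(rbind, lapply(files, readRDS))
}

# Stand-in demo: the "simulation" just marks each row as done.
demo_input <- data.frame(removalRate = seq(0.01, 0.4, length.out = 250))
demo_dir   <- tempfile("nlrx_chunks_")
combined   <- run_in_chunks(demo_input,
                            function(rows) transform(rows, done = TRUE),
                            demo_dir)

# Hypothetical use with nlrx (assumes `nl` is fully set up; not run here):
# results <- run_in_chunks(nl@simdesign@siminput, function(rows) {
#   nl_chunk <- nl
#   nl_chunk@simdesign@siminput <- rows
#   nlrx::run_nl_all(nl_chunk)
# }, out_dir = "chunk_results")
```

If a chunk stalls, its file name tells you which 100 parameter rows to inspect more closely. Note that row indices reported per chunk would presumably restart at 1 in each chunk, so they may need re-indexing before the combined results are attached for analysis.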

Feel free to re-open if you need more pointers on debugging stalled nlrx/NetLogo simulations.
