ropensci / nlrx

nlrx NetLogo R
https://docs.ropensci.org/nlrx
GNU General Public License v3.0

Why `nlrx` takes soooooo long to run on my computer? #57

Closed yyzeng closed 2 years ago

yyzeng commented 2 years ago

Here is my code:

# https://stackoverflow.com/a/59768990/16802797

library(nlrx)
# Windows default NetLogo installation path (adjust to your needs!):
netlogopath <- file.path("C:/Program Files/NetLogo 6.2.0")
modelpath <- file.path(netlogopath, "app/models/Sample Models/Earth Science/Fire.nlogo")
outpath <- file.path(".")
nl <- nl(nlversion = "6.2.0",
         nlpath = netlogopath,
         modelpath = modelpath,
         jvmmem = 1024)

## Example 1: simdesign_distinct
nl@experiment <- experiment(expname="fire",
                            outpath=outpath,
                            repetition=1,
                            tickmetrics="true",
                            idsetup="setup",
                            idgo="go",
                            runtime=0, 
                            metrics=c("ifelse-value (initial-trees > 0) [(burned-trees / initial-trees) * 100][0]"),
                            variables = list('density' = list(values=seq(0, 99, 1))),
                            constants = list())

#### use nseeds = 10 to simulate over 10 different random seeds (replicates)
nl@simdesign <- simdesign_distinct(nl, nseeds = 10)

library(future)
plan(multisession)
tictoc::tic()
results <- progressr::with_progress(run_nl_all(nl))
tictoc::toc()

setsim(nl, "simoutput") <- results

The code took almost two hours to run on my computer, which has an Intel(R) Xeon(R) CPU E3-1505M v6 @ 3.00GHz and 32 GB of RAM.

But when I used NetLogo's BehaviorSpace to do the same job, it finished in 1 minute! Below is the XML setting I extracted from the saved .nlogo file.

<experiments>
  <experiment name="fire_ex" repetitions="10" runMetricsEveryStep="true">
    <setup>setup</setup>
    <go>go</go>
    <metric>ifelse-value (initial-trees &gt; 0) [(burned-trees / initial-trees) * 100][0]</metric>
    <steppedValueSet variable="density" first="0" step="1" last="99"/>
  </experiment>
</experiments>

Is something wrong? Thank you!

sessioninfo::session_info()
# - Session info -------------------------------------------------------------------------
#   setting  value                         
# version  R version 4.1.1 (2021-08-10)  
# os       Windows 10 x64                
# system   x86_64, mingw32               
# ui       RStudio                       
# language (EN)                          
# collate  Chinese (Simplified)_China.936
# ctype    Chinese (Simplified)_China.936
# tz       Asia/Taipei                   
# date     2021-09-25                    
# 
# - Packages -----------------------------------------------------------------------------
#   package     * version date       lib source        
# assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.1.1)
# cli           3.0.1   2021-07-17 [1] CRAN (R 4.1.1)
# codetools     0.2-18  2020-11-04 [2] CRAN (R 4.1.1)
# crayon        1.4.1   2021-02-08 [1] CRAN (R 4.1.1)
# DBI           1.1.1   2021-01-15 [1] CRAN (R 4.1.1)
# digest        0.6.27  2020-10-24 [1] CRAN (R 4.1.1)
# dplyr         1.0.7   2021-06-18 [1] CRAN (R 4.1.1)
# ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.1)
# fansi         0.5.0   2021-05-25 [1] CRAN (R 4.1.1)
# furrr         0.2.3   2021-06-25 [1] CRAN (R 4.1.1)
# future      * 1.22.1  2021-08-25 [1] CRAN (R 4.1.1)
# generics      0.1.0   2020-10-31 [1] CRAN (R 4.1.1)
# globals       0.14.0  2020-11-22 [1] CRAN (R 4.1.1)
# glue          1.4.2   2020-08-27 [1] CRAN (R 4.1.1)
# lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.1.1)
# listenv       0.8.0   2019-12-05 [1] CRAN (R 4.1.1)
# magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.1.1)
# nlrx        * 0.4.3   2021-09-20 [1] CRAN (R 4.1.1)
# parallelly    1.28.1  2021-09-09 [1] CRAN (R 4.1.1)
# pillar        1.6.2   2021-07-29 [1] CRAN (R 4.1.1)
# pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.1)
# progressr     0.9.0   2021-09-24 [1] CRAN (R 4.1.1)
# purrr         0.3.4   2020-04-17 [1] CRAN (R 4.1.1)
# R6            2.5.1   2021-08-19 [1] CRAN (R 4.1.1)
# rlang         0.4.11  2021-04-30 [1] CRAN (R 4.1.1)
# rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.1)
# sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.1.1)
# tibble        3.1.4   2021-08-25 [1] CRAN (R 4.1.1)
# tictoc        1.0.1   2021-04-19 [1] CRAN (R 4.1.1)
# tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.1.1)
# utf8          1.2.2   2021-07-24 [1] CRAN (R 4.1.1)
# vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.1.1)
# withr         2.4.2   2021-04-18 [1] CRAN (R 4.1.1)
# 
# [1] C:/Users/admin/Documents/R/win-library/4.1
# [2] C:/Program Files/R/R-4.1.1/library

bitbacchus commented 2 years ago

Hmm, that is kinda odd. Can you please try to run the experiment from the command line (i.e. mimicking what nlrx does)?

yyzeng commented 2 years ago

Hmm, that is kinda odd. Can you please try to run the experiment from the command line (i.e. mimicking what nlrx does)?

netlogo-headless.bat --model "C:\PATH\TO\YOUR\Fire.nlogo" ^
  --experiment NAME_OF_YOUR_EXPERIMENT ^
  --table -

Thanks, Sebastian

Thanks. Yes, I used Windows' cmd.exe to run the command, and it finished in 1 minute and showed me lots of results. By the way, I have reinstalled and tried NetLogo 6.0.4, R 3.6 and nlrx 3.0. None of them helped. 😢

njirvine commented 2 years ago

Hi there

Any update on this issue? I'm having what seems to be a very similar problem, but on a Mac. All the code is copied from the nlrx example (wolf-sheep) and adjusted for my version of NetLogo.

library(nlrx)
Sys.setenv(JAVA_HOME = ("/Library/Java/JavaVirtualMachines/jdk-17.0.1.jdk/Contents/Home"))

netlogopath <- file.path("/Applications/NetLogo 6.2.1")
modelpath <- file.path(netlogopath, 
                       "models/Sample Models/Biology/Wolf Sheep Predation.nlogo")
outpath <- file.path("~/OneDrive - University of Strathclyde/ABM WORK/ABM REPOSITORY") 

nl <- nl(nlversion = "6.2.1",
         nlpath = netlogopath,
         modelpath = modelpath,
         jvmmem = 1024)

nl@experiment <- experiment(expname="wolf-sheep",
                            outpath=outpath,
                            repetition=1,
                            tickmetrics="true",
                            idsetup="setup",
                            idgo="go",
                            runtime=50,
                            evalticks=seq(40,50),
                            metrics=c("count sheep", "count wolves", "count patches with [pcolor = green]"),
                            variables = list('initial-number-sheep' = list(min=50, max=150, qfun="qunif"),
                                             'initial-number-wolves' = list(min=50, max=150, qfun="qunif")),
                            constants = list("model-version" = "\"sheep-wolves-grass\"",
                                             "grass-regrowth-time" = 30,
                                             "sheep-gain-from-food" = 4,
                                             "wolf-gain-from-food" = 20,
                                             "sheep-reproduce" = 4,
                                             "wolf-reproduce" = 5,
                                             "show-energy?" = "false"))

nl@simdesign <- simdesign_lhs(nl=nl,
                              samples=100,
                              nseeds=3,
                              precision=3)   

results <- run_nl_all(nl = nl)

I can't find any problem with the paths and don't get any errors, just a very slow-running R session that is clearly trying to follow the commands. There is no hint of NetLogo opening up to start the experiment, and I have to interrupt it after several minutes of nothing.

R, NetLogo, Java and nlrx are all the latest versions. R seems to accept all the commands without error messages...

I'm not the most tech-savvy when it comes to using the terminal to check functions/processes, but I have checked all the help pages and advice and genuinely can't see what I've done wrong...

bitbacchus commented 2 years ago

Sorry, I had no time to look into this issue lately. But I am planning to investigate this in December.

njirvine commented 2 years ago

Many thanks for any help

ZhanliSun commented 2 years ago

Same here. I am using the sample code. It takes ages...

njirvine commented 2 years ago

As an update, I've tried running on a parallel Windows 10 desktop, with the same experience as yyzeng. Using the command centre as administrator, I can run the wolf-sheep experiment and get results in a minute, but nlrx is trying to do something and failing.

I couldn't run it through the command line without administrator privileges (access denied, which was the same on my Mac desktop), but when I set up as admin on Windows, there were no problems. Is there maybe something missing when running R as a regular user?

bitbacchus commented 2 years ago

I finally found the time to deal with this issue. Sorry that it took so long.

The problem is with how NLRX currently works. Technically, a new NetLogo instance is started and the model is loaded for each simulation run; in @yyzeng's example that is 100 x 10 = 1000 simulations. This takes about 4 sec on my computer, so a total of 4000 sec, just under 67 minutes. The simulations themselves probably take less than a second each.

As a rule of thumb: Runtime in NLRX = # simulations x (4 s + simulation time)

If you use NetLogo's BehaviorSpace, NetLogo is started only once (or several times, if you use several cores), so you save the whole overhead.
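
To make the rule of thumb concrete, here is a minimal sketch in R. The function name `estimate_nlrx_runtime` and its arguments are made up for illustration and are not part of nlrx; the 4 s startup overhead is the value measured above and will differ between machines.

```r
# Hypothetical helper (not part of nlrx): estimate the total serial runtime
# in seconds from the rule of thumb above.
estimate_nlrx_runtime <- function(n_sims, sim_time_s, overhead_s = 4) {
  n_sims * (overhead_s + sim_time_s)
}

# 100 density values x 10 seeds = 1000 simulations, counting only the
# NetLogo startup overhead (simulation time set to 0)
overhead_only <- estimate_nlrx_runtime(n_sims = 100 * 10, sim_time_s = 0)
overhead_only / 60  # minutes
```

With `sim_time_s = 0` this reproduces the 4000 sec figure; a non-zero simulation time adds on top of that.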

The behavior of NLRX can be quite handy (or at least is less relevant) if you have simulations that take longer and/or you can distribute the simulations over many CPUs (e.g. on an HPC of a university). But if you want to run a lot of simulations, each of which is short, the overhead is unfortunately very significant.

Unfortunately, I do not have a short-term solution.

In principle, I have two ideas on how to improve the performance in the future:

(1) I could write a function to run experiments designed with BehaviorSpace with NLRX and no overhead. But I'm not sure if this is useful at all, because then you could also run the simulations directly in BehaviorSpace.

(2) There is the possibility to control NetLogo directly and run several simulations one after the other without having to restart NetLogo every time. You could define chunks, i.e. a certain number of simulations, which are executed on one NetLogo instance. However, this needs a bit more development effort and I don't think I'll get to it in 2022. But I will write an issue about it; maybe there is someone else who would like to do it.
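
As a rough illustration of the chunking idea in (2), the sketch below only shows how simulation ids could be grouped so that each group shares one NetLogo startup. `make_chunks` is a hypothetical helper, not an nlrx function, and the chunk size of 50 is an arbitrary assumption; actually driving NetLogo per chunk is not shown.

```r
# Hypothetical sketch (not nlrx API): assign simulation ids to chunks so that
# each chunk could be executed on a single NetLogo instance.
make_chunks <- function(n_sims, chunk_size) {
  ids <- seq_len(n_sims)
  split(ids, ceiling(ids / chunk_size))
}

chunks <- make_chunks(n_sims = 1000, chunk_size = 50)
length(chunks)      # number of NetLogo startups needed
length(chunks) * 4  # startup overhead in seconds, assuming 4 s per startup
```

Under these assumptions the startup cost scales with the number of chunks rather than the number of simulations.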

What do you think?

njirvine commented 2 years ago

Thanks bitbacchus.

So, for my understanding (I'm fairly new to all this): a smaller number of long simulations should work OK?

For my own purposes, I was excited at the ability to run it and collect data directly into R, especially the LHS rather than BehaviorSpace's forced grouping for sensitivity analysis.

My model takes 5-10 mins to run and collects a lot of data. Technically, for robustness and accuracy I only need 350-400 runs, so it may still be good for me, but access to CPUs beyond my own desktops is limited.

Personally, the second option absolutely makes sense, but as a user who combines ABM with DES and long time series, I'd be happy with anything that bypasses the limitations of Excel as the output.

Thanks again for looking into this. Any other suggestions for how I can overcome this shortfall in BehaviorSpace/Excel would be appreciated, from any direction.

bitbacchus commented 2 years ago

So, for my understanding (I'm fairly new to all this): a smaller number of long simulations should work OK?

Well, it works for sure. You just have to add a 4 sec overhead for each parameter set/simulation run. The shorter the simulations, the greater the relative impact of the overhead.

For my own purposes, I was excited at the ability to run it and collect data directly into R, especially the LHS rather than BehaviorSpace's forced grouping for sensitivity analysis.

Yes, that was one of the main reasons to develop NLRX in the first place.

My model takes 5-10 mins to run and collects a lot of data. Technically, for robustness and accuracy I only need 350-400 runs, so it may still be good for me, but access to CPUs beyond my own desktops is limited.

Yes, for a 5-10 min runtime, 4 sec of overhead doesn't make a difference at all. I'd use a parallel job, e.g. with future; then it is an overnight job.
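
For a rough sense of what "overnight" means here, a small sketch follows. `estimate_parallel_hours` is a hypothetical helper, not part of nlrx or future, and the worker count of 8 is an assumption; it simply spreads the runs evenly over the workers and adds the 4 s startup overhead to each run.

```r
# Hypothetical helper (not part of nlrx): rough wall-clock estimate in hours
# when runs are spread evenly over parallel workers.
estimate_parallel_hours <- function(n_sims, sim_time_min, workers,
                                    overhead_s = 4) {
  per_run_s <- sim_time_min * 60 + overhead_s
  ceiling(n_sims / workers) * per_run_s / 3600
}

# 400 runs of 10 minutes each; 8 workers is an assumed value
parallel_hours <- estimate_parallel_hours(n_sims = 400, sim_time_min = 10,
                                          workers = 8)
serial_hours <- estimate_parallel_hours(n_sims = 400, sim_time_min = 10,
                                        workers = 1)
c(parallel = parallel_hours, serial = serial_hours)
```

In practice the number of workers is whatever is passed to `plan(multisession, workers = ...)`, and memory per NetLogo instance may limit how many can run at once.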

Thanks again for looking into this. Any other suggestions for how I can overcome this shortfall in BehaviorSpace/Excel would be appreciated, from any direction.

The only way I see to overcome BehaviorSpace's shortcomings is to use NLRX ;-) I think I didn't get what Excel has to do with this?

njirvine commented 2 years ago

I'm giving it a go. There may have been an issue with my R extension; I'd been playing around with it before. Now it all seems to work fine.

I managed to run the wolf-sheep example, playing around with the number of samples to gauge the time requirements, and it all ran perfectly.

I'm running a simple sim for my model with 400 runs on a parallel Windows desktop and will update on how it works.

Thanks again, knowing the time limitations helps. I was maybe just being a little impatient, but it helps to understand how it works under the hood.

PS: the Excel thing was just alluding to the fact that my temporary solution (using BehaviorSpace alone for the verification) was to do 20 runs at a time, 20 times, as an Excel .csv holds just over a million rows and would drop everything after that without warning. The need to load the files separately and then paste them together to create the full dataset (approx. 20,000,000 rows) was a bind.
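
For anyone in the same position, the batches can be stitched together in R rather than in Excel. The sketch below is a minimal example under stated assumptions: the file names, column names and values are invented, two tiny temporary files stand in for the real exports, and each file is assumed to be a plain CSV with a single header row (real BehaviorSpace table exports may carry extra header lines that would need skipping).

```r
# Sketch: read several batch exports and bind them into one data frame.
# The directory, file names and columns below are made up for illustration.
batch_dir <- file.path(tempdir(), "batches")
dir.create(batch_dir, showWarnings = FALSE)
write.csv(data.frame(run = 1:2, sheep = c(120, 95)),
          file.path(batch_dir, "batch_01.csv"), row.names = FALSE)
write.csv(data.frame(run = 3:4, sheep = c(101, 88)),
          file.path(batch_dir, "batch_02.csv"), row.names = FALSE)

# Collect all batch files (sorted by name) and stack them row-wise
files <- list.files(batch_dir, pattern = "^batch_.*\\.csv$", full.names = TRUE)
all_runs <- do.call(rbind, lapply(files, read.csv))
nrow(all_runs)
```

R data frames are not bound by Excel's row limit, so nothing is silently dropped; for tens of millions of rows a faster reader than `read.csv` may be worth considering.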

jen-boyd commented 2 years ago

Hi, I'm having a similar issue with the nlrx package. When I run the wolf-sheep model it runs perfectly, but when I try to run my own model it just runs endlessly and never produces results. It's not an error in the actual model, because I can run it fine in NetLogo (but not BehaviorSpace). I think this is an issue with the R extension, so I just wondered if @njirvine had any ideas how to solve this? Thanks!

njirvine commented 1 year ago

Hi there

I'm no expert, but I have managed to make a lot of mistakes and work through them. The main lessons I learned were:

- Remove any user interaction prompts.
- Check the parameters of the experiment once you have saved it as an object.
- Read through the printed check of the experiment to make sure everything is ticked (green) except the results, as they aren't available yet.

This may be too obvious to be relevant, but have you set a tick end in the experiment?

Whenever mine was just running forever without producing results, it was usually an error in the instructions to run the model, not the model itself.
