ropensci / nlrx

nlrx NetLogo R
https://docs.ropensci.org/nlrx
GNU General Public License v3.0

GenAlg number of simulation runs #89

Open nik1393 opened 1 year ago

nik1393 commented 1 year ago

How can I determine the number of simulation runs that will be executed by nlrx before the completion of all GenAlg runs? When attempting to print nl, the displayed table showed "number of runs calculated: ✗."

all-the-way-down-turtles commented 1 year ago

Hi! As I understand it, you don't need the number of runs to be calculated, because a GenAlg simdesign has no input matrix. The parameter values the algorithm uses are not predetermined in an input matrix; they depend on the previous iteration. The number of simulations is determined by the popSize and iters parameters. This code should work:

    library(nlrx)  # assumes an nl object with an experiment is already defined

    # Attach a genetic algorithm simdesign to the nl object
    nl@simdesign <- simdesign_GenAlg(nl = nl,
                                     popSize = 300,
                                     iters = 100,
                                     evalcrit = 1,  # first entry of metrics; the algorithm automatically minimizes the result
                                     elitism = NA,
                                     mutationChance = NA,
                                     nseeds = 1)

    # Run the algorithm with run_nl_dyn
    results <- run_nl_dyn(nl = nl, seed = getsim(nl, "simseeds")[1])

    # Attach results to nl
    setsim(nl, "simoutput") <- results

    # Store the nl object
    saveRDS(nl, file.path(nl@experiment@outpath, "genAlg.rds"))
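To make the relation between popSize, iters and the number of simulations concrete, here is a small base-R sketch. The helper name `genalg_eval_bounds` is hypothetical and not part of nlrx. It assumes one model run per evaluated individual and a single call to run_nl_dyn. The second number additionally assumes, as I understand the genalg package that nlrx appears to rely on, that elite individuals are carried into the next generation without being re-evaluated, with a default elitism of about 20% of popSize. Treat both numbers as estimates and check them against your own run.

```r
# Hypothetical helper (not part of nlrx): rough bounds on the number of model
# evaluations for one run_nl_dyn() call with a GenAlg simdesign.
genalg_eval_bounds <- function(popSize, iters, elitism = floor(popSize / 5)) {
  stopifnot(popSize > 0, iters > 0, elitism >= 0, elitism < popSize)
  # Upper bound: every individual is evaluated in every iteration.
  upper <- popSize * iters
  # ASSUMPTION: elites keep their score and are not re-evaluated, so only
  # (popSize - elitism) new individuals are evaluated after the first iteration.
  with_elitism <- popSize + (iters - 1) * (popSize - elitism)
  c(upper = upper, with_elitism = with_elitism)
}

bounds <- genalg_eval_bounds(popSize = 300, iters = 100)
print(bounds)
```

For popSize = 300 and iters = 100 the upper bound is 30000 runs. Under the elitism assumption the count would be lower, so popSize * iters is best read as a ceiling, not an exact figure.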

nik1393 commented 1 year ago

Thank you for your time and reply! Just a clarifying question about your answer: does that mean that in your example GenAlg will make 30000 runs (popSize * iters)? Or will the number of runs be different once the GenAlg experiment is over?

all-the-way-down-turtles commented 1 year ago

No problem. Unfortunately I am not an expert myself. With my code I let the model search 100 times for better parameters. The population parameter (popSize) is the number of individuals (also known as chromosomes or solutions) in each generation of the algorithm. It determines the diversity and exploration capability of the algorithm. (However, I don't fully understand this parameter myself, to be honest.) When I try my example with three seeds, each seed takes a different amount of time. So my assumption would be that it stops once a threshold of the evalcrit is reached, and that you cannot tell beforehand how long this will take.

Maybe have a look at the upper left figure in Salecker et al. (2019). There you can see that the model searches 100 times for better parameters, each run depending on the previous run. If iters were 10, the algorithm would stop adapting the model parameters after 10 iterations (as seen in my plot below).

From what I found in the literature, it is very difficult to choose an appropriate value for the population parameter. Too low is not good, because there is too little variation in the population, although it should take less time. A very high value is not ideal either, because it will take very long and there is too much "chaos". I think this depends on your model. If it is very complex, maybe start with small values, because otherwise it will take forever.

I tried my model with quite small population and iteration values at first. Usually the algorithm does not improve significantly after a certain number of iterations. Hope this helps. genAlg_best_plot.pdf
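The idea that each generation is built from the previous one, and that the best result tends to level off after some iterations, can be illustrated with a toy genetic algorithm in base R. This is only a sketch: `toy_ga` and all its arguments are made up for illustration, it is not the algorithm nlrx or genalg actually uses, and the fitness function is a simple quadratic standing in for a NetLogo model. In this sketch the loop always runs exactly `iters` generations and re-evaluates every individual, so it says nothing about whether the real implementation stops early.

```r
# Toy genetic algorithm (illustrative only; NOT nlrx or genalg code).
toy_ga <- function(fitness, lower, upper, popSize = 20, iters = 10,
                   elite = 2, mutation_sd = 0.1) {
  n_par <- length(lower)
  n_evals <- 0
  # Random initial population: one row per individual, one column per parameter.
  pop <- matrix(runif(popSize * n_par, lower, upper),
                nrow = popSize, ncol = n_par, byrow = TRUE)
  best <- numeric(iters)
  for (i in seq_len(iters)) {
    # Evaluate every individual of the current generation (lower is better).
    scores <- apply(pop, 1, fitness)
    n_evals <- n_evals + popSize
    ord <- order(scores)
    best[i] <- scores[ord[1]]
    if (i == iters) break
    # Parents are drawn from the better half of the current generation.
    parents <- pop[ord[seq_len(popSize %/% 2)], , drop = FALSE]
    n_child <- popSize - elite
    mum <- parents[sample(nrow(parents), n_child, replace = TRUE), , drop = FALSE]
    dad <- parents[sample(nrow(parents), n_child, replace = TRUE), , drop = FALSE]
    # Uniform crossover: each parameter comes from either parent.
    mask <- matrix(runif(n_child * n_par) < 0.5, nrow = n_child)
    children <- ifelse(mask, mum, dad)
    # Gaussian mutation, then clamp to the parameter bounds.
    children <- children + rnorm(n_child * n_par, sd = mutation_sd)
    children <- pmin(pmax(children, rep(lower, each = n_child)),
                     rep(upper, each = n_child))
    # Next generation: unchanged elite plus the new children.
    pop <- rbind(pop[ord[seq_len(elite)], , drop = FALSE], children)
  }
  list(best = best, n_evals = n_evals)
}

set.seed(42)
res <- toy_ga(function(x) sum((x - c(1, -2))^2),
              lower = c(-5, -5), upper = c(5, 5), popSize = 20, iters = 10)
res$n_evals  # 20 individuals * 10 generations = 200 evaluations
res$best     # best score per generation; never gets worse because elites are kept
```

Plotting `res$best` against the generation index gives the same kind of curve as the best-fitness plot mentioned above, and trying a few popSize and iters values on a cheap function like this is a quick way to build intuition before spending time on the real model.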

nik1393 commented 1 year ago

Thank you so much again. I see what you mean. For now I'll keep doing trial runs to see how long the experiments take for different popSize and iters values and what results they give.
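For planning such trial runs, a rough table of worst-case run counts and durations can help. The sketch below is a hypothetical helper (`plan_trials` is not an nlrx function). It assumes at most popSize * iters model runs per experiment, sequential execution, and a `sec_per_run` value that you have measured yourself, for example by timing a single simulation with `system.time()`. The 2 seconds used here is a placeholder, not a measurement.

```r
# Hypothetical planning helper (not part of nlrx).
# sec_per_run must come from your own timing of a single model run.
plan_trials <- function(popSizes, iters, sec_per_run) {
  grid <- expand.grid(popSize = popSizes, iters = iters)
  # Worst case: every individual is simulated in every iteration.
  grid$max_runs <- grid$popSize * grid$iters
  # Sequential runtime estimate in hours (no parallelisation assumed).
  grid$max_hours <- grid$max_runs * sec_per_run / 3600
  grid[order(grid$max_runs), ]
}

trials <- plan_trials(popSizes = c(10, 50, 300), iters = c(10, 100),
                      sec_per_run = 2)  # placeholder: 2 s per run
print(trials)
```

Starting with the combinations at the top of the table keeps the first experiments short, and the measured times can then be used to refine `sec_per_run` before trying larger settings.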