choisy opened this issue 5 years ago
First insights from one simple simulation of the sir.gaml model: the results for rama and GAMA headless seem quite similar.
In R, I load an experiment:

```r
gaml_file <- system.file("examples", "sir.gaml", package = "rama")
exp1 <- load_experiment("sir", gaml_file, "sir")
```

and evaluate only the time of the experiment run:

```r
system.time(output <- run_experiment(exp1))
```
The times for rama and GAMA headless are quite similar:
Super nice. The results would probably be very similar if we did repetitions.
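A cheap way to turn that impression into numbers would be to repeat the timing and look at the mean and spread. A minimal base-R sketch (here `run_once` is only a stand-in for `run_experiment(exp1)`, since the real call needs a GAMA installation):

```r
# Repeat a timing measurement to get a mean and spread instead of a
# single number. `run_once` is a stand-in workload, NOT the real
# run_experiment(exp1) call, which requires a GAMA installation.
run_once <- function() Sys.sleep(0.01)

time_once <- function(f) system.time(f())[["elapsed"]]

elapsed <- replicate(10, time_once(run_once))
round(c(mean = mean(elapsed), sd = sd(elapsed)), 3)
```

With the real call substituted in, the same mean/sd summary would tell us whether the rama and headless timings stay close over repetitions.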
With repetitions, the results differ much more. I ran the following experiment (see the attached XML file: [sir9.xml.zip](https://github.com/r-and-gama/rama/files/2689590/sir9.xml.zip)):

```r
gaml_file <- system.file("examples", "sir.gaml", package = "rama")
df <- expand.grid(S0 = c(900, 950, 999),
                  I0 = c(100, 50, 1),
                  R0 = 0,
                  beta = 1.5,
                  gamma = .15,
                  S = 1,
                  I = 1,
                  R = 1,
                  tmax = 1000,
                  seed = 1)
df
exp4 <- experiment(df, parameters = c(1:5),
                   obsrates = c(6:8), tmax = "tmax", seed = "seed",
                   experiment = "sir", model = gaml_file)
exp4
system.time(output <- run_experiment(exp4, 8))
```

Results I get:
Notice that when I run the experiment with only 1 core, with:

```r
system.time(output <- run_experiment(exp4))
```

it takes around 60 s.
In `run_experiment(exp)`, we do:
It may not be very surprising that `run_experiment` takes more time. We can try to improve this.
Yes, makes sense and it would be great if we could improve this.
```r
for (i in 1:nrow(exp4)) {
  print(paste(system.time(output <- run_experiment(exp4[1:i, ], 8))))
}
```

```
   user  system elapsed
Running experiment plan ...[1] "1.078" "0.0840000000000001" "11.4579999999999" "19.437" "1.211"
Running experiment plan ...[1] "2.096" "0.0899999999999999" "15.046" "29.973" "1.629"
Running experiment plan ...[1] "3.245" "0.1" "17.5900000000001" "40.556" "1.875"
Running experiment plan ...[1] "4.24" "0.123" "19.7769999999998" "52.0940000000001" "2.228"
Running experiment plan ...[1] "5.529" "0.138" "23.287" "70.469" "2.601"
Running experiment plan ...[1] "6.505" "0.151" "30.473" "103.834" "3.271"
Running experiment plan ...[1] "8.289" "0.199" "30.8009999999999" "107.017" "3.689"
Running experiment plan ...[1] "8.79899999999999" "0.147" "34.106" "127.091" "3.824"
Running experiment plan ...[1] "9.857" "0.179" "39.9360000000001" "141.351" "3.506"
```
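One quick way to read these numbers: the third value of each vector is the elapsed time, so we can see how it scales with the number of simulations by fitting a straight line. The values below are copied (rounded) from the output above; the linear model is only an assumption, used to eyeball the fixed overhead and the per-simulation cost:

```r
# Elapsed times (3rd value of each system.time vector above),
# for experiment plans of 1 to 9 simulations run on 8 cores.
elapsed <- c(11.458, 15.046, 17.590, 19.777, 23.287,
             30.473, 30.801, 34.106, 39.936)
n_sim <- seq_along(elapsed)

# Intercept ~ fixed overhead, slope ~ marginal cost per simulation.
fit <- lm(elapsed ~ n_sim)
round(coef(fit), 2)  # slope is roughly 3.5 s per extra simulation
```

If the relationship stays linear, the intercept is a first estimate of the fixed cost we would want to shave off.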
I am checking whether it is possible to plot some lines. Best, Jean-Daniel
What do you mean by "plot some lines"?
For information, there are a number of R packages that allow good benchmarking and visualization of the results, see here for example. There is also the newly-released `bench` package. I haven't tried it yet, and I'm not even quite sure that's the tool we need here. To be explored...
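In the same spirit, and without adding any dependency, a tiny base-R helper can already give a median over several runs. This is only a simplified sketch of what packages like `microbenchmark` or `bench` do more carefully (they use higher-resolution timers and guard against compiler/GC effects):

```r
# Minimal micro-benchmark helper: run an expression `times` times
# and return the vector of elapsed times.
time_it <- function(expr, times = 5) {
  expr <- substitute(expr)   # capture the unevaluated expression
  env <- parent.frame()      # evaluate it in the caller's environment
  vapply(seq_len(times),
         function(i) system.time(eval(expr, env))[["elapsed"]],
         numeric(1))
}

res <- time_it(sum(runif(1e5)), times = 5)
round(c(median = median(res), min = min(res), max = max(res)), 4)
```

For the real benchmark, `sum(runif(1e5))` would be replaced by the `run_experiment()` call of interest.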
Here you are also running an experiment with an increasing number of simulations, right? I guess what you're aiming at is seeing how the total simulation time scales with the number of simulations (and also estimating the rama overhead)? If so, I would recommend using exactly the same simulation each time. Since all the simulations of the `exp4` object are different, it's currently impossible to tell whether the observed time differences are due only to the number of simulations or also to the nature of these simulations. See what I mean? Such an experiment, with the same simulation repeated a large number of times, could be generated with the `repl()` function, for example:

```r
exp5 <- repl(exp4[1, ], 10)
```
Here you are also running the simulations on 8 CPUs in parallel. It would be interesting to assess the overhead of the parallelization too.
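That parallelization overhead can be estimated independently of GAMA by timing the same deliberately cheap task sequentially and through the parallel machinery; with work this small, almost all the extra time on the parallel path is the cost of the machinery itself. A base-R sketch using the `parallel` package (note that `mclapply` forks, which is not available on Windows, hence the fallback to 1 core there):

```r
library(parallel)

# A deliberately cheap task: any extra time on the parallel path is
# essentially the overhead of the parallel machinery itself.
task <- function(i) sqrt(i)

cores <- if (.Platform$OS.type == "windows") 1L else 2L

t_seq <- system.time(r1 <- lapply(1:100, task))[["elapsed"]]
t_par <- system.time(r2 <- mclapply(1:100, task, mc.cores = cores))[["elapsed"]]

# Same results either way; the time difference estimates the overhead.
identical(unlist(r1), unlist(r2))
c(sequential = t_seq, parallel = t_par)
```

The same pattern applied to a 1-step rama experiment would separate the parallelization cost from the simulation cost.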
As a more general comment, I see that we are doing bits of tests here and there. Maybe a better approach would be to design a formal benchmarking test that we all agree on, specifying each time what we are interested in timing (rama overhead, parallelization overhead, scaling with the number of experiments (linear vs non-linear), etc.). Finally, such a benchmark should ideally be run on an "isolated" machine (i.e. not too many services running at the same time; the minimum would be to cut wifi and bluetooth, I guess).
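Such a formal design could start as a simple grid of scenarios, one row per timing to collect. The factor names and levels below are only a suggestion, to be agreed on:

```r
# A possible benchmark design: one row per timing to collect.
# Factor names and levels are suggestions only.
design <- expand.grid(
  backend = c("rama", "gama_headless", "gama_gui"),
  n_sim   = c(1, 2, 4, 8, 16),
  n_cores = c(1, 8),
  stringsAsFactors = FALSE
)
design$replicate <- 1L  # could be expanded to several replicates per cell
nrow(design)            # 3 backends x 5 plan sizes x 2 core counts = 30 scenarios
```

Running every row on the same isolated machine, and storing the elapsed times alongside the grid, would give one tidy data frame to analyse and plot.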
An Rmd vignette / website article on this benchmarking issue would be really, really nice, and absolutely key in the perspective of a publication. A benchmark comparing rama with RNetLogo and rrepast on the same model would be great too.
Would be interesting to compare the speeds of GAMA 1.7 and 1.8 too.
I hear many of you complaining about the fact that R/rama is incredibly slow compared to the GAMA GUI. Can somebody do some benchmarking here so that we have some numbers to compare: GAMA GUI, GAMA headless, and R/rama? Is it possible to measure the time it takes just to launch GAMA in headless mode? I guess it should roughly be the headless time of a simple model with few agents and just 1 time step, right? Anyway, having numbers to compare here would be useful to see where the problem might be.
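Measuring just the launch cost from R could be as simple as timing an external command. The sketch below times a trivial command as a stand-in, because the path to the GAMA headless launcher is machine-specific (something like a `gama-headless.sh` script; the name here is hypothetical):

```r
# Sketch: estimate process-launch overhead by timing an external
# command that does (almost) nothing. Replacing `cmd` with the actual
# GAMA headless launcher (machine-specific path, e.g. a
# "gama-headless.sh" script -- hypothetical here) run on a 1-step,
# few-agent model would approximate GAMA's startup cost.
cmd <- if (.Platform$OS.type == "windows") "cmd /c exit 0" else "true"
launch_time <- system.time(system(cmd))[["elapsed"]]
launch_time
```

Subtracting that launch time from a full headless run would then isolate the pure simulation time for the GUI vs headless vs rama comparison.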