I tried with GAMA version 1.9 and it seems to work well. You should use this version rather than GAMA 1.8.2. However, with the default options, GAMA will not run 40 simulations in parallel, but only 15, corresponding to the number of repetitions. If you really want to run 40 in parallel, you have to set this preference to "true": "Execution" -> "Parallelism" -> "In batch mode, allow to run simulations with different parameter sets..."
Or you can add this line in the init of the experiment:
gama.pref_parallel_simulations_all <- true;
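For context, a minimal sketch of where this line goes (the experiment name and stop condition are illustrative placeholders, not taken from the model in this issue):

experiment my_batch type: batch repeat: 15 until: (cycle >= 100) {
    init {
        // allow simulations with different parameter sets to run in parallel
        gama.pref_parallel_simulations_all <- true;
    }
}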
Hey, there is also an option for the batch script wrapper, -hpc (see more in the -help of the script), that lets you define the number of cores GAMA will use to distribute experiment runs. Kevin
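For example, reusing the launch command from the original bug report (placing the flag before -batch mirrors the -m example later in this thread; adjust if your wrapper expects a different order):

$ bash gama-headless.sh -hpc 40 -batch batch_experiments_pheromone ../../VehicleClustering/CS_CityScope_GAMA/models/clustering.gaml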
Thanks so much to both of you, it is working now!! Here's what I did:
With -hpc 40 in the bash command, it started to launch the 15 repetitions in parallel.
With gama.pref_parallel_simulations_all <- true; in the experiment init, it launched the 40 in parallel.
@chapuisk the -hpc option should be needed only when you want to limit the number of threads; by default, it's supposed to take every resource available.
@NaroaCS can you verify whether it still works with Patrick's suggestion but without the -hpc option? If not, it'll be another issue.
Just tested without the -hpc option, now that I have gama.pref_parallel_simulations_all <- true; set: I understand that it should be running 15 threads, but it only does so if I include the -hpc option.
@NaroaCS So even on GAMA 1.9, without the -hpc option and without gama.pref_parallel_simulations_all <- true;, you only run 4 simulations in parallel?
@RoiArthurB If that is the case, could you open a new issue specifically on this?
@AlexisDrogoul From what I understand, with pref_parallel_simulations_all everything works as expected, but without it, the -hpc parameter is needed to get more than 4 simulations... That's quite strange, but it seems quickly fixable by enabling pref_parallel_simulations_all by default for the batch headless mode (which, IMO, is the expected behavior).
OK. I agree with this change. At first, I thought that we might not need to set the preference (as it might have other side effects), but it appears that the exploration algorithms do not factorise their calls to this preference much, so there is no single place where we could write something like if (GamaExecutorService.CONCURRENCY_SIMULATIONS_ALL.getValue() || experiment.isBatch() && experiment.isHeadless())... unless, of course, you create a new method in GamaExecutorService and call that method instead.
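For illustration, a minimal sketch of such a helper (the method name and the IExperimentPlan parameter type are assumptions for the example; the condition itself is the one quoted above):

// Hypothetical helper in GamaExecutorService: factors the check in one place.
// This is only a sketch, not the implementation that was actually committed.
public static boolean runsAllSimulationsInParallel(final IExperimentPlan experiment) {
    return CONCURRENCY_SIMULATIONS_ALL.getValue()
            || (experiment.isBatch() && experiment.isHeadless());
}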
So I guess it is ok to set the value of the preference when launching a new experiment.
I've committed a solution -- please test and close the issue if it is the intended behaviour.
The commit is here: https://github.com/gama-platform/gama/commit/ba1f1ee05e57457ffe677b05e6ef857f9fe3d198
(somehow, my first line has disappeared).
I just tested it with this version: GAMA_1.9.0_Linux_with_JDK_03.13.23_648d692c.zip (sorry, I didn't know how to download it from the commit).
Now, without pref_parallel_simulations_all and without -hpc:
With parallel: 40 in the experiment, it correctly launches 40 simulations.
With parallel: 15 in the experiment, it launches 15 in parallel.
Without the parallel: option, it also launches 40 sims, because that is the number of threads.
If I do it without pref_parallel_simulations_all but setting -hpc 15, the -hpc seems not to do anything, and it still launches 40 sims.
@RoiArthurB if my understanding is correct, -hpc should be treated like parallel:, right? So we should probably work a bit on the priority of these options. Right now, we have:
No -hpc, no parallel: defined: pref_parallel_simulations_all is considered as true and all cores are used.
No -hpc, parallel: defined: pref_parallel_simulations_all is considered as true, but within the limit defined by parallel:.
-hpc defined, parallel: defined or not: it seems that the value of -hpc is not considered.
In my opinion, when the modeler wants to limit the number of threads, -hpc should have the priority, followed by parallel:. The easiest solution would then be to translate the value passed from -hpc to parallel: before launching the experiment. What do you think?
Not sure if this is a related issue or something unrelated; let me know and I can move it to a new issue if needed.
Currently, I am running 15 threads in parallel on a server that has 1 TB of RAM and 40 threads.
I have set the specs in Gama.ini as follows:
-Xms4096m
-Xmx100g
-Xss1g
-Xmn50g
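(For reference, these are standard JVM options: -Xms sets the initial heap size, -Xmx the maximum heap size, -Xss the stack size of each thread, and -Xmn the size of the young generation.)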
But I keep getting this error after some hours of execution:
Message: Your system is running out of memory. GAMA will exit now. Please try to quit other applications and relaunch it
Exception in thread "Timer-0" java.lang.OutOfMemoryError: Java heap space
at java.prefs/java.util.prefs.FileSystemPreferences.sync(FileSystemPreferences.java:768)
at java.prefs/java.util.prefs.FileSystemPreferences.flush(FileSystemPreferences.java:844)
at java.prefs/java.util.prefs.FileSystemPreferences.syncWorld(FileSystemPreferences.java:484)
at java.prefs/java.util.prefs.FileSystemPreferences$3.run(FileSystemPreferences.java:451)
at java.base/java.util.TimerThread.mainLoop(Timer.java:566)
at java.base/java.util.TimerThread.run(Timer.java:516)
naroa@matlaberp4:~/GAMA_1.9.0_march13/headless$
This is the behavior of the server:
It seems that the memory use is very low, while the cache and buffer usage are high (the green plot), but that may not be due to this process, since it seems it was already high before I launched anything. Does the cache and buffer memory trigger the OutOfMemory error? I also see some peaks in the load, which I am not sure should have happened, but I am not sure whether this can also have an impact on memory usage. Do you have any ideas on why this might be happening? Is GAMA accumulating data that I should be clearing? I am saving everything that I need in .csv files, so I wouldn't need any other info about the simulations.
Thanks for your help!
In my opinion, when the modeler wants to limit the number of threads, -hpc should have the priority, followed by parallel:. The easiest solution would then be to translate the value passed from -hpc to parallel: before launching the experiment. What do you think?
@AlexisDrogoul I agree and will commit something to apply it.
Currently, I am running 15 threads in parallel on a server that has 1 TB of RAM and 40 threads. I have set the specs in Gama.ini as follows:
-Xms4096m -Xmx100g -Xss1g -Xmn50g
But I keep getting this error after some hours of execution: java.lang.OutOfMemoryError: Java heap space (full message and stack trace quoted above).
@NaroaCS If you still start the headless mode with the bash script, then the configuration file Gama.ini isn't read. Instead, you should use the parameter -m, which sets the Eclipse parameter -Xmx, as follows:
$ bash gama-headless.sh -m 100g -batch batch_experiments_pheromone ../../VehicleClustering/CS_CityScope_GAMA/models/clustering.gaml
This is because the headless mode is started with Java "by hand" (not through a binary, as the GAMA GUI is), and we never thought about reading this file... But it might be a great default behavior, so we will add that too.
Also, we do not recommend setting RAM per thread (with -Xss or -Xmn); let GAMA scale to the full RAM and maximum number of cores itself (this allows GAMA to better allocate RAM to threads dynamically when needed, etc.).
Do you have any ideas on why this might be happening? Is Gama accumulating data that I should be clearing? I am saving everything that I need on .csv-s, so I wouldn't need any other info on the simulations.
I think it's because you didn't set the maximum memory properly (and it was limited to 4 GB).
Also, if you want to make better use of the RAM over a long batch, you can add the facet keep_simulations: false to your experiment. By default, this facet is true: it keeps ended simulations in RAM to allow drawing charts over all simulations. But since you said you don't need to keep any record of them, you can add this facet to have the garbage collector erase every ended simulation from memory ;)
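For example, adapting the experiment header from the original bug report (only the keep_simulations: facet is new here; everything else is taken from the model as posted):

experiment batch_experiments_pheromone type: batch parallel: 40 repeat: 15 keep_simulations: false
        until: (cycle >= numberOfDays * numberOfHours * 3600 / step) {
    // parameter declarations unchanged (evaporation, exploitationRate, numBikes, WanderingSpeed)
}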
Thanks so much, @RoiArthurB! :D I've just launched the simulations, so I'll need to wait for a day or so to see if it stays alive. I'll keep you posted!
That was it! I'm not getting the memory error anymore! Thanks so much!
I think that's enough for your initial problem, so I'm closing this issue and will let @RoiArthurB open a new one for the -hpc problem and another for reading parameters from the ini file.
Describe the bug
I am running a batch experiment on a server that has 40 threads (2 CPUs x 10 threads/CPU x 2 hyperthreading). I have defined the parameter parallel: 40 when defining the experiment in the .gaml script, but when I launch it, it only runs 4 threads in parallel.
To Reproduce
I guess it is difficult to reproduce, but this is how I have defined the experiment:
experiment batch_experiments_pheromone type: batch parallel: 40 repeat: 15 until: (cycle >= numberOfDays * numberOfHours * 3600 / step) {
    parameter var: evaporation among: [0.05, 0.1, 0.15, 0.2, 0.25, 0.3];
    parameter var: exploitationRate among: [0.6, 0.65, 0.7, 0.75, 0.8];
    parameter var: numBikes among: [150, 250, 350];
    parameter var: WanderingSpeed among: [1/3.6#m/#s, 3/3.6#m/#s, 5/3.6#m/#s];
}
and here is the full code: https://github.com/CityScope/VehicleClustering.git
I am running the script with the following command, which I execute in the 'headless' folder of GAMA:
bash gama-headless.sh -batch batch_experiments_pheromone ../../VehicleClustering/CS_CityScope_GAMA/models/clustering.gaml
Expected behavior
I would expect GAMA to make use of all the available threads instead of only 4. The way I can tell is that it prints 'FINISH INITIALIZATION' four times in a row, runs the four simulations, and repeats the process once they are done (see the attached screenshot).