gama-platform / gama.old

Main repository for developing the 1.x versions of GAMA
GNU General Public License v3.0

Multi-thread experiment not using the specified number of threads #3658

Closed NaroaCS closed 1 year ago

NaroaCS commented 1 year ago

Describe the bug
I am running a batch experiment on a server that has 40 threads (2 CPUs x 10 threads/CPU x 2 hyperthreading). I have set the facet parallel: 40 when defining the experiment in the .gaml script, but when I launch it, it only runs 4 threads in parallel.

To Reproduce
I guess it is difficult to reproduce, but this is how I have defined the experiment:

experiment batch_experiments_pheromone type: batch parallel: 40 repeat: 15 until: (cycle >= numberOfDays * numberOfHours * 3600 / step) {
    parameter var: evaporation among: [0.05, 0.1, 0.15, 0.2, 0.25, 0.3];
    parameter var: exploitationRate among: [0.6, 0.65, 0.7, 0.75, 0.8];
    parameter var: numBikes among: [150, 250, 350];
    parameter var: WanderingSpeed among: [1/3.6#m/#s, 3/3.6#m/#s, 5/3.6#m/#s];
}

and here is the full code: https://github.com/CityScope/VehicleClustering.git

I am running the script with the following command, which I execute in the 'headless' folder of GAMA:

$ bash gama-headless.sh -batch batch_experiments_pheromone ../../VehicleClustering/CS_CityScope_GAMA/models/clustering.gaml

Expected behavior
I would expect GAMA to make use of all the available threads instead of only 4. I can tell it uses only 4 because it prints 'FINISH INITIALIZATION' four times in a row, runs the four experiments, and repeats the process once they are done. See the screenshot below.

Screenshots: [two images, not preserved]

ptaillandier commented 1 year ago

I tried with GAMA version 1.9 and it seems to work well. You should use this version rather than GAMA 1.8.2. However, with the default options, GAMA will not run 40 simulations in parallel, but only 15, corresponding to the number of repetitions. If you really want to run 40 in parallel, you have to set this preference to "true": "Execution" -> "Parallelism" -> "In batch mode, allow to run simulations with different parameter sets..."

Or you can add this line in the init of the experiment: gama.pref_parallel_simulations_all <- true;
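
To make the numbers concrete, here is a rough Java sketch of the arithmetic behind the behaviour described above. It is a simplified model, not GAMA's actual scheduling code: the class and method names are hypothetical, and it assumes an exhaustive exploration of the four parameters from the original experiment.

```java
// Simplified model of the behaviour described in this thread; NOT GAMA's real code.
final class BatchConcurrencySketch {

    /** Number of parameter sets in an exhaustive exploration: product of the sizes of each "among" list. */
    static int parameterSets(int... valuesPerParameter) {
        int total = 1;
        for (int n : valuesPerParameter) {
            total *= n;
        }
        return total;
    }

    /**
     * Simulations that can run at the same time.
     * Assumption: without the preference, only the repetitions of one parameter set run together;
     * with it, runs from different parameter sets may fill the remaining threads.
     */
    static int concurrentSimulations(int parallel, int repeat, int parameterSets, boolean allowAcrossSets) {
        int candidates = allowAcrossSets ? repeat * parameterSets : repeat;
        return Math.min(parallel, candidates);
    }

    public static void main(String[] args) {
        // evaporation (6), exploitationRate (5), numBikes (3), WanderingSpeed (3)
        int sets = parameterSets(6, 5, 3, 3);
        System.out.println(sets);                                       // 270
        System.out.println(concurrentSimulations(40, 15, sets, false)); // 15
        System.out.println(concurrentSimulations(40, 15, sets, true));  // 40
    }
}
```

Under these assumptions, parallel: 40 with repeat: 15 can keep at most 15 threads busy unless the preference is enabled.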

chapuisk commented 1 year ago

Hey, there is also an option for the batch script wrapper, -hpc (see the script's -help for more), that lets you define the number of cores GAMA will use when distributing experiment runs. Kevin

NaroaCS commented 1 year ago

Thanks so much to both of you - it is working now!! 😄 Here's what I did:

RoiArthurB commented 1 year ago

@chapuisk the -hpc option should be needed only when you want to limit the number of threads; by default it's supposed to take all available resources.

@NaroaCS can you verify whether it still works with Patrick's suggestion but without the -hpc option? If not, it'll be another issue 🙃

NaroaCS commented 1 year ago

Just tested:

AlexisDrogoul commented 1 year ago

@NaroaCS So even on GAMA 1.9, w/o the -hpc option and w/o gama.pref_parallel_simulations_all <- true;, you only run 4 simulations in parallel ? @RoiArthurB If it is the case, could you open a new issue specifically on this ?

RoiArthurB commented 1 year ago

@AlexisDrogoul from what I understand, with the pref_parallel_simulations_all everything works as expected, but without it, it needs the -hpc parameter to have more than 4 simulations...

That's quite strange, but it seems quickly fixable by enabling pref_parallel_simulations_all by default for batch headless (which, IMO, is the expected behavior) 🤔🤔🤔

AlexisDrogoul commented 1 year ago

OK. I agree with this change. At first, I thought that we might not need to set the preference (as it might have other side effects), but it appears that the exploration algorithms do not factor their calls to this preference into a single place, so there is nowhere we could write something like if (GamaExecutorService.CONCURRENCY_SIMULATIONS_ALL.getValue() || experiment.isBatch() && experiment.isHeadless())... unless of course you create a new method in GamaExecutorService and call that method instead. So I guess it is OK to set the value of the preference when launching a new experiment.
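
For illustration only, the "new method" idea could look roughly like the Java sketch below. Everything here is a hypothetical stand-in: the Experiment interface, the class name, and the method name are invented for the example, and the preference is modelled as a plain BooleanSupplier rather than GAMA's real preference object.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a single decision point; not the code that was committed to GAMA.
final class ParallelSimulationPolicySketch {

    /** Stand-in for the experiment object; it exposes only the two queries quoted in the comment. */
    interface Experiment {
        boolean isBatch();
        boolean isHeadless();
    }

    /** True if the preference is enabled, or if the experiment is a headless batch. */
    static boolean allowParallelParameterSets(BooleanSupplier preference, Experiment experiment) {
        return preference.getAsBoolean() || (experiment.isBatch() && experiment.isHeadless());
    }

    /** Small factory so the example can be exercised without any GAMA classes. */
    static Experiment experiment(boolean batch, boolean headless) {
        return new Experiment() {
            public boolean isBatch() { return batch; }
            public boolean isHeadless() { return headless; }
        };
    }

    public static void main(String[] args) {
        System.out.println(allowParallelParameterSets(() -> false, experiment(true, true)));   // true
        System.out.println(allowParallelParameterSets(() -> false, experiment(true, false)));  // false
        System.out.println(allowParallelParameterSets(() -> true, experiment(false, false)));  // true
    }
}
```

The point of such a helper would be that every exploration algorithm asks the same question in one place, instead of reading the preference directly.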

AlexisDrogoul commented 1 year ago

I've committed a solution -- please test and close the issue if it is the intended behaviour.

AlexisDrogoul commented 1 year ago

The commit is here: https://github.com/gama-platform/gama/commit/ba1f1ee05e57457ffe677b05e6ef857f9fe3d198

(somehow, my first line has disappeared).

NaroaCS commented 1 year ago

I just tested it with this version GAMA_1.9.0_Linux_with_JDK_03.13.23_648d692c.zip (sorry I didn't know how to download it from the commit).

Now, w/o pref_parallel_simulations_all and w/o -hpc:

If I do it w/o pref_parallel_simulations_all but setting -hpc 15, the -hpc seems not to do anything, and it still launches 40 sims.

AlexisDrogoul commented 1 year ago

@RoiArthurB if my understanding is correct, -hpc should be treated like parallel:, right ?

So we should probably work a bit on the priority of these options.

Right now, we have:

In my opinion, when the modeler wants to limit the number of threads, -hpc should have the priority, followed by parallel:. The easiest solution would then be to translate the value passed from -hpc to parallel: before launching the experiment. What do you think ?
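
As a sketch of the proposed priority (and only a sketch: the class and method names are hypothetical, and falling back to the number of available cores is an assumption based on the earlier comment that GAMA should take all resources by default), the resolution could be written in Java as:

```java
import java.util.OptionalInt;

// Illustrative only: models "-hpc first, then parallel:, then everything available".
final class ThreadLimitSketch {

    /**
     * Picks the thread limit for a batch run.
     * hpc            value passed with -hpc on the command line, if any
     * parallelFacet  value of the parallel: facet in the experiment, if any
     * availableCores fallback when neither is given (assumed default)
     */
    static int effectiveThreads(OptionalInt hpc, OptionalInt parallelFacet, int availableCores) {
        int requested;
        if (hpc.isPresent()) {
            requested = hpc.getAsInt();
        } else if (parallelFacet.isPresent()) {
            requested = parallelFacet.getAsInt();
        } else {
            requested = availableCores;
        }
        return Math.max(1, requested); // never go below one thread
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // -hpc 15 together with parallel: 40 would yield 15 under this priority.
        System.out.println(effectiveThreads(OptionalInt.of(15), OptionalInt.of(40), cores));
        System.out.println(effectiveThreads(OptionalInt.empty(), OptionalInt.of(40), cores));
        System.out.println(effectiveThreads(OptionalInt.empty(), OptionalInt.empty(), cores));
    }
}
```

With this ordering, the case reported above (-hpc 15 still launching 40 simulations) would instead be capped at 15.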

NaroaCS commented 1 year ago

Not sure whether this is related or not. Let me know and I can move it to a new issue if needed.

Currently, I am running 15 threads in parallel on a server that has 1 TB of RAM and 40 threads.

I have set the specs in Gama.ini as follows:

-Xms4096m
-Xmx100g
-Xss1g
-Xmn50g

But I keep getting this error after some hours of execution:

Message: Your system is running out of memory. GAMA will exit now. Please try to quit other applications and relaunch it
Exception in thread "Timer-0" java.lang.OutOfMemoryError: Java heap space
        at java.prefs/java.util.prefs.FileSystemPreferences.sync(FileSystemPreferences.java:768)
        at java.prefs/java.util.prefs.FileSystemPreferences.flush(FileSystemPreferences.java:844)
        at java.prefs/java.util.prefs.FileSystemPreferences.syncWorld(FileSystemPreferences.java:484)
        at java.prefs/java.util.prefs.FileSystemPreferences$3.run(FileSystemPreferences.java:451)
        at java.base/java.util.TimerThread.mainLoop(Timer.java:566)
        at java.base/java.util.TimerThread.run(Timer.java:516)
naroa@matlaberp4:~/GAMA_1.9.0_march13/headless$ 

This is the behavior of the server:

[image: server monitoring plots, not preserved]

It seems that memory use is very low, but cache and buffer use is high (green plot). That may not be due to this process, since it seems to have been high already before I launched anything. Can cache and buffer memory trigger the OutOfMemory error? I also see some peaks in the load, which I am not sure should have happened, and I am not sure whether they can also affect memory usage. Do you have any ideas on why this might be happening? Is GAMA accumulating data that I should be clearing? I am saving everything that I need to .csv files, so I wouldn't need any other info on the simulations.

Thanks for your help!

RoiArthurB commented 1 year ago

> In my opinion, when the modeler wants to limit the number of threads, -hpc should have the priority, followed by parallel:. The easiest solution would then be to translate the value passed from -hpc to parallel: before launching the experiment. What do you think ?

@AlexisDrogoul I agree and will commit something to apply it.


@NaroaCS If you still start the headless with the bash script, then the configuration file Gama.ini isn't read. Instead, you should use the parameter -m, which will set the Eclipse parameter -Xmx, as follows:

$ bash gama-headless.sh -m 100g -batch batch_experiments_pheromone ../../VehicleClustering/CS_CityScope_GAMA/models/clustering.gaml

This is because the headless is started with Java "by hand" (not through a binary, as the GAMA GUI is) and we never thought about reading this file... But it might be a good default behavior; I will implement that too 🤔

Also, we do not recommend setting RAM per thread (with -Xss or -Xmn); let GAMA scale across the full RAM and all cores itself (this lets GAMA allocate RAM to threads dynamically when needed, etc.).

> Do you have any ideas on why this might be happening? Is Gama accumulating data that I should be clearing? I am saving everything that I need on .csv-s, so I wouldn't need any other info on the simulations.

I think it's because you didn't allow max memory properly (and it was limited at 4GB).
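
A quick, generic way to see which heap limit a JVM actually received is to ask the runtime. The small Java program below is not part of GAMA and does not inspect a running GAMA process; it only reports the limit of the JVM it is started in, so it is useful mainly for checking how a given -Xmx value is interpreted on the server. The class name is arbitrary.

```java
// Standalone check of the heap limit seen by the JVM that runs this class.
final class HeapLimitCheck {

    /** Maximum heap the JVM will attempt to use, in MiB (from Runtime.maxMemory()). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

Assuming Java 11 or later, it can be run directly from source, for example with java -Xmx100g HeapLimitCheck.java. The reported figure can be slightly below the requested value depending on the garbage collector.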

Also, if you want to make better use of the RAM over a long batch, you can add the facet keep_simulations: false to your experiment. By default, this facet is true, because it keeps finished simulations in RAM to allow drawing charts across all simulations; but as you said that you don't need to keep any traces, you can add this to let the garbage collector remove every finished simulation from memory ;)

NaroaCS commented 1 year ago

Thanks so much, @RoiArthurB !! :D I've just launched the simulations, so I'll need to wait for a day or so to see if it stays alive. I'll keep you posted!

NaroaCS commented 1 year ago

That was it! I'm not getting the memory error anymore! Thanks so much 😃

lesquoyb commented 1 year ago

I think that's enough for your initial problem. I'm closing this issue and will leave @RoiArthurB to open two new ones: one for the -hpc problem and one for reading parameters from the ini file.