precice / dealii-adapter

A coupled structural solver written with the C++ finite element library deal.II
GNU Lesser General Public License v3.0

Enable shared memory parallelization and print some git information #36

Closed davidscn closed 3 years ago

davidscn commented 3 years ago

This was a stupid copy-paste bug: I accidentally disabled the shared memory parallelization.

I took the opportunity to print some more information to the screen.

davidscn commented 3 years ago

There is another problem with the SMP: the number of threads is determined automatically by TBB from the available system resources, i.e., it is set so that all of the cores in a node are spoken for. In our case, this is undesired most of the time, since some cores are needed for the other participants. Therefore, we need to set a thread limit. Any opinions on how to specify the limit? You can vote now for

* an environment variable DEAL_II_NUM_THREADS
* a parameter in the parameter file
* a command line argument

But note that the command line solution would be a bare number without any keyword in front of it.

uekerman commented 3 years ago

> There is another problem with the SMP: the number of threads is determined automatically by TBB from the available system resources, i.e., it is set so that all of the cores in a node are spoken for. In our case, this is undesired most of the time, since some cores are needed for the other participants. Therefore, we need to set a thread limit. Any opinions on how to specify the limit? You can vote now for
>
> * an environment variable DEAL_II_NUM_THREADS
> * a parameter in the parameter file
> * a command line argument
>
> But note that the command line solution would be a bare number without any keyword in front of it.

I vote for an environment variable. That's also what OpenMP uses, for example, so it is the behavior many users probably expect. Documentation is the most important thing here. The explicit output of the number of threads is already very good. Would the variable then be the "number of threads" or the "maximum number of threads"?

All in all, I don't see this case as so critical. On a cluster you typically run both participants on separate nodes. Some systems even enforce this (Hazelhen IIRC).
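As a rough illustration of the environment-variable option, the following sketch reads `DEAL_II_NUM_THREADS` with the C++ standard library only. The helper names `parse_thread_count` and `thread_count_from_environment` are hypothetical and are not taken from deal.II or the adapter, and the choice to reject malformed values rather than fall back silently is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <limits>
#include <optional>

// Hypothetical helper (not part of deal.II or the adapter): interpret the raw
// value of an environment variable as a positive thread count.
// Returns std::nullopt if the value is unset, empty, malformed, zero,
// or too large for unsigned int.
std::optional<unsigned int> parse_thread_count(const char *raw)
{
  // Require a leading digit so that signs and leading whitespace are rejected.
  if (raw == nullptr || !std::isdigit(static_cast<unsigned char>(*raw)))
    return std::nullopt;

  char *end = nullptr;
  const unsigned long value = std::strtoul(raw, &end, 10);
  assert(end != nullptr); // strtoul always sets the end pointer

  // Reject trailing characters such as "4x" and the meaningless value 0.
  if (*end != '\0' || value == 0)
    return std::nullopt;

  // Reject values that do not fit into unsigned int (includes overflow).
  if (value > std::numeric_limits<unsigned int>::max())
    return std::nullopt;

  return static_cast<unsigned int>(value);
}

// Read the variable named in this discussion from the process environment.
std::optional<unsigned int> thread_count_from_environment()
{
  return parse_thread_count(std::getenv("DEAL_II_NUM_THREADS"));
}
```

A caller could treat an empty result as "no limit requested" and keep the default determined by TBB; printing the resolved value, as the PR already does, keeps the behavior visible to the user.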

davidscn commented 3 years ago

> Would the variable then be the "number of threads" or the "maximum number of threads"?

Alright. In the current code, DEAL_II_NUM_THREADS is taken directly as the number of threads, so it can exceed the total number of cores, but I would change it to min(DEAL_II_NUM_THREADS, n_cores).

uekerman commented 3 years ago

> > Would the variable then be the "number of threads" or the "maximum number of threads"?
>
> Alright. In the current code, DEAL_II_NUM_THREADS is taken directly as the number of threads, so it can exceed the total number of cores, but I would change it to min(DEAL_II_NUM_THREADS, n_cores).

This I would not do. Better to make things explicit. Let's directly take DEAL_II_NUM_THREADS as the number of threads.

MakisH commented 3 years ago

I can also imagine that people want to try out whether increasing the number of threads further gives them any additional performance benefit. Since OpenMP also uses an environment variable, let's just take the value directly. It is more predictable.

davidscn commented 3 years ago

Alright, it's now explicit.
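The two semantics debated above can be contrasted in a small sketch. The enum `ThreadPolicy` and the function `resolve_thread_count` are hypothetical names introduced for illustration only; they do not reflect the adapter's actual code, which according to this thread simply takes the value as given.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical policy switch, for illustration only.
enum class ThreadPolicy
{
  explicit_count, // use the requested value as is (the option adopted above)
  clamp_to_cores  // min(requested, n_cores) (the alternative that was dropped)
};

// Resolve the number of threads to run from the requested value and the
// number of cores, according to the chosen policy.
unsigned int resolve_thread_count(const unsigned int requested,
                                  const unsigned int n_cores,
                                  const ThreadPolicy policy)
{
  assert(requested >= 1 && n_cores >= 1);

  if (policy == ThreadPolicy::clamp_to_cores)
    return std::min(requested, n_cores);

  // Explicit policy: oversubscription is allowed if the user asks for it.
  return requested;
}
```

With the explicit policy, a request for 16 threads on an 8-core node yields 16 threads, which lets users experiment with oversubscription; the clamped policy would silently reduce it to 8. If `n_cores` were taken from `std::thread::hardware_concurrency()`, note that this function may return 0 when the value cannot be determined, so a caller would have to handle that case first.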