flu-plus-plus / bachelorproef

Bachelor's project, Computer Science, Universiteit Antwerpen 2016–2017

Use Intel TBB as an alternative to OpenMP #39

Closed by jonathanvdc 7 years ago

jonathanvdc commented 7 years ago

This PR adds support for the Intel® Threading Building Blocks library. This is C++, so naturally I implemented everything using preprocessor magic. To spare mere mortals such as myself the pain of deciphering largely irrelevant #ifdefs in sim/Simulator.cpp, I created util/Parallel.h and stuffed all the preprocessor directives in there.

util/Parallel.h exposes the following interface:

/// Gets the number of threads that are available for parallelization.
unsigned int get_number_of_threads();

/// Tries to set the number of threads to the given value. A Boolean flag
/// is returned that specifies if the number of threads could be set
/// successfully.
bool try_set_number_of_threads(unsigned int number_of_threads);

/// Applies the given action to each element in the given list of values.
/// The action may be applied to up to num_threads elements simultaneously.
/// An action is a function object with signature `void(T&, unsigned int)`
/// where the first parameter is the value that the action takes and the second
/// parameter is the index of the thread it runs on.
template <typename T, typename TAction>
void parallel_for(std::vector<T>& values, unsigned int num_threads, const TAction& action);

/// Applies the given action to each element in the given list of values.
/// The action is not applied to multiple elements simultaneously.
/// An action is a function object with signature `void(T&, unsigned int)`
/// where the first parameter is the value that the action takes and the second
/// parameter is the index of the thread it runs on.
template <typename T, typename TAction>
void serial_for(std::vector<T>& values, const TAction& action);

At compile time, one of the following threading mechanisms is used to implement this interface: OpenMP, TBB or STL threads.

I threw in the STL threads implementation because I thought it might be interesting. It turned out to be competitive performance-wise with both OpenMP and TBB and does not have any dependencies other than the C++ standard library.
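To give a feel for why the STL variant needs nothing beyond the standard library, here is a minimal sketch of how `parallel_for` and `serial_for` could be built on `std::thread`. It is not the code from util/Parallel.h: the contiguous chunking strategy and the `square_all_and_sum` helper are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Illustrative sketch only, not the actual contents of util/Parallel.h.
// Assumption: the values are split into contiguous chunks, one per thread.
template <typename T, typename TAction>
void parallel_for(std::vector<T>& values, unsigned int num_threads, const TAction& action)
{
    if (values.empty())
        return;

    // Never use zero threads, and never more threads than elements.
    const std::size_t thread_count =
        std::min<std::size_t>(std::max(1u, num_threads), values.size());
    // Round up so that every element lands in some chunk.
    const std::size_t chunk_size = (values.size() + thread_count - 1) / thread_count;

    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    for (std::size_t t = 0; t < thread_count; ++t) {
        const std::size_t begin = t * chunk_size;
        const std::size_t end = std::min(begin + chunk_size, values.size());
        if (begin >= end)
            break; // Rounding up can leave the last threads without work.
        workers.emplace_back([&values, &action, begin, end, t] {
            for (std::size_t i = begin; i < end; ++i)
                action(values[i], static_cast<unsigned int>(t));
        });
    }
    for (auto& worker : workers)
        worker.join();
}

// Serial counterpart: everything runs on the calling thread, reported as index 0.
template <typename T, typename TAction>
void serial_for(std::vector<T>& values, const TAction& action)
{
    for (auto& value : values)
        action(value, 0u);
}

// Hypothetical usage: square every element in place and sum the results.
// Each thread writes only to its own slot in `partial`, selected by the
// thread index, so no locking is needed.
long long square_all_and_sum(std::vector<int>& values, unsigned int num_threads)
{
    std::vector<long long> partial(std::max(1u, num_threads), 0LL);
    parallel_for(values, num_threads, [&partial](int& v, unsigned int thread_index) {
        assert(thread_index < partial.size());
        v *= v;
        partial[thread_index] += v;
    });
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

The thread index passed to the action is what makes per-thread accumulators like `partial` possible without synchronization. On some toolchains this sketch may need `-pthread` to link.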

To let the user switch between threading mechanisms, I modified our Makefile to accept the STRIDE_THREADING_LIBRARY environment variable, which can be one of OpenMP, TBB, STL or none. If STRIDE_THREADING_LIBRARY is not set, the first available library from the following list is used: OpenMP, TBB, STL.
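How the chosen value reaches the C++ code is not spelled out here. One common pattern, sketched below, is for the Makefile to translate the variable into a preprocessor definition that the header then dispatches on. The macro names `PARALLEL_USE_OPENMP`, `PARALLEL_USE_TBB` and `PARALLEL_USE_NONE` and the function `threading_backend_name` are hypothetical and chosen for illustration; the actual names in util/Parallel.h may differ.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Hypothetical macros: assume the build passes at most one of them,
// e.g. -DPARALLEL_USE_TBB. With none defined, STL threads are used.
#if defined(PARALLEL_USE_OPENMP)
#include <omp.h>
#elif defined(PARALLEL_USE_TBB)
#include <tbb/task_arena.h>
#endif

/// Names the threading mechanism that was selected at compile time.
inline std::string threading_backend_name()
{
#if defined(PARALLEL_USE_OPENMP)
    return "OpenMP";
#elif defined(PARALLEL_USE_TBB)
    return "TBB";
#elif defined(PARALLEL_USE_NONE)
    return "none";
#else
    return "STL";
#endif
}

/// Gets the number of threads that are available for parallelization.
inline unsigned int get_number_of_threads()
{
#if defined(PARALLEL_USE_OPENMP)
    const int n = omp_get_max_threads();
    assert(n > 0);
    return static_cast<unsigned int>(n);
#elif defined(PARALLEL_USE_TBB)
    const int n = tbb::this_task_arena::max_concurrency();
    assert(n > 0);
    return static_cast<unsigned int>(n);
#elif defined(PARALLEL_USE_NONE)
    return 1;
#else
    // hardware_concurrency() may return 0 when the count is unknown.
    const unsigned int n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
#endif
}
```

Keeping every `#if` inside small functions like these means callers such as sim/Simulator.cpp never need to mention a specific threading library.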

Since clang doesn't seem to play ball with OpenMP, I created two clang Travis configs: the first (clang-4.0) uses STL threads for parallelization. The second (clang-4.0-tbb) uses TBB.