BoostGSoC13 / odeint-v2

odeint - parallelization GSoC project
http://headmyshoulder.github.com/odeint-v2/

OpenMP state #5

Open neapel opened 11 years ago

mariomulansky commented 11 years ago

To check my understanding: the OpenMP state is basically a collection of chunks of data, and the omp_algebra will then distribute the chunks across the OpenMP threads? For processing each chunk one then uses the algebra S that is given to the omp_algebra as a template parameter? If this is correct, I think one might not even need an openmp_operation.

neapel commented 11 years ago

I've implemented a simple algebra in 1725e45ac5a41b14ae0e that just uses a random access container like std::vector or array from multiple threads, but doesn't support a dispatcher. The system function is called once from the main thread and the user needs to take care of multi-threading there.

To get to a common parallel interface, I think the system function should transparently be called from multiple threads by odeint, with partial views of the state, so the user won't have to worry about parallelization; later, with MPI, it would look the same.

ddemidov commented 11 years ago

Special OpenMP state may be required in order to initialize the underlying memory properly. This is important for performance on NUMA systems: each OpenMP thread should be the first to touch its chunk of memory (see e.g. this presentation for an explanation). This should probably be handled by odeint's resizer implementation for the state.

headmyshoulder commented 11 years ago

and you don't even need a separate state type :)


mariomulansky commented 11 years ago

I do think an omp state is a good idea to have more control over the parallelization. How else could you specialize the resizer if you don't have an omp state type you can specialize with?

headmyshoulder commented 11 years ago


Ok, you are right. I think there are possibilities with SFINAE and enable_if magic, but this might be overkill :)


neapel commented 11 years ago

openmp_state<T> is now an alias for std::vector<std::vector<T>>; the number of splits can be forced by using the normal constructors to initialize a number of elements. The actual splitting happens in odeint::copy: copying from a vector<T> to an openmp_state<T> splits the data (if the openmp_state<T> has no size, the number of threads is used), and copying from an openmp_state<T> back to a vector<T> joins the data again.