uwhpsc-2016 / lectures

Notes, slides, and code from the in-class lectures.

Parallelization of code with many subroutines/nested subroutines #23

Open bmva opened 8 years ago

bmva commented 8 years ago

Will you be covering parallelization of code which calls many levels of subroutines?

For example, say within main I've got a function called initialize_arrays that calls many different functions in order to set the initial values of some arrays A, B, and C, and within those functions many other routines may be called. Say each function takes anywhere from 1 to N input values in order to perform its operations, and routine_1, routine_2, routine_3 are required to fill matrix A, where routine_1 calls routine_1a, routine_1b, routine_1c, etc. At what levels do the parallel commands need to be implemented?

I am curious how the privatization commands should be implemented when filling out the arrays.

Sorry for the extreme generalization; let me know if it is not clear.

Thanks!

bmva commented 8 years ago

After giving this more thought, I suppose most of the deep subroutines will all be writing to local copies, so there shouldn't be too much of an issue...and most of the code is 'embarrassingly parallel', but are there any items you'd note to particularly watch out for?

cswiercz commented 8 years ago

Will you be covering parallelization of code which calls many levels of subroutines?

In OpenMP each thread executes its subroutine calls on its own stack, so the local variables of those subroutines are automatically private to the calling thread, no matter how deeply the calls are nested. In that sense this is no different from having a bunch of other thread-specific, stack-allocated information.

However, these functions can still act on data that is shared amongst the different threads. Which data is private and which is shared is managed by the programmer.

When it comes to shared data, it may help to think of it this way: each thread gets its own copy of a pointer, but all of the copies point to the same location in memory.