Investigate with HPCToolkit when synchronization becomes an issue at larger problem sizes
Problem: I don't know HPCToolkit or how to use it to identify synchronization issues.
Done when:
I can show when synchronization becomes an issue in the following OpenMP programs, which certainly have some synchronization overhead.
Critical section
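#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

volatile int blah;  /* shared accumulator updated by every thread */

int main(){
  int T = 1000;   /* repetitions of the parallel loop */
  int N = 10000;  /* problem size */
  int* array = (int*) calloc(N, sizeof(int));  /* calloc takes (count, size) */
  for( int t = 0; t < T; ++t ){
    #pragma omp parallel for
    for( int i = 0; i < N; ++i ){
      /* every update serializes on the critical section */
      #pragma omp critical
      blah += array[i];
    }
  }
  free(array);
  return 0;
}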
Reduction
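#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

volatile int blah;  /* reduction target */

int main(){
  int T = 1000;   /* repetitions of the parallel loop */
  int N = 10000;  /* problem size */
  int* array = (int*) calloc(N, sizeof(int));  /* calloc takes (count, size) */
  for( int t = 0; t < T; ++t ){
    /* each thread accumulates privately; partial sums are combined at the end of the loop */
    #pragma omp parallel for reduction(+:blah)
    for( int i = 0; i < N; ++i ){
      blah += array[i];
    }
  }
  free(array);
  return 0;
}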
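Wall-clock sweep (sketch)
Both programs above fix N at 10000, so neither can show how the overhead changes with problem size. The sketch below is not HPCToolkit and does not replace a profile. It is only a rough wall-clock cross-check that runs the same two loops over a range of N. All function names (wall_seconds, sum_critical, sum_reduction, time_variant, run_sweep) are hypothetical, and the accumulator is a local variable rather than the global blah. Timing uses C11 timespec_get instead of omp_get_wtime, on the assumption that it is useful for the file to also build without OpenMP as a serial baseline.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock seconds from C11 timespec_get (an assumption made here so
   that the file does not depend on the OpenMP runtime for timing). */
double wall_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Same loop as the critical-section program: every update is serialized. */
long sum_critical(const int *array, int n) {
    long total = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        #pragma omp critical
        total += array[i];
    }
    return total;
}

/* Same loop as the reduction program: private partial sums, combined once. */
long sum_reduction(const int *array, int n) {
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += array[i];
    }
    return total;
}

/* Hypothetical helper: run one variant `reps` times, return elapsed seconds,
   and store the accumulated sum in *total_out so the work is observable. */
double time_variant(long (*variant)(const int *, int),
                    const int *array, int n, int reps, long *total_out) {
    double start = wall_seconds();
    long total = 0;
    for (int t = 0; t < reps; ++t) {
        total += variant(array, n);
    }
    *total_out = total;
    return wall_seconds() - start;
}

/* Hypothetical driver: double n from 1000 up to max_n and print both timings. */
void run_sweep(int max_n, int reps) {
    printf("%10s %14s %14s\n", "N", "critical (s)", "reduction (s)");
    for (long long n = 1000; n <= max_n; n *= 2) {
        int *array = calloc((size_t)n, sizeof *array);
        if (array == NULL) {
            perror("calloc");
            return;
        }
        long crit = 0, red = 0;
        double t_crit = time_variant(sum_critical, array, (int)n, reps, &crit);
        double t_red = time_variant(sum_reduction, array, (int)n, reps, &red);
        assert(crit == red);  /* both variants must compute the same sum */
        printf("%10lld %14.6f %14.6f\n", n, t_crit, t_red);
        free(array);
    }
}
```

To try it, call run_sweep from main, for example run_sweep(1 << 20, 100), and build with the compiler's OpenMP flag (-fopenmp with GCC). The N at which the two columns start to diverge is a reasonable place to point the profiler first.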
From April first email:
Problem: I don't know HPCToolkit or how to use it to identify synchronization issues.
Done when:
Critical section
Reduction