NEMO has some helper routines containing loops that are called from multiple places in the call stack. For performance reasons I want to run them sometimes on the CPU and sometimes on the GPU; the choice is made higher up in the call chain.
One solution is using the OpenMP "if" clause with a global variable:
program test
    use omp_lib
    implicit none
    integer, parameter :: N = 1000
    real, dimension(N) :: data
    ! Global switch read by the "if" clause in helper()
    logical :: on_device = .true.

    call cpu_routine()
    call cpu_routine()
    call cpu_routine()
    call gpu_routine()
    call gpu_routine()
    call gpu_routine()

contains

    subroutine gpu_routine()
        ! on_device is left at .true., so helper() offloads
        call helper()
    end subroutine gpu_routine

    subroutine cpu_routine()
        ! Use the host device from here
        on_device = .false.
        call helper()
        ! Restore GPU device from here
        on_device = .true.
    end subroutine cpu_routine

    subroutine helper()
        integer :: i
        ! The target region runs on the host when on_device is .false.
        !$omp target if(on_device)
        !$omp loop
        do i = 1, N
            data(i) = i / 3.14
        end do
        !$omp end loop
        !$omp end target
    end subroutine helper

end program test
I also tried other approaches that don't need the global (e.g. setting and restoring the "omp_default_device"), because in reality the subroutines are in different translation units, but they didn't work: nvfortran does not create a host device that I can select dynamically. So unless somebody has an idea for doing the same without the global symbol, I will implement the "if" clause and put the global in a known location imported by all files.
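For reference, here is a minimal sketch of what "a known location imported by all files" could look like. The module names offload_control and helpers, the array name field, and the save/restore of the flag in the CPU caller are my own assumptions for illustration, not existing NEMO code; in practice each module would sit in its own file.

```fortran
! Hypothetical shared module: the only place the flag is defined.
module offload_control
    implicit none
    ! .true.  -> helpers offload to the GPU
    ! .false. -> helpers run on the host
    logical :: on_device = .true.
end module offload_control

! Hypothetical helper module, standing in for a separate translation unit.
module helpers
    implicit none
contains
    subroutine helper(field)
        use offload_control, only: on_device
        real, intent(inout) :: field(:)
        integer :: i
        ! Offload only when the caller has left the flag set
        !$omp target if(on_device) map(tofrom: field)
        !$omp loop
        do i = 1, size(field)
            field(i) = i / 3.14
        end do
        !$omp end loop
        !$omp end target
    end subroutine helper
end module helpers

program demo
    use offload_control, only: on_device
    use helpers, only: helper
    implicit none
    real :: field(1000)
    logical :: saved

    field = 0.0

    ! CPU caller: save the flag, force the host, then restore it
    saved = on_device
    on_device = .false.
    call helper(field)
    on_device = saved

    ! GPU caller: the flag keeps its default value
    call helper(field)

    ! Simple sanity check on one element, whichever device ran the loop
    if (abs(field(3) - 3 / 3.14) > 1.0e-6) error stop "unexpected result"
    print *, "ok"
end program demo
```

Saving and restoring the previous value, rather than hard-coding .true. afterwards, keeps nested CPU callers from accidentally switching offloading back on. The flag is still a plain global, so this assumes it is only changed from host code outside any parallel region.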