Calling runtime calls in GPU target device region fails

Reporting a bug

[x] I have tried using the latest released version of Numba (most recent is visible in the change log (https://github.com/numba/numba/blob/master/CHANGE_LOG)).
[x] I have included a self contained code sample to reproduce the problem. i.e. it's possible to run as 'python bug.py'.

We import OpenMP runtime calls using CFFI. This works for host code, where the OpenMP runtime library is loaded dynamically. On the GPU device, however, the OpenMP runtime library is statically linked bitcode, and using the same import mechanism inside a target region fails with:

.egg/numba/core/base.py", line 1114, in add_dynamic_addr
    assert self.allow_dynamic_globals, "dyn globals disabled in this target"
AssertionError: Failed in nopython mode pipeline (step: Handle with contexts)
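
One possible reading of this assertion, offered as an assumption rather than a confirmed diagnosis: a function imported from a dynamically loaded library is known to the compiler only as an absolute address in the host process, and such an address cannot be embedded in device code. The sketch below illustrates just that host-side behaviour. It is not the attached reproducer: it uses the standard-library ctypes module instead of CFFI and the C library's abs instead of an OpenMP runtime call, and resolve_host_symbol is a hypothetical helper name.

```python
import ctypes
import ctypes.util


def resolve_host_symbol(name="abs"):
    """Return (callable, absolute address) for a C symbol loaded at run time.

    Hypothetical helper for illustration only; assumes a POSIX-like host.
    """
    # find_library may return None; on POSIX, CDLL(None) then falls back to
    # the symbols already loaded into the running process.
    lib = ctypes.CDLL(ctypes.util.find_library("c"))
    fn = getattr(lib, name)
    fn.argtypes = [ctypes.c_int]
    fn.restype = ctypes.c_int
    # The address is meaningful only inside this host process.
    addr = ctypes.cast(fn, ctypes.c_void_p).value
    return fn, addr


fn, addr = resolve_host_symbol()
print(f"abs resolved at {addr:#x}; abs(-3) = {fn(-3)}")
```

The printed address changes from run to run and from machine to machine, which is why a target that links its runtime statically as bitcode would need the call resolved by symbol name at link time instead.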

If we don't import the runtime calls, it fails with a typing error instead:

.egg/numba/core/dispatcher.py", line 423, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: Handle with contexts)

I'm attaching examples for both cases: target-runtime-call.py.txt (with the import) and target-runtime-call-noimport.py.txt (without it).

Python-for-HPC / numbaWithOpenmp

Calling runtime calls in GPU target device region fails #4
