Open nmustakin opened 7 months ago
@llvm/issue-subscribers-openmp
Author: None (nmustakin)
The second argument should probably not be zero here. => We cannot allocate 0 bytes.
The R&R (record and replay) functionality isn't really tested. It's a global object that's only initialized on a single device, so it's probably broken as well if you try to use more than one device.
This allocation error is caused directly by requesting 0 bytes of memory in the above-mentioned call.
PluginInterface --> WARNING VA mapping failed, fallback to heuristic: (Error: Memory Map Size must be larger than 0)
Which is then rejected by the CUDA Device Plugin here: https://github.com/llvm/llvm-project/blob/1e82d506b0b2b4b8501bb1cae13d2e2f3405922d/offload/plugins-nextgen/cuda/src/rtl.cpp#L670
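To illustrate the failure mode, here is a minimal sketch of a zero-size guard like the one the plugin applies. `checkMapSize` and `readSizeEnv` are hypothetical helpers written for this example, not functions from libomptarget or the CUDA plugin. The error text mirrors the warning quoted above. How the runtime actually parses and scales LIBOMPTARGET_RR_DEVMEM_SIZE is not shown here.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper (not part of libomptarget): read an unsigned size
// from an environment variable. Returns std::nullopt if the variable is
// unset, empty, or not a plain decimal number.
std::optional<std::uint64_t> readSizeEnv(const char *Name) {
  const char *Raw = std::getenv(Name);
  if (!Raw || *Raw == '\0')
    return std::nullopt;
  char *End = nullptr;
  errno = 0;
  unsigned long long Value = std::strtoull(Raw, &End, 10);
  if (errno != 0 || *End != '\0')
    return std::nullopt;
  return static_cast<std::uint64_t>(Value);
}

// Hypothetical guard: returns an error message for a zero-byte request
// and an empty string otherwise. The message mirrors the warning seen in
// the debug output; the real check lives in the plugin.
std::string checkMapSize(std::uint64_t RequestedBytes) {
  if (RequestedBytes == 0)
    return "Memory Map Size must be larger than 0";
  return "";
}
```

With a guard like this, a caller that passes a hard-coded 0 always hits the error path, whatever the environment variable says. That matches the behavior described in this issue.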
@llvm/issue-subscribers-offload
OpenMP offload recording is failing to allocate memory. It keeps requesting 0 bytes instead of the size set in LIBOMPTARGET_RR_DEVMEM_SIZE. For example, when running
LIBOMPTARGET_DEBUG=1 LIBOMPTARGET_RR_DEVMEM_SIZE=4 LIBOMPTARGET_RR_SAVE_OUTPUT=1 OMP_TARGET_OFFLOAD=mandatory LIBOMPTARGET_NEXTGEN_PLUGINS=1 LIBOMPTARGET_RECORD=1 nvprof ./lulesh
the output shows -as well as -
ending with only 1 out of 17 kernels being recorded