llvm / llvm-project

[OpenMP][NVPTX] nested std::array causes fatal error: error in backend #67919

Open tom91136 opened 1 year ago

tom91136 commented 1 year ago

Possibly related to https://github.com/llvm/llvm-project/issues/55361.

#include <array>
int main() {
#pragma omp target teams distribute parallel for
  for (int i = 0; i < 1; ++i) {
    [](int) {
      std::array<std::array<float, 1>, 1> transform = {};
      transform[0][0] = 0;
      for (int l = 0; l < 1; l++) {
      }
    }(i);
  }
  return 0;
}
> clang++ -std=c++17 -O2 -fopenmp=libomp --offload-arch=sm_60 --cuda-path=$CUDA_PATH this_file.cpp
 fatal error: error in backend: Cannot select: 0xa26b5e0: i64,ch = stacksave 0xaa2e0a0
 In function: __omp_offloading_26_35250d8_main_l3
 clang++: error: clang frontend command failed with exit code 70 (use -v to see invocation)
 clang version 18.0.0 (git@github.com:tom91136/llvm-project.git 96adadf8f7227f6543537056f27f98cb18bbe8ce)
 Target: x86_64-unknown-linux-gnu
 Thread model: posix
 InstalledDir: /home/tom/software/llvm-ompt/96adadf8f722/bin

If we change std::array<std::array<float, 1>, 1> transform = {}; to std::array<std::array<float, 1>, 2> transform = {};, the error goes away and the resulting binary seems to be OK. Both the empty for loop in the lambda and the placement of transform[0][0] = 0; play a role here.
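For reference, here is a minimal sketch of the variant described above. It assumes the only change from the reproducer is the outer array extent (1 becomes 2). The function name `run_variant` is hypothetical and only wraps the loop so it can be called from elsewhere; the reproducer places the loop directly in main().

```cpp
#include <array>

// Hypothetical wrapper; the original reproducer runs this loop inside main().
int run_variant() {
#pragma omp target teams distribute parallel for
  for (int i = 0; i < 1; ++i) {
    [](int) {
      // Outer extent is 2 instead of 1. Per the report, this variant does not
      // hit "Cannot select: ... stacksave" when offloading to NVPTX.
      std::array<std::array<float, 1>, 2> transform = {};
      transform[0][0] = 0;
      // The empty loop is kept as-is, since the report says it plays a role.
      for (int l = 0; l < 1; l++) {
      }
    }(i);
  }
  return 0;
}
```

Compiled with the same flags as the failing command, this should build according to the report. Without -fopenmp the pragma is ignored and the loop simply runs on the host.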

This is reduced from https://github.com/UoB-HPC/miniBUDE/blob/v2/src/std-indices/fasten.hpp, and using a different constant (>=8 in our case) on the transform array doesn't trigger the problem.

Issue is also reproducible on CUDA 11.8 targeting sm_35. Only NVPTX is affected; targeting gfx906 works as expected.

Diagnostics: diag.zip (https://github.com/llvm/llvm-project/files/12777430/diag.zip)

llvmbot commented 1 year ago

@llvm/issue-subscribers-openmp
