ParRes / Kernels

This is a set of simple programs that can be used to explore the features of a parallel platform.
https://groups.google.com/forum/#!forum/parallel-research-kernels

Running stencil on the HIP platform fails with "Segmentation fault (core dumped)" #638

Closed. wangzy0327 closed this issue 1 year ago.

wangzy0327 commented 1 year ago

What type of issue is this?

Bug in the stencil code when it is run on the HIP platform

wzy@gxn1275:~/Kernels/Cxx11$ ./stencil-hip 1024 32
Parallel Research Kernels version 2.17
C++11/HIP Stencil execution on 2D grid
device name: AMD Instinct MI100
total global memory:     34342961152
max threads per block:   1024
max threads dim:         1024,1024,1024
max grid size:           2147483647,2147483647,2147483647
memory clock rate (KHz): 1200000
memory bus width (bits): 4096
Number of iterations = 1024
Grid size            = 32
Tile size            = 32
Type of stencil      = star
Radius of stencil    = 2
Segmentation fault (core dumped)


Where does this bug appear?

Linux Ubuntu 20.04

Operating system

Linux gxn1275 5.16.0 #1 SMP PREEMPT Tue Sep 20 02:10:11 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Compiler

hipcc --version

HIP version: 5.4.22804-474e8620
AMD clang version 15.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-5.4.3 23045 a29fe425c7b0e5aba97ed2f95f61fd5ecba68aed)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-5.4.3/llvm/bin

PRK build information

hipcc -g --offload-arch=gfx908:xnack+:sramecc- -DTCE_HIP -fno-gpu-rdc stencil-hip.cc -o stencil-hip

@jeffhammond

jeffhammond commented 1 year ago

Your build command doesn't work, so you must be modifying the source. Please build using make hip after setting make.defs properly.

~/PRK/Cxx11> hipcc -g --offload-arch=gfx908:xnack+:sramecc- -DTCE_HIP -fno-gpu-rdc stencil-hip.cc -o stencil-hip
stencil-hip.cc:87:56: error: use of undeclared identifier 'PRKVERSION'
  std::cout << "Parallel Research Kernels version " << PRKVERSION << std::endl;
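
For context on the error above: the source prints PRKVERSION without defining it, which suggests the macro is meant to be supplied by the build system. As an assumption not confirmed in this thread, the sketch below treats it as a -D define coming from make.defs. The fallback value and the helper prk_banner() are hypothetical and are not part of the PRK sources; they only illustrate why a hand-written hipcc command line that omits the define fails to compile.

```cpp
#include <sstream>
#include <string>

// Assumption: the PRK makefiles normally pass PRKVERSION with a -D flag.
// This fallback is hypothetical; it only keeps the sketch compilable
// when the define is missing, as in the hand-written command above.
#ifndef PRKVERSION
#define PRKVERSION "unknown"
#endif

// Hypothetical helper: builds the banner line printed at startup.
// Streaming the macro works whether it expands to a number or a string.
std::string prk_banner() {
    std::ostringstream out;
    out << "Parallel Research Kernels version " << PRKVERSION;
    return out.str();
}
```

Compiled with no define, prk_banner() reports the version as "unknown"; adding a flag such as -DPRKVERSION=2.17 to the compiler command line replaces the fallback. Building through make after setting up make.defs, as requested above, remains the supported route.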

jeffhammond commented 1 year ago

xnack+ is the problem on my end. Without it, I see this error:

"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"

jeffhammond commented 1 year ago

make.defs.hip is actually a good template for make.defs. Please use it.