ROCm / hip-tests

[Issue]: Build issue for Nvidia platform #466

Open abagusetty opened 4 days ago

abagusetty commented 4 days ago

Problem Description

The hip-tests build fails on the Nvidia platform.

Tried: CUDA 12.6.0 and 12.6.1 with ROCm 6.2.4 (I couldn't select this ROCm version in the drop-down).

Operating System

SLES 15-SP5

CPU

AMD EPYC 7543 32-Core Processor

GPU

Nvidia A100

ROCm Version

ROCm 6.2.3

ROCm Component

HIP

Steps to Reproduce

/home/abagusetty/rocm/hip-tests/catch/./include/memcpy3d_tests_common.hh: In instantiation of ‘Memcpy3DDeviceToDeviceShell(F, hipStream_t)::<lambda(size_t, size_t, size_t)> [with bool should_synchronize = false; bool enable_peer_access = false; F = hipError_t (*)(std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, cudaExtent, cudaMemcpyKind, CUstream_st*); size_t = long unsigned int]’:
/home/abagusetty/rocm/hip-tests/catch/./include/memcpy3d_tests_common.hh:226:17:   required from ‘struct Memcpy3DDeviceToDeviceShell(F, hipStream_t) [with bool should_synchronize = false; bool enable_peer_access = false; F = hipError_t (*)(std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, cudaExtent, cudaMemcpyKind, CUstream_st*); hipStream_t = CUstream_st*]::<lambda(size_t, size_t, size_t)>’
/home/abagusetty/rocm/hip-tests/catch/./include/memcpy3d_tests_common.hh:226:12:   required from ‘void Memcpy3DDeviceToDeviceShell(F, hipStream_t) [with bool should_synchronize = false; bool enable_peer_access = false; F = hipError_t (*)(std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, cudaExtent, cudaMemcpyKind, CUstream_st*); hipStream_t = CUstream_st*]’
/home/abagusetty/rocm/hip-tests/catch/unit/graph/hipGraphAddMemcpyNode.cc:82:75:   required from here
/home/abagusetty/rocm/hip-tests/catch/./include/memcpy3d_tests_common.hh:227:16: error: ‘__closure’ is not a constant expression
/home/abagusetty/rocm/hip-tests/catch/./include/utils.hh:90:1: error: ‘void PitchedMemoryVerify(T*, size_t, size_t, size_t, size_t, F) [with T = int; F = Memcpy3DDeviceToHostShell(F, hipStream_t) [with bool should_synchronize = false; F = hipError_t (*)(std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, cudaExtent, cudaMemcpyKind, CUstream_st*); hipStream_t = CUstream_st*]::<lambda(size_t, size_t, size_t)>; size_t = long unsigned int]’, declared using local type ‘Memcpy3DDeviceToHostShell(F, hipStream_t) [with bool should_synchronize = false; F = hipError_t (*)(std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, std::variant<cudaPitchedPtr, cudaArray*>, cudaPos, cudaExtent, cudaMemcpyKind, CUstream_st*); hipStream_t = CUstream_st*]::<lambda(size_t, size_t, size_t)>’, is used but never defined [-fpermissive]
 void PitchedMemoryVerify(T* const ptr, const size_t pitch, const size_t width, const size_t height,
 ^~~~~~~~~~~~~~~~~~~

Compile verbose output:

cd /home/abagusetty/rocm/hip-tests/build/catch_tests/unit/graph && /soft/compilers/rocm/6.2.4/clr-install//bin/hipcc -DKERNELS_PATH=\"/home/abagusetty/rocm/hip-tests/catch/kernels/\" -I/home/abagusetty/rocm/hip-tests/catch/external/Catch2 -I/home/abagusetty/rocm/hip-tests/catch/./include -I/home/abagusetty/rocm/hip-tests/catch/./kernels -I/soft/compilers/rocm/6.2.4/clr-install/include -I/home/abagusetty/rocm/hip-tests/catch/external/picojson --std=c++17 --extended-lambda -MD -MT catch_tests/unit/graph/CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o -MF CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o.d -o CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o -c /home/abagusetty/rocm/hip-tests/catch/unit/graph/hipGraphAddMemcpyNode.cc

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

Output of hipconfig --full

HIP version: 6.2.41134-65d174c3e

==hipconfig
HIP_PATH           :/soft/compilers/rocm/6.2.4/clr-install
ROCM_PATH          :/soft/compilers/rocm/6.2.4/clr-install
HIP_COMPILER       :nvcc
HIP_PLATFORM       :nvidia
HIP_RUNTIME        :cuda
CPP_CONFIG         : -D__HIP_PLATFORM_NVCC__= -D__HIP_PLATFORM_NVIDIA__= -I/soft/compilers/rocm/6.2.4/clr-install/include -I/soft/compilers/cudatoolkit/cuda-12.6.0/include

== nvcc
CUDA_PATH          :/soft/compilers/cudatoolkit/cuda-12.6.0
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:34:21_PDT_2024
Cuda compilation tools, release 12.6, V12.6.20
Build cuda_12.6.r12.6/compiler.34431801_0

== Environment Variables
PATH =/soft/xalt/3.0.2-202408282050/bin:/soft/compilers/cudatoolkit/cuda-12.6.0/bin:/soft/compilers/rocm/6.2.4/clr-install/bin:/opt/cray/pals/1.3.4/bin:/opt/cray/pe/mpich/8.1.28/ofi/gnu/12.3/bin:/opt/cray/pe/mpich/8.1.28/bin:/opt/cray/pe/craype/2.7.30/bin:/soft/spack/base/0.7.1/install/linux-sles15-x86_64/gcc-12.3.0/cmake-3.27.7-a435jtzvweeos2es6enirbxdjdqhqgdp/bin:/soft/spack/base/0.7.1/install/linux-sles15-x86_64/gcc-12.3.0/curl-8.4.0-2ztev25qvydhabvu4nbkrtn4opcvw5nl/bin:/soft/spack/base/0.7.1/install/linux-sles15-x86_64/gcc-12.3.0/nghttp2-1.57.0-ciat5hufbwpozo6vmqgxanucn2zwu6z4/bin:/soft/perftools/darshan/darshan-3.4.4/bin:/opt/cray/pe/perftools/23.12.0/bin:/opt/cray/pe/papi/7.0.1.2/bin:/opt/cray/libfabric/1.15.2.0/bin:/opt/clmgr/sbin:/opt/clmgr/bin:/opt/sgi/sbin:/opt/sgi/bin:/home/abagusetty/.local/bin:/usr/local/bin:/usr/bin:/bin:/opt/c3/bin:/dbhome/db2cat/sqllib/bin:/dbhome/db2cat/sqllib/adm:/dbhome/db2cat/sqllib/misc:/dbhome/db2cat/sqllib/gskit/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:/opt/pbs/bin:/sbin:/opt/cray/pe/bin
LD_LIBRARY_PATH=/soft/compilers/cudatoolkit/cuda-12.6.0/lib64:/soft/compilers/rocm/6.2.4/clr-install/lib:/soft/spack/base/0.7.1/install/linux-sles15-x86_64/gcc-12.3.0/curl-8.4.0-2ztev25qvydhabvu4nbkrtn4opcvw5nl/lib:/soft/spack/base/0.7.1/install/linux-sles15-x86_64/gcc-12.3.0/nghttp2-1.57.0-ciat5hufbwpozo6vmqgxanucn2zwu6z4/lib:/soft/perftools/darshan/darshan-3.4.4/lib:/opt/cray/pe/papi/7.0.1.2/lib64:/opt/cray/libfabric/1.15.2.0/lib64:/dbhome/db2cat/sqllib/lib64:/dbhome/db2cat/sqllib/lib64/gskit:/dbhome/db2cat/sqllib/lib32
HIP_PLATFORM=nvidia
CUDA_PATH=/soft/compilers/cudatoolkit/cuda-12.6.0
HIP_RUNTIME=cuda
CUDA_HOME=/soft/compilers/cudatoolkit/cuda-12.6.0
HIP_OTHER=/home/abagusetty/rocm/hipother
HIP_DIR=/home/abagusetty/rocm/hip
HIP_COMPILER=nvcc

== Linux Kernel
Hostname      :sirius-uan-0002
Linux sirius-uan-0002 5.14.21-150500.55.49-default #1 SMP PREEMPT_DYNAMIC Sun Feb 11 17:48:15 UTC 2024 (36baf2f) x86_64 x86_64 x86_64 GNU/Linux

ppanchad-amd commented 3 days ago

Hi @abagusetty. An internal ticket has been created to investigate your issue. Thanks!

zichguan-amd commented 1 day ago

Hi @abagusetty, unfortunately I can't reproduce this issue with ROCm 6.2.4 and either cuda_12.0.r12.0/compiler.32267302_0 or cuda_12.6.r12.6/compiler.34841621_0. Can you upgrade to the latest CUDA 12.6?
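
The build strings compared above come from the `Build` line of `nvcc --version`, as shown in the hipconfig output earlier in the thread. A minimal shell sketch for extracting that string so two machines can be compared; `cuda_build_id` is a hypothetical helper name, not part of CUDA or ROCm:

```shell
# Hypothetical helper: read `nvcc --version` output on stdin and
# print only the build string from the line that starts with "Build ".
cuda_build_id() {
  sed -n 's/^Build //p'
}

# Usage (assumes nvcc is on PATH):
#   nvcc --version | cuda_build_id
```

Running this on both the reporter's and the maintainer's machines gives one line each that can be compared directly.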

abagusetty commented 1 day ago

Hi @zichguan-amd, I just tried again with ROCm 6.2.4 and CUDA 12.6.1 (Build cuda_12.6.r12.6/compiler.34714021_0). Here is the configure step that I left out above:

cmake ../catch -DHIP_COMPILER=nvcc -DHIP_PLATFORM=nvidia -DHIP_RUNTIME=cuda -DHIP_PATH=/soft/compilers/rocm/6.2.4/clr-install

Attached are the CMake configure and build logs: cmake.log, build.log

zichguan-amd commented 1 day ago

I see you are on a Cray machine, and I suspect this is an environment/setup issue. Can you build the failing test directly using hipcc with -v and HIPCC_VERBOSE=1 and check at which stage it fails?

cd /home/abagusetty/rocm/hip-tests/build/catch_tests/unit/graph && HIPCC_VERBOSE=1 /soft/compilers/rocm/6.2.4/clr-install//bin/hipcc -v -DKERNELS_PATH=\"/home/abagusetty/rocm/hip-tests/catch/kernels/\" -I/home/abagusetty/rocm/hip-tests/catch/external/Catch2 -I/home/abagusetty/rocm/hip-tests/catch/./include -I/home/abagusetty/rocm/hip-tests/catch/./kernels -I/soft/compilers/rocm/6.2.4/clr-install/include -I/home/abagusetty/rocm/hip-tests/catch/external/picojson --std=c++17 --extended-lambda -MD -MT catch_tests/unit/graph/CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o -MF CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o.d -o CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o -c /home/abagusetty/rocm/hip-tests/catch/unit/graph/hipGraphAddMemcpyNode.cc

It should just invoke nvcc, like this:

/usr/local/cuda/bin/nvcc  -Wno-deprecated-gpu-targets  -isystem /usr/local/cuda/include -isystem "/opt/rocm-6.2.4/include" -x cu  -v -DKERNELS_PATH=\"/home/rocm/hip-tests/catch/kernels/\" -I/home/rocm/hip-tests/catch/external/Catch2 -I/home/rocm/hip-tests/catch/./include -I/home/rocm/hip-tests/catch/./kernels -I/opt/rocm/include -I/home/rocm/hip-tests/catch/external/picojson --std=c++17 --extended-lambda -MD -MT catch_tests/unit/graph/CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o -MF CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o.d -o "CMakeFiles/GraphsTest1.dir/hipGraphAddMemcpyNode.cc.o" -c /home/rocm/hip-tests/catch/unit/graph/hipGraphAddMemcpyNode.cc

Then try compiling directly with nvcc to see whether that works.
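
One way to see which stage fails in the verbose output is to save it to a file and look for the last sub-command printed before the first error. The sketch below assumes that nvcc's verbose mode prefixes each sub-command it runs with `#$ `; `last_stage_before_error` and `build-verbose.log` are hypothetical names used only for illustration:

```shell
# Hypothetical helper: print the last "#$ "-prefixed sub-command that
# appears before the first line containing "error:" in a saved log.
# Assumption: verbose nvcc output marks each sub-command with "#$ ".
last_stage_before_error() {
  awk '
    /^#\$ /  { stage = $0 }          # remember the most recent sub-command
    /error:/ { print stage; exit }   # stop at the first error line
  ' "$1"
}

# Usage: append "> build-verbose.log 2>&1" to the hipcc or nvcc command
# above, then run:
#   last_stage_before_error build-verbose.log
```

If the printed command is a host-compiler invocation rather than a CUDA front-end step, that narrows the failure to the host side of the toolchain.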