JuliaORNL / JACC.jl

CPU/GPU parallel performance portable layer in Julia via functions as arguments
MIT License

AMDGPU test for JACC.BLAS fails #91

Open pedrovalerolara opened 3 months ago

pedrovalerolara commented 3 months ago

JACC.BLAS works well from a Julia terminal, but it fails when running the AMDGPU JACC.BLAS test (see output below). More work is needed. The JACC.BLAS module is now part of JACC, but the JACC.BLAS test code for the AMDGPU backend is commented out.

JACC.BLAS: Error During Test at /home/wfg/github-runners/cousteau-JACC/ci/_work/JACC.jl/JACC.jl/test/tests_amdgpu.jl:100
  Got exception outside of a @test
  GPU Kernel Exception
  Stacktrace:
    [1] error(s::String)
      @ Base ./error.jl:35
    [2] throw_if_exception(dev::AMDGPU.HIP.HIPDevice)
      @ AMDGPU ~/.julia/packages/AMDGPU/BhNdC/src/exception_handler.jl:123
    [3] synchronize(stm::AMDGPU.HIP.HIPStream; blocking::Bool, stop_hostcalls::Bool)
      @ AMDGPU ~/.julia/packages/AMDGPU/BhNdC/src/highlevel.jl:53
    [4] synchronize (repeats 2 times)
      @ ~/.julia/packages/AMDGPU/BhNdC/src/highlevel.jl:49 [inlined]
    [5] parallel_for(::Int64, ::typeof(JACC.BLAS._axpy), ::Float64, ::Vararg{Any})
      @ JACCAMDGPU ~/github-runners/cousteau-JACC/ci/_work/JACC.jl/JACC.jl/ext/JACCAMDGPU/JACCAMDGPU.jl:12
    [6] axpy(n::Int64, alpha::Float64, x::AMDGPU.ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}, y::AMDGPU.ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer})
      @ JACC.BLAS ~/github-runners/cousteau-JACC/ci/_work/JACC.jl/JACC.jl/src/JACCBLAS.jl:14
    [7] macro expansion
      @ ~/github-runners/cousteau-JACC/ci/_work/JACC.jl/JACC.jl/test/tests_amdgpu.jl:125 [inlined]
    [8] macro expansion
      @ /auto/software/swtree/ubuntu22.04/x86_64/julia/1.9.1/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
    [9] top-level scope
      @ ~/github-runners/cousteau-JACC/ci/_work/JACC.jl/JACC.jl/test/tests_amdgpu.jl:102
   [10] include(fname::String)
      @ Base.MainInclude ./client.jl:478
   [11] top-level scope
      @ ~/github-runners/cousteau-JACC/ci/_work/JACC.jl/JACC.jl/test/runtests.jl:15
   [12] include(fname::String)
      @ Base.MainInclude ./client.jl:478
   [13] top-level scope
      @ none:6
   [14] eval
      @ ./boot.jl:370 [inlined]
   [15] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:280
   [16] _start()
      @ Base ./client.jl:522
Test Summary: | Error  Total  Time
JACC.BLAS     |     1      1  1.9s

ygtangg commented 1 month ago

I am seeing the same exception, although not in the same test as Pedro's report. It occurs while implementing the Basic Hartree-Fock proxy application using JACC. The following error appears for any input with more than 4 atoms:

GPU Kernel Exception
  Stacktrace:
    [1] error(s::String)
      @ Base ./error.jl:35
    [2] throw_if_exception(dev::AMDGPU.HIP.HIPDevice)
      @ AMDGPU ~/.julia/packages/AMDGPU/gtxsf/src/exception_handler.jl:123
    [3] synchronize(stm::AMDGPU.HIP.HIPStream; blocking::Bool, stop_hostcalls::Bool)
      @ AMDGPU ~/.julia/packages/AMDGPU/gtxsf/src/highlevel.jl:53
    [4] synchronize (repeats 2 times)
      @ ~/.julia/packages/AMDGPU/gtxsf/src/highlevel.jl:49 [inlined]
    [5] parallel_for(::Int64, ::typeof(BasicHFProxy._jacc_kernel_threaded_atomix!), ::AMDGPU.ROCArray{Float64, 2, AMDGPU.Runtime.Mem.HIPBuffer}, ::Vararg{Any})
      @ JACCAMDGPU ~/.julia/packages/JACC/CPpH7/ext/JACCAMDGPU/JACCAMDGPU.jl:17
    [6] bhfp_jacc(inputfile::String; verbose::Bool)
      @ BasicHFProxy /autofs/nccsopen-svm1_home/y1e/BasicHFProxy.jl/src/jacc.jl:74
    [7] bhfp_jacc
      @ /autofs/nccsopen-svm1_home/y1e/BasicHFProxy.jl/src/jacc.jl:18 [inlined]
    [8] macro expansion
      @ /autofs/nccsopen-svm1_sw/odo/julia-1.10.4/share/julia/stdlib/v1.10/Test/src/Test.jl:669 [inlined]
    [9] macro expansion
      @ /autofs/nccsopen-svm1_home/y1e/BasicHFProxy.jl/test/runtests_jacc.jl:11 [inlined]
   [10] macro expansion
      @ /autofs/nccsopen-svm1_sw/odo/julia-1.10.4/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [11] macro expansion
      @ /autofs/nccsopen-svm1_home/y1e/BasicHFProxy.jl/test/runtests_jacc.jl:9 [inlined]
   [12] macro expansion
      @ /autofs/nccsopen-svm1_sw/odo/julia-1.10.4/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [13] top-level scope
      @ /autofs/nccsopen-svm1_home/y1e/BasicHFProxy.jl/test/runtests_jacc.jl:6

This is with AMDGPU v0.8.11.