USCqserver / OpenQuantumBase.jl

Abstract types and math operations for OpenQuantumTools.jl.
https://uscqserver.github.io/OpenQuantumTools.jl/stable/
MIT License

Optimizing Hamiltonian constructor for GPU acceleration #41

Open naezzell opened 3 years ago

naezzell commented 3 years ago

In a standard anneal, a user will call the standard_driver function (https://github.com/USCqserver/OpenQuantumBase.jl/blob/c567d618ad6b67da39ca424ea9e3db720ea3ea99/src/matrix_util.jl#L104-L110). This generates a matrix of type Array{Complex{Float64},2}. While we've shown that casting this to a CuArray, i.e. cu(standard_driver(n)), is sufficient for a speed-up, it is not optimal. Ideally, the GPU should only deal with Float32s and, perhaps even better, with real numbers only.
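
To make the precision question concrete, here is a minimal CPU-only sketch of the conversions being discussed. The function driver_sketch is a hypothetical stand-in written for this example (it assumes the driver is a sum of single-qubit sigma-x terms) and is not the library's standard_driver; the GPU transfer itself is left out so the snippet runs without CUDA.

```julia
using LinearAlgebra

# Hypothetical stand-in for a transverse-field driver: the sum of sigma-x
# acting on each qubit. NOT the library's standard_driver.
function driver_sketch(n::Integer)
    sx = ComplexF64[0 1; 1 0]
    id = Matrix{ComplexF64}(I, 2, 2)
    H = zeros(ComplexF64, 2^n, 2^n)
    for k in 1:n
        # sigma-x on qubit k, identity on every other qubit
        H .+= reduce(kron, [j == k ? sx : id for j in 1:n])
    end
    return H
end

H64 = driver_sketch(3)        # Matrix{ComplexF64}, same eltype as in the issue
H32 = ComplexF32.(H64)        # single precision, half the memory

# If every entry is real, the imaginary parts can be dropped entirely.
Hreal = isreal(H64) ? Float32.(real.(H64)) : H32
```

Either H32 or Hreal could then be moved to the device instead of H64. Whether the real-valued form actually pays off is still open, since the state vector and the -i factor in the Schrödinger equation stay complex.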

Furthermore, the DenseHamiltonian constructor performs "scalar operations" by indexing the m array (see https://github.com/USCqserver/OpenQuantumBase.jl/blob/c567d618ad6b67da39ca424ea9e3db720ea3ea99/src/hamiltonian/dense_hamiltonian.jl#L31-L48). Scalar indexing on the GPU can be disallowed with CUDA.allowscalar(false) (or CuArrays.allowscalar(false) in the older CuArrays.jl package), or something like this.

Questions / things to resolve:
1.) Does converting matrices to Array{Complex{Float32},2} before casting as CuArray help GPU performance? If so, add this support.
2.) Is there any speed to be gained by converting complex numbers to two real numbers instead of using the Complex type? Does CUDA handle that for us?
3.) Does CUDA.allowscalar(false) actually help us? If not, is there a way to remove scalar operations from the DenseHamiltonian constructor in the first place, so that scalar operations don't occur on the GPU?
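
As a partial data point for question 2, the sketch below shows how Julia lays out Complex{Float32} in memory on the host: each element is already a plain pair of Float32 values (real part, then imaginary part). It only illustrates the storage layout and says nothing about how fast a given GPU kernel is on complex versus real data.

```julia
# Complex{Float32} is a plain bits type holding two Float32 fields.
z = ComplexF32[1 + 2im, 3 - 4im]

# Reinterpreting the same memory as Float32 exposes the interleaved
# (re, im, re, im, ...) layout without copying.
flat = reinterpret(Float32, z)

println(isbitstype(ComplexF32))                    # plain-data element type?
println(sizeof(ComplexF32) == 2 * sizeof(Float32)) # two floats per element?
println(collect(flat))
```

So a manual split into two real arrays would change the layout from interleaved to separate, rather than change the amount of data. Whether that helps would have to be benchmarked.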

SuperElephant commented 3 years ago

I tried to disable scalar indexing in try_gpu_accel.jl (just by adding CUDA.allowscalar(false) at line 64). However, it turns out not to be as trivial as adding a single line: it gives the error below. I suppose the DenseHamiltonian function needs some changes before scalar indexing can be disabled.

ERROR: LoadError: scalar getindex is disallowed
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] assertscalar(::String) at /home1/chaoxian/.julia/packages/GPUArrays/uaFZh/src/host/indexing.jl:41
 [3] getindex at /home1/chaoxian/.julia/packages/GPUArrays/uaFZh/src/host/indexing.jl:96 [inlined]
 [4] macro expansion at /home1/chaoxian/.julia/packages/StaticArrays/l7lu2/src/convert.jl:46 [inlined]
 [5] unroll_tuple at /home1/chaoxian/.julia/packages/StaticArrays/l7lu2/src/convert.jl:43 [inlined]
 [6] _convert at /home1/chaoxian/.julia/packages/StaticArrays/l7lu2/src/convert.jl:35 [inlined]
 [7] convert at /home1/chaoxian/.julia/packages/StaticArrays/l7lu2/src/convert.jl:32 [inlined]
 [8] StaticArrays.SArray{Tuple{8,8},T,2,L} where L where T(::CuArray{Complex{Float64},2}) at /home1/chaoxian/.julia/packages/StaticArrays/l7lu2/src/convert.jl:7
 [9] (::OpenQuantumBase.var"#180#182"{Symbol,Tuple{Int64,Int64}})(::CuArray{Complex{Float64},2}) at /home1/chaoxian/.julia/packages/OpenQuantumBase/YkEiX/src/base_util.jl:0
 [10] iterate at ./generator.jl:47 [inlined]
 [11] collect(::Base.Generator{Array{CuArray{Complex{Float64},2},1},OpenQuantumBase.var"#180#182"{Symbol,Tuple{Int64,Int64}}}) at ./array.jl:665
 [12] DenseHamiltonian(::Array{Function,1}, ::Array{CuArray{Complex{Float64},2},1}; unit::Symbol, EIGS::typeof(EIGEN_DEFAULT)) at /home1/chaoxian/.julia/packages/OpenQuantumBase/YkEiX/src/hamiltonian/dense_hamiltonian.jl:41
 [13] anneal_spin_glass_gpu(::Int64, ::Int64) at /home1/chaoxian/final_project/accelqat/cuda/try_gpu_accel_ds.jl:58
 [14] top-level scope at ./util.jl:175
 [15] include(::Module, ::String) at ./Base.jl:377
 [16] exec_options(::Base.JLOptions) at ./client.jl:288
 [17] _start() at ./client.jl:484
in expression starting at /home1/chaoxian/final_project/accelqat/cuda/try_gpu_accel_ds.jl:67

As a reminder, the stack trace shows that line 41 of dense_hamiltonian.jl will need changes. There may be further changes needed beyond that, as mentioned by @naezzell.

https://github.com/USCqserver/OpenQuantumBase.jl/blob/c567d618ad6b67da39ca424ea9e3db720ea3ea99/src/hamiltonian/dense_hamiltonian.jl#L41
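
The stack trace shows a StaticArrays SArray being constructed from a CuArray, which reads the GPU array element by element. One possible direction, sketched below, is to dispatch on the array type so that only host matrices are converted to static arrays. The helper to_static_if_host and the size cutoff of 10 are hypothetical and chosen only for illustration; they are not part of OpenQuantumBase. The sketch assumes CUDA.jl and StaticArrays.jl are installed, and the GPU branch runs only when a device is usable.

```julia
using CUDA           # assumed installed; loading it does not require a GPU
using StaticArrays

# Hypothetical helper (not part of OpenQuantumBase): convert small host
# matrices to SMatrix, and pass every other array type through unchanged.
# The cutoff of 10 is an arbitrary value for this sketch.
to_static_if_host(m::Matrix) =
    size(m, 1) <= 10 ? SMatrix{size(m)...}(m) : m
to_static_if_host(m::AbstractMatrix) = m

CUDA.allowscalar(false)   # make any scalar indexing of GPU arrays an error

h_host = rand(ComplexF32, 8, 8)
h_static = to_static_if_host(h_host)      # SMatrix on the host

if CUDA.functional()
    h_dev = CuArray(h_host)
    h_kept = to_static_if_host(h_dev)     # stays a CuArray, no getindex calls
    h_scaled = 2f0 .* h_kept              # broadcasting runs on the device
end
```

This only avoids the conversion that fails in the trace. Any other place in the constructor or the solver that indexes the matrices one element at a time would still need the same treatment.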