QuantumBFS / CuYao.jl

CUDA extension for Yao.jl
https://yaoquantum.org

BUG when increasing `nbatch`. #15

Closed frankwswang closed 1 year ago

frankwswang commented 5 years ago
julia> using CuYao

julia> c4 = concentrate(12, chain(4, chain(4, chain(4, chain(4, put(4, 2=>Ry(pi)))))), 8:11)
nqubits: 12
Concentrator: (8, 9, 10, 11)
└─ chain
   └─ chain
      └─ chain
         └─ chain
            └─ put on (2)
               └─ rot(Y gate, 3.141592653589793)

julia> c5 = concentrate(12, chain(4, chain(4, chain(4, chain(4, control(4, 4, (2,3)=>ConstGate.SWAP))))), 8:11)
nqubits: 12
Concentrator: (8, 9, 10, 11)
└─ chain
   └─ chain
      └─ chain
         └─ chain
            └─ control(4)
               └─ (2, 3) SWAP gate

julia> m = 4095
4095

julia> rand_state(12, nbatch = m) |> cu |> c4
ArrayReg{2000, Complex{Float64}, CuArray...}
    active qubits: 12/12

julia> rand_state(12, nbatch = m) |> cu |> c5
ArrayReg{2000, Complex{Float64}, CuArray...}
    active qubits: 12/12

julia> m2 = 4096
4096

julia> rand_state(12, nbatch = m2) |> cu |> c4
ERROR: CUDA error: invalid argument (code #1, ERROR_INVALID_VALUE)

julia> rand_state(12, nbatch = m2) |> cu |> c5
ERROR: CUDA error: invalid argument (code #1, ERROR_INVALID_VALUE)
GiggleLiu commented 5 years ago

Could you please show the memory usage of your GPU?

GiggleLiu commented 5 years ago

Confirmed as a bug: it launches too many blocks, namely 2^8*4096.

It could be fixed by modifying the function that decides the thread and block counts, for example:

@inline function CuYao.cudiv(x::Int, y::Int)
    max_threads = 512                            # cap on threads per block
    threads_x = min(max_threads, x)
    threads_y = min(max_threads ÷ threads_x, y)  # spend the remaining thread budget on y
    threads = (threads_x, threads_y)
    blocks = ceil.(Int, (x, y) ./ threads)       # enough blocks to cover (x, y)
    threads, blocks
end
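
To see what this function returns, here is a standalone sketch of the same arithmetic that runs without a GPU or CuYao. The name `launch_config` is hypothetical and only stands in for `CuYao.cudiv`, and the input `(2^8, 4096)` is an assumption chosen to match the numbers quoted above; the shape CuYao actually passes in this case is not confirmed here.

```julia
# Hypothetical stand-in for CuYao.cudiv, copied so it runs in plain Julia.
function launch_config(x::Int, y::Int; max_threads::Int = 512)
    threads_x = min(max_threads, x)
    # Whatever is left of the per-block thread budget goes to the y dimension.
    threads_y = min(max_threads ÷ threads_x, y)
    threads = (threads_x, threads_y)
    # Round up so the grid covers every (x, y) index.
    blocks = ceil.(Int, (x, y) ./ threads)
    return threads, blocks
end

# Assumed illustrative input: 2^8 entries along x, a batch of 4096 along y.
threads, blocks = launch_config(2^8, 4096)
println("threads = ", threads, ", blocks = ", blocks)
# prints: threads = (256, 2), blocks = (1, 2048)
```

With these inputs the block count is 1 * 2048 = 2048 rather than 2^8*4096, because the work is spread over two grid dimensions and each block handles 512 indices.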

I will look into it and fix this issue once and for all.

Thanks for your report.

GiggleLiu commented 1 year ago

This problem no longer seems to exist.