Open jack-dunham opened 7 months ago
This needs more from the CUDA library. MWE:

```julia
using CUDA, SparseArrays

N = 4
A = zeros(Float32, N, N)
A[:, 1] .= ones(Float32, N)
A = CUDA.CUSPARSE.CuSparseMatrixCSC(sparse(A))
false .* A[1, :]  # fails: broadcasting over a sparse CUDA row slice
```
Tracking in https://github.com/JuliaGPU/CUDA.jl/issues/2209
Hi there, apologies for opening this and then ghosting.

In response to https://github.com/JuliaGPU/CUDA.jl/issues/2209#issuecomment-1941992408 in the tracked issue: why are we necessarily trying to zero a vector of similar type to the noise matrix? For example, although the noise rate matrix may well be sparse, the vector of resulting noise increments may not be sparse. Would it not be more sensible to zero a vector of the same type as `u0`?

I am almost certainly misunderstanding something here.

Thanks in advance!
That's what's happening internally in the package, and that's what's failing.
Yeah, I don't understand why `rand_prototype` needs to be a vector of a similar type to `noise_rate_prototype`, because I don't understand what the purpose of `rand_prototype` is.
The offending code is this: https://github.com/SciML/StochasticDiffEq.jl/blob/870e062edd2582a4d79b6c0dabe64510c9860c07/src/solve.jl#L298-L303

Line 299 will evaluate to `false` if `u isa CuVector`. I am picturing a solution like this:
```julia
if noise_rate_prototype isa CuSparseMatrix
    rand_prototype = CUDA.zeros(randElType, size(noise_rate_prototype, 2))
end
```
With the important caveat that I don't know what `rand_prototype` does. I see that it is used in the construction of `WienerProcess` for certain algorithms and is unused otherwise. I gave up trying to figure out what its purpose is within `WienerProcess`!

Cheers.
> because I don't understand what the purpose of `rand_prototype` is.
`rand_prototype` is a prototype of what the random vector is supposed to be, i.e. its size, shape, and type. In theory this should always be dense, but we don't have a way of densifying it from the rows generically. You cannot use `CUDA.zeros` since it needs to be a generic type matching a row of `noise_rate_prototype`.
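A small CPU-side sketch of why deriving the prototype from a row doesn't give something dense: a row slice of a `SparseMatrixCSC` is itself a `SparseVector` (the CUSPARSE types behave analogously on GPU), so a prototype built from a row stays sparse and the dense version has to be constructed explicitly. This uses stdlib `SparseArrays` as a stand-in for the CUDA sparse types, purely for illustration:

```julia
using SparseArrays

# Sparse noise-rate matrix, standing in for a CuSparseMatrixCSC
W = sparse([1.0 0.0 0.0; 0.0 2.0 0.0])

row = W[1, :]               # slicing a row keeps the sparse representation
println(typeof(row))        # a SparseVector, not a dense Vector

# A dense prototype of the right length must be built separately
dense_proto = zeros(eltype(W), size(W, 2))
println(typeof(dense_proto))
```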
> since it needs to be a generic type matching a row of `noise_rate_prototype`.
I am curious why this is the case? Should it not match `typeof(u)` rather than the type of the object you get if you slice the noise matrix? Then, by inputting a dense `u::V` and a sparse `noise_rate_prototype::M`, the integrator calls the method `*(mat::M, vec::V)`. At least, this is how I assumed it would work from reading the docs.

If we zero a sparse CUDA array for the random vector, surely we then run into a similar problem of trying to add a `CuVector` to a `CuSparseVector`? Does this kernel exist in CUDA.jl?
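For what it's worth, the dispatch pattern described above does hold for the CPU sparse types: a sparse matrix times a dense vector hits a `*(mat, vec)` method and returns a dense vector. A minimal sketch with stdlib `SparseArrays` standing in for the CUSPARSE types (an analogy, not the GPU code path):

```julia
using SparseArrays

M = sparse([1.0 0.0; 0.0 2.0; 3.0 0.0])  # sparse noise-rate matrix (the M above)
v = [1.0, 2.0]                           # dense vector (the V above)

y = M * v          # sparse-times-dense multiply
println(typeof(y)) # the result comes back dense
println(y)
```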
It needs to match the length of the row, but the type only has to be compatible with computations on `u`. If you are doing automatic differentiation, `u` would be dual while `rand_prototype` would want to stay non-dual.
As a "most case" solution, `adapt(DiffEqBase.parameterless_type(u), zeros(randElType, size(noise_rate_prototype, 2)))` probably works for this case as well, so maybe that branch can just be removed.
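A CPU-only sketch of what that proposed line does: strip the container type from `u`, build dense zeros of the right length, and rewrap them in `u`'s container family. Here a toy `parameterless_type` stands in for `DiffEqBase.parameterless_type`, and a plain constructor call stands in for `adapt` (both stand-ins are assumptions for illustration; on GPU, `Adapt.adapt` would move the zeros to the right device array type):

```julia
using SparseArrays

# Toy stand-in for DiffEqBase.parameterless_type (assumption for illustration):
# recover the unparameterized container type, e.g. Vector{Float32} -> Array
parameterless_type(x) = Base.typename(typeof(x)).wrapper

u = zeros(Float32, 3)                    # dense state vector (a CuVector in the GPU case)
W = sparse([1.0 0.0; 0.0 2.0; 0.0 0.0])  # sparse noise-rate prototype
randElType = Float64

# Dense zeros with the row length of W, rewrapped in u's container family
rand_prototype = parameterless_type(u)(zeros(randElType, size(W, 2)))
println(typeof(rand_prototype))          # dense, independent of W's sparsity
```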
Describe the bug

I am unsure if this is a bug or simply a lack of support, but calling `solve` with a `noise_rate_prototype` of type `CuSparseMatrixCSC` fails when constructing the prototype for the noise increment (it would seem).

Expected behavior

Matrices from `CUDA.CUSPARSE` can be used for `noise_rate_prototype`.

Minimal Reproducible Example
Error & Stacktrace

Environment (please complete the following information):
- Output of `using Pkg; Pkg.status()`
- Output of `using Pkg; Pkg.status(; mode = PKGMODE_MANIFEST)`
- Output of `versioninfo()`

Additional context
- Output of `CUDA.versioninfo()`