Closed shakedregev closed 11 months ago
I believe this is the only mention of Cholesky in the entire package and it looks like it's coded to work this way (just copy to CPU and factorize).
That's incorrect; cholesky on dense inputs works fine:
julia> using CUDA, LinearAlgebra
julia> CUDA.allowscalar(false)
julia> A = CUDA.rand(10, 10);
julia> A = A*A'+I;
julia> cholesky(A)
Cholesky{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}
U factor:
10×10 UpperTriangular{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}:
2.07792 1.04945 1.31206 1.20788 1.16905 1.25365 1.19039 0.792913 1.1288 1.07848
⋅ 1.794 0.943456 0.393756 0.655787 0.66578 0.537175 0.213479 0.737071 0.155549
⋅ ⋅ 1.36309 0.394237 0.370576 0.603764 0.570361 0.414203 0.541527 0.262899
⋅ ⋅ ⋅ 1.38618 -0.0552797 0.39563 0.190204 0.00359829 0.141378 0.322178
⋅ ⋅ ⋅ ⋅ 1.53461 0.564346 0.130046 0.114851 0.20222 0.332031
⋅ ⋅ ⋅ ⋅ ⋅ 1.54097 0.301906 0.191423 0.368203 0.00617357
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 1.21205 0.350676 0.424222 0.0375677
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 1.32711 0.19541 -0.0467459
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 1.36569 -0.0209019
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 1.1683
A = CuSparseMatrixCSR(Q)
cholA = cholesky(A)
What CUSOLVER/CUSPARSE API calls do you expect this to use?
There is a wrapper for solving a sparse system with Cholesky factorization here: https://github.com/JuliaGPU/CUDA.jl/blob/5c51766d0a9e7819ea79f314e37ed6a8a5d24369/lib/cusolver/sparse.jl#L102-L135
It seems that this solver does what `cholesky!` and `ldiv!` would do together, and hence the factorization part cannot be wrapped by the Julia interface separately. Also, according to the CUDA documentation here, there is no such thing as `cholesky!` for sparse matrices in their high-level interface.
Thanks, but this is not useful in my case. I'm not sure why, but NVIDIA removed the ability to keep your factorizations. The move is good for simplicity when you're only solving one system, but not in many applications such as optimization problems, or conjugate gradient with a preconditioner. Almost anything where you're iterating. Back when I wrote in C++, I just went and used older CUDA versions and they worked fine for me. The newer ones did not have what I needed. It seems (perhaps) that CUDA.jl followed the same move by NVIDIA.
I am solving a system `Ax = b`, where `A` is dense and is just given as an operator (a product and sum of sparse matrices). I am forming a sparse approximation of `A`, called `M`, which I want to factorize and solve systems with.
I need to be able to keep the factorization to solve multiple systems (because I am using this as a preconditioner in conjugate gradient). I also need the symbolic information, because I have multiple systems that look like `Ax = b` but the sparsity structure of `M` is constant.
Basically, I need to be able to do three separate things:
1. Take the symbolic factorization of a CSR matrix.
2. Given this symbolic factorization, compute a numeric factorization.
3. Solve systems with said numeric factorization.
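In CUDA.jl terms, those three stages might look roughly like the sketch below; every function name (`spcholesky_symbolic!`, etc.) is hypothetical and does not exist in the package — this just illustrates the separation being requested:

```julia
# Hypothetical sketch of a three-stage sparse Cholesky API.
# None of these functions exist in CUDA.jl.

using CUDA, CUDA.CUSPARSE, SparseArrays, LinearAlgebra

S = sprand(100, 100, 0.05)
M = CuSparseMatrixCSR(S + S' + 20I)   # SPD sparse matrix on the GPU

# 1. Symbolic factorization: analyse the sparsity pattern once
#    (would wrap cusolverSpXcsrcholAnalysis).
info = spcholesky_symbolic!(M)

# 2. Numeric factorization: redo whenever the values of M change
#    (would wrap cusolverSpDcsrcholFactor).
spcholesky_numeric!(info, M)

# 3. Solve: reuse the factorization for many right-hand sides,
#    e.g. once per conjugate-gradient iteration
#    (would wrap cusolverSpDcsrcholSolve).
b = CUDA.rand(Float64, 100)
x = similar(b)
spcholesky_solve!(x, info, b)
```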
This may or may not be useful, but for a matrix M with CSR representation (rows, cols, vals), the C++-style CUDA code for what I want to do is:
```c
// Symbolic factorization - happens once
csrcholInfo_t info = NULL;
cusolverSpCreateCsrcholInfo(&info);
cusolverSpXcsrcholAnalysis(handle_cusolver, M->n, nnzM, descrM, rows, cols, info);
size_t internalDataInBytes, workspaceInBytes;
cusolverSpDcsrcholBufferInfo(handle_cusolver, M->n, nnzM, descrM, vals, rows,
                             cols, info, &internalDataInBytes, &workspaceInBytes);
void* buffer_gpu = NULL;
cudaMalloc(&buffer_gpu, sizeof(char) * workspaceInBytes);

// Numerical factorization - happens for every different `Ax=b` system for the different `M`
cusolverSpDcsrcholFactor(
    handle_cusolver, M->n, nnzM, descrM, vals, rows, cols, info, buffer_gpu);
cusolverSpDcsrcholZeroPivot(handle_cusolver, info, tol, &singularity);

// Solve that happens every conjugate gradient iteration
cusolverSpDcsrcholSolve(handle_cusolver, M->n, x, b, info, buffer_gpu);
```
Actually, an incomplete factorization would probably suffice for my purposes, but `ic02` returns a matrix that I cannot make heads or tails of. It's not lower triangular or symmetric, so I am not sure how it incorporates the `LL^T` factors.
Still, I would argue that a full factorization should be doable; in many cases a full Cholesky factorization stays sparse.
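For what it's worth, the usual way an `ic02`-style incomplete factor is applied as a preconditioner is via two triangular solves. A sketch, assuming the CUSPARSE convention that the lower triangle of the returned matrix holds `L` with `M ≈ L*L'` (worth double-checking, given the confusing output noted above):

```julia
using CUDA, CUDA.CUSPARSE, SparseArrays, LinearAlgebra

S = sprand(Float64, 100, 100, 0.05)
M = CuSparseMatrixCSR(S + S' + 20I)   # symmetric positive definite

P = ic02(M)                 # incomplete Cholesky with zero fill-in
L = LowerTriangular(P)      # assumes L is stored in the lower triangle

# Applying the preconditioner z = (L*L')⁻¹ r inside CG:
r = CUDA.rand(Float64, 100)
y = L \ r                   # forward solve:  L y  = r
z = L' \ y                  # backward solve: L' z = y
```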
@maleadt, would you be willing to develop this functionality together so it can be part of the library? I can take care of the linear algebra; I just don't have experience with developing backend Julia functions (I do have experience with CUDA, though).
I'm totally unfamiliar with what you're trying to accomplish, so I can only offer high-level guidance. Since you know which API calls you want to perform: what we normally do first is build a slightly more generic wrapper around these APIs, e.g., https://github.com/JuliaGPU/CUDA.jl/blob/b3c6be4b3ec79d939a3e3bf4937a39147250a8a0/lib/cusolver/sparse.jl#L101-L135
Then, we use these wrappers to write methods that integrate with interfaces/stdlibs/packages like LinearAlgebra.jl, e.g., https://github.com/JuliaGPU/CUDA.jl/blob/b3c6be4b3ec79d939a3e3bf4937a39147250a8a0/lib/cusolver/linalg.jl#L481-L485
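To make that two-layer pattern concrete, a hypothetical LinearAlgebra-facing method could dispatch on the GPU sparse type and delegate to a low-level wrapper; every name below is invented for illustration and does not exist in CUDA.jl:

```julia
# Sketch only: `csrchol_factor` and `CuSparseCholesky` are invented
# stand-ins for a low-level wrapper around the cusolverSpXcsrchol*
# calls, in the spirit of the wrappers linked above.

using LinearAlgebra
using CUDA, CUDA.CUSPARSE

struct CuSparseCholesky{T}
    info::Ptr{Cvoid}          # opaque csrcholInfo_t handle
    buffer::CuVector{UInt8}   # workspace kept alive for later solves
end

# Layer 1: thin wrapper over the CUSOLVER calls (body omitted).
# csrchol_factor(A::CuSparseMatrixCSR{T}) -> CuSparseCholesky{T}

# Layer 2: integration with the LinearAlgebra interface.
LinearAlgebra.cholesky(A::CuSparseMatrixCSR) = csrchol_factor(A)
```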
@shakedregev Using `cusolverSpDcsrcholFactor` etc. for exposing sparse Cholesky factorizations and solves might be problematic, since they appear to be an undocumented, low-level API [1,2], and I cannot tell whether they will be stable or maintained.
They will likely not be supported, since the CUDA component of CHOLMOD from SuiteSparse seems to be the officially supported sparse Cholesky library for CUDA. I guess this would require depending on SuiteSparse.
@maleadt I would like to contribute, but I imagine this particular direction would be more involved because of the dependency on SuiteSparse. Would you be open to contributions in this direction, and to offering some high-level guidance?
Sure, happy to offer some guidance. I'd think that this functionality would need to be a part of SuiteSparse.jl, ideally as a package extension that gets loaded when CUDA.jl is available. We've already built some artifacts, https://github.com/JuliaPackaging/Yggdrasil/tree/master/S/SuiteSparse/SuiteSparse_GPU%405, but they aren't used.
@shakedregev I interfaced the low-level API for the sparse QR and sparse Cholesky decompositions in #2121.
@amontoison - Thank you!
Can this be closed then, or is there more to this issue?
CHOLMOD support for CUDA is very likely to be implemented as a package extension to the standard library / SuiteSparse (https://github.com/JuliaSparse/SparseArrays.jl/issues/443). So I think there is nothing more to this issue, and it can be closed as completed.
Describe the bug
When you factorize a GPU matrix, the output is on the CPU. I believe https://github.com/JuliaGPU/CUDA.jl/blob/6d2e9ab2f2a1bb1df7791bb30178e0fe956940a3/lib/cusolver/linalg.jl#L484 is the only mention of Cholesky in the entire package, and it looks like it's coded to work this way (just copy to CPU and factorize). This is not the desired behavior, as it makes the entire GPU operation pointless.
To reproduce
The Minimal Working Example (MWE) for this bug:
Output
Manifest.toml
``` # This file is machine-generated - editing it directly is not advised julia_version = "1.9.0-alpha1" manifest_format = "2.0" project_hash = "fe1c204c663e26329f68ccc2911a2d13920bea4e" [[deps.AbstractFFTs]] deps = ["ChainRulesCore", "LinearAlgebra"] git-tree-sha1 = "69f7020bd72f069c219b5e8c236c1fa90d2cb409" uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c" version = "1.2.1" [[deps.Adapt]] deps = ["LinearAlgebra"] git-tree-sha1 = "0310e08cb19f5da31d08341c6120c047598f5b9c" uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e" version = "3.5.0" [[deps.ArgTools]] uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f" version = "1.1.1" [[deps.Artifacts]] uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33" [[deps.BFloat16s]] deps = ["LinearAlgebra", "Printf", "Random", "Test"] git-tree-sha1 = "dbf84058d0a8cbbadee18d25cf606934b22d7c66" uuid = "ab4f0b2a-ad5b-11e8-123f-65d77653426b" version = "0.4.2" [[deps.Base64]] uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f" [[deps.CEnum]] git-tree-sha1 = "eb4cb44a499229b3b8426dcfb5dd85333951ff90" uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82" version = "0.4.2" [[deps.CUDA]] deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "CUDA_Driver_jll", "CUDA_Runtime_Discovery", "CUDA_Runtime_jll", "CompilerSupportLibraries_jll", "ExprTools", "GPUArrays", "GPUCompiler", "LLVM", "LazyArtifacts", "Libdl", "LinearAlgebra", "Logging", "Preferences", "Printf", "Random", "Random123", "RandomNumbers", "Reexport", "Requires", "SparseArrays", "SpecialFunctions"] git-tree-sha1 = "edff14c60784c8f7191a62a23b15a421185bc8a8" uuid = "052768ef-5323-5732-b1bb-66c8b64840ba" version = "4.0.1" [[deps.CUDA_Driver_jll]] deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg"] git-tree-sha1 = "75d7896d1ec079ef10d3aee8f3668c11354c03a1" uuid = "4ee394cb-3365-5eb0-8335-949819d2adfc" version = "0.2.0+0" [[deps.CUDA_Runtime_Discovery]] deps = ["Libdl"] git-tree-sha1 = "58dd8ec29f54f08c04b052d2c2fa6760b4f4b3a4" uuid = "1af6417a-86b4-443c-805f-a4643ffb695f" version = "0.1.1" 
[[deps.CUDA_Runtime_jll]] deps = ["Artifacts", "CUDA_Driver_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg", "TOML"] git-tree-sha1 = "d3e6ccd30f84936c1a3a53d622d85d7d3f9b9486" uuid = "76a88914-d11a-5bdc-97e0-2f5a05c973a2" version = "0.2.3+2" [[deps.ChainRulesCore]] deps = ["Compat", "LinearAlgebra", "SparseArrays"] git-tree-sha1 = "c6d890a52d2c4d55d326439580c3b8d0875a77d9" uuid = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" version = "1.15.7" [[deps.ChangesOfVariables]] deps = ["ChainRulesCore", "LinearAlgebra", "Test"] git-tree-sha1 = "844b061c104c408b24537482469400af6075aae4" uuid = "9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0" version = "0.1.5" [[deps.Compat]] deps = ["Dates", "LinearAlgebra", "UUIDs"] git-tree-sha1 = "61fdd77467a5c3ad071ef8277ac6bd6af7dd4c04" uuid = "34da2185-b29b-5c13-b0c7-acf172513d20" version = "4.6.0" [[deps.CompilerSupportLibraries_jll]] deps = ["Artifacts", "Libdl"] uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae" version = "1.0.1+0" [[deps.Dates]] deps = ["Printf"] uuid = "ade2ca70-3891-5945-98fb-dc099432e06a" [[deps.DocStringExtensions]] deps = ["LibGit2"] git-tree-sha1 = "2fb1e02f2b635d0845df5d7c167fec4dd739b00d" uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae" version = "0.9.3" [[deps.Downloads]] deps = ["ArgTools", "FileWatching", "LibCURL", "NetworkOptions"] uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6" version = "1.6.0" [[deps.ExprTools]] git-tree-sha1 = "56559bbef6ca5ea0c0818fa5c90320398a6fbf8d" uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04" version = "0.1.8" [[deps.FileIO]] deps = ["Pkg", "Requires", "UUIDs"] git-tree-sha1 = "7be5f99f7d15578798f338f5433b6c432ea8037b" uuid = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549" version = "1.16.0" [[deps.FileWatching]] uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee" [[deps.GPUArrays]] deps = ["Adapt", "GPUArraysCore", "LLVM", "LinearAlgebra", "Printf", "Random", "Reexport", "Serialization", "Statistics"] git-tree-sha1 = "4dfaff044eb2ce11a897fecd85538310e60b91e6" uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7" 
version = "8.6.2" [[deps.GPUArraysCore]] deps = ["Adapt"] git-tree-sha1 = "57f7cde02d7a53c9d1d28443b9f11ac5fbe7ebc9" uuid = "46192b85-c4d5-4398-a991-12ede77f4527" version = "0.1.3" [[deps.GPUCompiler]] deps = ["ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Logging", "TimerOutputs", "UUIDs"] git-tree-sha1 = "95185985a5d2388c6d0fedb06181ad4ddd40e0cb" uuid = "61eb1bfa-7361-4325-ad38-22787b887f55" version = "0.17.2" [[deps.InteractiveUtils]] deps = ["Markdown"] uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240" [[deps.InverseFunctions]] deps = ["Test"] git-tree-sha1 = "49510dfcb407e572524ba94aeae2fced1f3feb0f" uuid = "3587e190-3f89-42d0-90ee-14403ec27112" version = "0.1.8" [[deps.IrrationalConstants]] git-tree-sha1 = "7fd44fd4ff43fc60815f8e764c0f352b83c49151" uuid = "92d709cd-6900-40b7-9082-c6be49f344b6" version = "0.1.1" [[deps.JLLWrappers]] deps = ["Preferences"] git-tree-sha1 = "abc9885a7ca2052a736a600f7fa66209f96506e1" uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210" version = "1.4.1" [[deps.Krylov]] deps = ["LinearAlgebra", "Printf", "SparseArrays"] path = "/home/shakedregev/.julia/dev/Krylov" uuid = "ba0b0d4f-ebba-5204-a429-3ac8c609bfb7" version = "0.9.0" [[deps.LLVM]] deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Printf", "Unicode"] git-tree-sha1 = "df115c31f5c163697eede495918d8e85045c8f04" uuid = "929cbde3-209d-540e-8aea-75f648917ca0" version = "4.16.0" [[deps.LLVMExtra_jll]] deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg", "TOML"] git-tree-sha1 = "771bfe376249626d3ca12bcd58ba243d3f961576" uuid = "dad2f222-ce93-54a1-a47d-0025e8a3acab" version = "0.0.16+0" [[deps.LazyArtifacts]] deps = ["Artifacts", "Pkg"] uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3" [[deps.LibCURL]] deps = ["LibCURL_jll", "MozillaCACerts_jll"] uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21" version = "0.6.3" [[deps.LibCURL_jll]] deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"] uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0" version = "7.84.0+0" 
[[deps.LibGit2]] deps = ["Base64", "NetworkOptions", "Printf", "SHA"] uuid = "76f85450-5226-5b5a-8eaa-529ad045b433" [[deps.LibSSH2_jll]] deps = ["Artifacts", "Libdl", "MbedTLS_jll"] uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8" version = "1.10.2+0" [[deps.Libdl]] uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb" [[deps.LinearAlgebra]] deps = ["Libdl", "OpenBLAS_jll", "libblastrampoline_jll"] uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" [[deps.LogExpFunctions]] deps = ["ChainRulesCore", "ChangesOfVariables", "DocStringExtensions", "InverseFunctions", "IrrationalConstants", "LinearAlgebra"] git-tree-sha1 = "680e733c3a0a9cea9e935c8c2184aea6a63fa0b5" uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688" version = "0.3.21" [[deps.Logging]] uuid = "56ddb016-857b-54e1-b83d-db4d58db5568" [[deps.Markdown]] deps = ["Base64"] uuid = "d6f4376e-aef5-505a-96c1-9c027394607a" [[deps.MbedTLS_jll]] deps = ["Artifacts", "Libdl"] uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1" version = "2.28.0+0" [[deps.MozillaCACerts_jll]] uuid = "14a3606d-f60d-562e-9121-12d972cd8159" version = "2022.10.11" [[deps.NPZ]] deps = ["FileIO", "ZipFile"] git-tree-sha1 = "60a8e272fe0c5079363b28b0953831e2dd7b7e6f" uuid = "15e1cf62-19b3-5cfa-8e77-841668bca605" version = "0.4.3" [[deps.NetworkOptions]] uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908" version = "1.2.0" [[deps.OpenBLAS_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"] uuid = "4536629a-c528-5b80-bd46-f80d51c5b363" version = "0.3.21+0" [[deps.OpenLibm_jll]] deps = ["Artifacts", "Libdl"] uuid = "05823500-19ac-5b8b-9628-191a04bc5112" version = "0.8.1+0" [[deps.OpenSpecFun_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"] git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1" uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e" version = "0.5.5+0" [[deps.Pkg]] deps = ["Artifacts", "Dates", "Downloads", "FileWatching", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", 
"TOML", "Tar", "UUIDs", "p7zip_jll"] uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f" version = "1.8.0" [[deps.Preferences]] deps = ["TOML"] git-tree-sha1 = "47e5f437cc0e7ef2ce8406ce1e7e24d44915f88d" uuid = "21216c6a-2e73-6563-6e65-726566657250" version = "1.3.0" [[deps.Printf]] deps = ["Unicode"] uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7" [[deps.REPL]] deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"] uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb" [[deps.Random]] deps = ["SHA", "Serialization"] uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" [[deps.Random123]] deps = ["Random", "RandomNumbers"] git-tree-sha1 = "7a1a306b72cfa60634f03a911405f4e64d1b718b" uuid = "74087812-796a-5b5d-8853-05524746bad3" version = "1.6.0" [[deps.RandomNumbers]] deps = ["Random", "Requires"] git-tree-sha1 = "043da614cc7e95c703498a491e2c21f58a2b8111" uuid = "e6cf234a-135c-5ec9-84dd-332b85af5143" version = "1.5.3" [[deps.Reexport]] git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b" uuid = "189a3867-3050-52da-a836-e630ba90ab69" version = "1.2.2" [[deps.Requires]] deps = ["UUIDs"] git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7" uuid = "ae029012-a4dd-5104-9daa-d747884805df" version = "1.3.0" [[deps.SHA]] uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce" version = "0.7.0" [[deps.Serialization]] uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b" [[deps.Sockets]] uuid = "6462fe0b-24de-5631-8697-dd941f90decc" [[deps.SparseArrays]] deps = ["Libdl", "LinearAlgebra", "Random", "Serialization", "SuiteSparse_jll"] uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf" [[deps.SpecialFunctions]] deps = ["ChainRulesCore", "IrrationalConstants", "LogExpFunctions", "OpenLibm_jll", "OpenSpecFun_jll"] git-tree-sha1 = "d75bda01f8c31ebb72df80a46c88b25d1c79c56d" uuid = "276daf66-3868-5448-9aa4-cd146d93841b" version = "2.1.7" [[deps.Statistics]] deps = ["LinearAlgebra", "SparseArrays"] uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" version = "1.9.0" [[deps.SuiteSparse_jll]] deps = ["Artifacts", "Libdl", 
"Pkg", "libblastrampoline_jll"] uuid = "bea87d4a-7f5b-5778-9afe-8cc45184846c" version = "5.10.1+0" [[deps.TOML]] deps = ["Dates"] uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76" version = "1.0.3" [[deps.Tar]] deps = ["ArgTools", "SHA"] uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e" version = "1.10.0" [[deps.Test]] deps = ["InteractiveUtils", "Logging", "Random", "Serialization"] uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [[deps.TimerOutputs]] deps = ["ExprTools", "Printf"] git-tree-sha1 = "f2fd3f288dfc6f507b0c3a2eb3bac009251e548b" uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f" version = "0.5.22" [[deps.UUIDs]] deps = ["Random", "SHA"] uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4" [[deps.Unicode]] uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5" [[deps.ZipFile]] deps = ["Libdl", "Printf", "Zlib_jll"] git-tree-sha1 = "f492b7fe1698e623024e873244f10d89c95c340a" uuid = "a5390f91-8eb1-5f08-bee0-b1d1ffed6cea" version = "0.10.1" [[deps.Zlib_jll]] deps = ["Libdl"] uuid = "83775a58-1f1d-513f-b197-d71354ab007a" version = "1.2.13+0" [[deps.libblastrampoline_jll]] deps = ["Artifacts", "Libdl"] uuid = "8e850b90-86db-534c-a0d3-1478176c7d93" version = "5.2.0+0" [[deps.nghttp2_jll]] deps = ["Artifacts", "Libdl"] uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d" version = "1.48.0+0" [[deps.p7zip_jll]] deps = ["Artifacts", "Libdl"] uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0" version = "17.4.0+0" ```
Expected behavior
I expect the factorization to stay on the GPU and the solution to also be on the GPU, meaning `typeof(x) == typeof(b)`, and `typeof(cholA)` should be something GPU-related.
Version info
Details on Julia:
Details on CUDA: