bjarthur closed 12 hours ago
curious that `info` for `cublasSgetrfBatched` is a `CuPtr{Cint}` (code), whereas that for `cublasSgetrsBatched` is a `Ptr{Cint}` (code). that's the source of the problem above, but after fixing it i now get:
WARNING: Error while freeing DeviceMemory(12 bytes at 0x0000000402000600):
CUDA.CuError(code=CUDA.cudaError_enum(0x000002bc))
Thanks for the PR!
> curious that `info` for `cublasSgetrfBatched` is a `CuPtr{Cint}` (code), whereas that for `cublasSgetrsBatched` is a `Ptr{Cint}` (code).

According to https://docs.nvidia.com/cuda/cublas/index.html?highlight=cublasSgetrsBatched#cublas-t-getrsbatched, `info` is host memory, so it should be a CPU pointer.
ok, so `info` is a scalar, not a vector, and `B` needs to be 3D, not 2D. with those changes my MWE above works. will write some tests soon...
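
For readers following along, here is a minimal CPU-only sketch of what a batched solve with a 3D `B` computes. It is not the CUDA.jl or cuBLAS API: the names `As`, `B`, `X` and the sizes are made up for illustration, and the batch of matrices is assumed to be a plain vector of square matrices. Each slice `B[:, :, k]` holds the right-hand sides for the `k`-th matrix.

```julia
using LinearAlgebra

n, nrhs, batch = 3, 2, 4            # illustrative sizes only

# Hypothetical batch: diagonally shifted random matrices so every system is well conditioned.
As = [rand(n, n) + n * I for _ in 1:batch]

# One n-by-nrhs block of right-hand sides per batch entry, stored as a 3D array.
B = rand(n, nrhs, batch)
X = similar(B)

# Reference semantics: factorize each matrix, then solve against its own slice of B.
for k in 1:batch
    X[:, :, k] = lu(As[k]) \ B[:, :, k]
end
```

A result from a GPU batched solver could be copied back to the host and compared slice by slice against `X`.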
@bjarthur Ping me when the tests are ready.
@amontoison ready for review
i should note that i tried to add tests for no pivoting, but it is numerically unstable and so it was hard to make sure the output was correct. `getrf` does not test for `pivot=false` either.
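
To illustrate why unpivoted results are hard to check, here is a small CPU-only sketch using Julia's `LinearAlgebra` standard library rather than the cuBLAS wrappers discussed in this PR. The 2x2 matrix is a textbook-style example chosen for illustration, and `NoPivot()` assumes Julia 1.7 or later.

```julia
using LinearAlgebra

# A tiny leading entry: elimination without row swaps divides by 1e-20.
A = [1e-20 1.0;
     1.0   1.0]
b = [1.0, 2.0]                      # exact solution is very close to [1, 1]

x_pivot   = lu(A) \ b               # default: partial (row) pivoting
x_nopivot = lu(A, NoPivot()) \ b    # no pivoting (Julia 1.7+)

# Relative residuals show how far each answer is from solving A*x = b.
res_pivot   = norm(A * x_pivot - b) / norm(b)
res_nopivot = norm(A * x_nopivot - b) / norm(b)
```

With pivoting the residual is at rounding level, while the unpivoted factorization loses the first component of the solution entirely, so a test comparing against a reference would need matrices chosen to avoid small pivots.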
uses `getrf_batched` as a template.

also corrects README to indicate support for `spmv` and `spr`.

currently getting an error that i'm having trouble fixing: