Closed: fnrizzi closed this issue 3 years ago.
@fnrizzi @MikolajZuzek
spmv computes y <- beta*y + alpha*A^{M}*x, where the mode M is N, C, T, or H (similar to *gemv).
Focusing on the N and C cases, the computational routine is spmv_beta_no_transpose in the file KokkosSparse_spmv_impl.hpp. A template-based switch selects between the different Kokkos backends (Serial, OpenMP, GPU). The same applies to the transpose / Hermitian case.
If cuSPARSE is enabled, there is also a switch in KokkosSparse_spmv.hpp.
In the branch for PR #1, I have been looking at the no-transpose case and have added the template-based switch.
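For reference, here is a minimal sketch of how spmv is invoked for the different modes; the wrapper function, scalar values, and generic matrix/vector types below are illustrative and not taken from this thread:

```cpp
#include <Kokkos_Core.hpp>
#include <KokkosSparse_spmv.hpp>

// Assumes a KokkosSparse::CrsMatrix A and rank-1 Views x, y already exist;
// the function name and template parameters are placeholders for illustration.
template <class Matrix, class XView, class YView>
void apply_spmv(const Matrix& A, const XView& x, const YView& y) {
  const double alpha = 1.0;
  const double beta  = 0.0;
  // "N": y = beta*y + alpha*A*x (no transpose)
  KokkosSparse::spmv("N", alpha, A, x, beta, y);
  // "T": transpose, "C": conjugate, "H": conjugate transpose
  KokkosSparse::spmv("T", alpha, A, x, beta, y);
}
```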
@uhetmaniuk ok, yes thanks! That is exactly what I was referring to.
One question about the internal views created for x, y, for example:
// XVector_Internal: a const-value, unmanaged, random-access view type wrapping the user's x
typedef Kokkos::View<
typename XVector::const_value_type*,
typename KokkosKernels::Impl::GetUnifiedLayout<XVector>::array_layout,
typename XVector::device_type,
Kokkos::MemoryTraits<Kokkos::Unmanaged|Kokkos::RandomAccess> > XVector_Internal;
XVector_Internal x_i = x;  // shallow view assignment; no data is copied
Do you know why this has to be done? @MikolajZuzek offered to look into some of the details to understand the reasons behind some of these choices.
Just to clarify, I know this is work in progress, so I am not pressuring anything. Just asking questions :)
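As context for the question, here is a minimal standalone sketch (my own example, not code from the library) of the view-assignment pattern in question: a managed View can be assigned to a const, Unmanaged, RandomAccess view of compatible type, and only the value type and access traits change, not the data:

```cpp
#include <Kokkos_Core.hpp>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    Kokkos::View<double*> x("x", 100);

    // Same data as x, but const-value, unmanaged (no reference counting),
    // and flagged RandomAccess (may map to texture/ldg loads on CUDA).
    Kokkos::View<const double*, Kokkos::DefaultExecutionSpace::device_type,
                 Kokkos::MemoryTraits<Kokkos::Unmanaged | Kokkos::RandomAccess> >
        x_i = x;

    (void)x_i;  // x_i aliases x; no allocation or deep copy happened
  }
  Kokkos::finalize();
  return 0;
}
```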
(The internal views are set up in KokkosSparse_spmv.hpp; I do not know why it is necessary.)
My starting point was examples/wiki/sparse/KokkosSparse_wiki_spmv.cpp. I have gone through several iterations. Right now I am trying to follow the steps in spmv_beta_no_transpose (line 316 of KokkosSparse_spmv_impl.hpp). I have been comparing timings for the serial case against spmv (line 69 of the file KokkosSparse_spmv.hpp).
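A minimal sketch of the kind of serial timing comparison described here; the repetition count, warm-up, and generic types are assumptions for illustration, not details from the thread:

```cpp
#include <cstdio>
#include <Kokkos_Core.hpp>
#include <Kokkos_Timer.hpp>
#include <KokkosSparse_spmv.hpp>

// Times repeated spmv calls on an already-built sparse matrix A and Views x, y.
// Matrix/vector types are left generic; only the timing pattern is shown.
template <class Matrix, class XView, class YView>
double time_spmv(const Matrix& A, const XView& x, const YView& y, const int nrepeat) {
  // Warm-up call so first-touch and setup costs are not included in the timing.
  KokkosSparse::spmv("N", 1.0, A, x, 0.0, y);
  Kokkos::fence();

  Kokkos::Timer timer;
  for (int r = 0; r < nrepeat; ++r) {
    KokkosSparse::spmv("N", 1.0, A, x, 0.0, y);
  }
  Kokkos::fence();

  const double avg = timer.seconds() / nrepeat;
  std::printf("average spmv time: %e s\n", avg);
  return avg;
}
```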
@fnrizzi @uhetmaniuk
I looked over the KokkosSparse::spmv() implementation structure and specializations; here is an overview of the calls going from the user interface (orange) down to the raw implementations (cyan):
(editable version: kk-spmv.plantuml.txt)
Summary:

- The mode can be NoTranspose, Conjugate, Transpose, or ConjugateTranspose.
- There are special-cased paths for alpha / beta = 0 / 1 / -1 / other.
- There is a raw OpenMP path, spmv_raw_openmp_no_transpose().
- The TPL path can be turned off (control param); it also gets omitted if the selected mode (e.g. Conjugate) is not supported by the current (old) library version (see src/sparse/KokkosSparse_spmv.hpp, lines 155-177).

@MikolajZuzek this looks great!
@uhetmaniuk I was speaking with @MikolajZuzek and was wondering when the various specializations are triggered. Did you already figure this out for spmv? I think that, concurrently with writing an implementation for the block spmv, we also need a tentative plan/design for when to select the various impls. For example, Luc said that even if we might be able to use some already available CUDA library, he wants a basic implementation using CUDA that does not rely on external libraries. So one question is: when are the various specializations activated, and under what conditions?
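As a rough illustration of the kind of conditions typically involved, here is a sketch of the general pattern; the helper name and the specific checks are illustrative assumptions, not the actual dispatch code (which lives in KokkosSparse_spmv.hpp and the TPL specialization files):

```cpp
#include <type_traits>
#include <Kokkos_Core.hpp>

// Sketch only: illustrates when a cuSPARSE TPL specialization could be eligible.
template <class AMatrix>
bool would_consider_cusparse_path(const char mode[]) {
#if defined(KOKKOSKERNELS_ENABLE_TPL_CUSPARSE) && defined(KOKKOS_ENABLE_CUDA)
  // A TPL path is normally only considered when:
  //  * the TPL was enabled at configure time (macro above),
  //  * the matrix/vector data live in a memory space the TPL can access,
  //  * scalar/ordinal/offset types match what the TPL supports,
  //  * the requested mode is supported by the installed library version.
  const bool cuda_data =
      std::is_same<typename AMatrix::memory_space, Kokkos::CudaSpace>::value;
  const bool mode_ok = (mode[0] == 'N') || (mode[0] == 'T');  // illustrative check
  return cuda_data && mode_ok;
#else
  (void)mode;
  return false;  // fall back to the native Kokkos Kernels implementation
#endif
}
```

If I read the summary above correctly, the control param can additionally force the native implementation at runtime even when a TPL path would otherwise qualify.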