Kokkos C++ Performance Portability Programming Ecosystem: Math Kernels - Provides BLAS, Sparse BLAS and Graph Kernels
blas(axpby): execution space instance semantics not honored #2434
Open
romintomasetti opened 1 day ago
Using `KokkosBlas::axpby`, I could easily get broken code if both of the following conditions are met:

The issue is clear when looking, e.g., at the following line: https://github.com/kokkos/kokkos-kernels/blob/b3a4bdf6973dceed7715c2fdc5f9499af54af2d8/blas/src/KokkosBlas1_axpby.hpp#L109

Adding an `exec_space.fence()` before fetching the value of the rank-0 view fixes it.

As a side note, I think there are potentially other issues, e.g. here https://github.com/kokkos/kokkos-kernels/blob/b3a4bdf6973dceed7715c2fdc5f9499af54af2d8/blas/src/KokkosBlas1_axpby.hpp#L137, where passing `exec_space` to `Kokkos::deep_copy` seems necessary.
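To illustrate why the fence matters, here is a minimal sketch in standard C++ only. It is not Kokkos code and not the contents of `KokkosBlas1_axpby.hpp`: `ToyExecSpace`, `ToyScalarView`, `read_without_fence` and `read_with_fence` are hypothetical names invented for this example. The toy instance defers all submitted work until `fence()` is called, which models the worst case of asynchronous execution in a deterministic way.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an execution space instance (NOT the Kokkos API):
// work is only queued by submit() and runs when fence() is called, which
// models the worst case of asynchronous execution deterministically.
class ToyExecSpace {
 public:
  void submit(std::function<void()> task) { queue_.push_back(std::move(task)); }

  // Run everything queued so far, in submission order.
  void fence() {
    for (auto& task : queue_) task();
    queue_.clear();
  }

 private:
  std::vector<std::function<void()>> queue_;
};

// Hypothetical stand-in for a rank-0 view: one scalar filled in by queued work.
struct ToyScalarView {
  double value = 0.0;
};

// Buggy pattern: the scalar is read while the work that fills it is pending.
inline double read_without_fence(ToyExecSpace& exec, ToyScalarView& alpha) {
  exec.submit([&alpha] { alpha.value = 2.0; });
  double result = alpha.value;  // stale: the queued write has not run yet
  exec.fence();                 // drain the queue before alpha goes out of use
  return result;
}

// Fixed pattern: fence the instance first, then read the scalar.
inline double read_with_fence(ToyExecSpace& exec, ToyScalarView& alpha) {
  exec.submit([&alpha] { alpha.value = 2.0; });
  exec.fence();
  return alpha.value;
}
```

In this toy model the unfenced read always sees the initial value. With a real asynchronous backend the outcome would instead depend on timing, which is what makes this kind of bug easy to miss.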