kokkos / kokkos-kernels

Kokkos C++ Performance Portability Programming Ecosystem: Math Kernels - Provides BLAS, Sparse BLAS and Graph Kernels

Setting cross-over parameters in kokkoskernels. #404

Open kyungjoo-kim opened 5 years ago

kyungjoo-kim commented 5 years ago

KokkosKernels provides native implementations and interfaces to third-party libraries (TPLs). For a certain problem range the native implementations are fast, while TPLs may work better for another problem range. For the level 1 operations, which are typically used in many places in Trilinos and applications, we might need to consider how we choose implementations. For more complicated situations, we can also seek a more generic way to characterize architectures and optimization strategies.

This is an umbrella issue to discuss and record potential activities regarding this matter.
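To make the cross-over idea above concrete, here is a minimal sketch of a size-based dispatch for a dot product. It is not KokkosKernels code: `CrossoverParams`, `dot_min_tpl_length`, `choose_dot_path`, and `dot_tpl_standin` are hypothetical names, the threshold value is an arbitrary placeholder that would have to be measured per architecture, and the "TPL" path is a stand-in loop so the example stays self-contained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tuning knob: vectors shorter than this use the native path,
// vectors at or above it use the TPL path. The default is a placeholder,
// not a measured value.
struct CrossoverParams {
  std::size_t dot_min_tpl_length = 4096;
};

enum class DotPath { Native, Tpl };

// Pick an implementation from the problem size alone.
inline DotPath choose_dot_path(std::size_t n, const CrossoverParams& p) {
  return n < p.dot_min_tpl_length ? DotPath::Native : DotPath::Tpl;
}

// Native implementation: a plain loop. Assumes x and y have equal length.
inline double dot_native(const std::vector<double>& x,
                         const std::vector<double>& y) {
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) sum += x[i] * y[i];
  return sum;
}

// Stand-in for a third-party call (a real build might call a vendor BLAS
// here). It reuses the same loop so the sketch has no external dependency.
inline double dot_tpl_standin(const std::vector<double>& x,
                              const std::vector<double>& y) {
  return dot_native(x, y);
}

// Dispatch on the cross-over parameter.
inline double dot(const std::vector<double>& x, const std::vector<double>& y,
                  const CrossoverParams& p) {
  return choose_dot_path(x.size(), p) == DotPath::Native
             ? dot_native(x, y)
             : dot_tpl_standin(x, y);
}
```

Keeping the threshold in a small struct rather than a hard-coded constant is one way to let it be set per architecture or at run time, which is the kind of parameterization this issue is meant to discuss.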

ndellingwood commented 5 years ago

Cross-referencing PR #283, a first pass at looking into this for dot, abs, and spmv.

mhoemmen commented 5 years ago

I'm not sure whether we should rely on TPLs for BLAS 1 stuff. TPLs may be faster, but if we write the code ourselves, we have the option to make the implementations safer (with respect to overflow -- see e.g., the reference BLAS' implementation of DNRM2) and more deterministic.
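As an illustration of the overflow point, the following is a rough sketch of a scaled sum-of-squares 2-norm, in the spirit of the loop used by older versions of the reference DNRM2. It is not the reference code and not a KokkosKernels routine; the function names `nrm2_naive` and `nrm2_scaled` are made up for this example, and it is serial and unoptimized.

```cpp
#include <cmath>
#include <vector>

// Naive 2-norm: squares each entry directly, so it overflows to infinity
// once v * v exceeds the largest finite double, even if the true norm fits.
inline double nrm2_naive(const std::vector<double>& x) {
  double sum = 0.0;
  for (double v : x) sum += v * v;
  return std::sqrt(sum);
}

// Scaled 2-norm: keeps the invariant  sum of squares == scale^2 * ssq,
// where scale is the largest magnitude seen so far. Only ratios <= 1 are
// squared, so intermediate values stay in range.
inline double nrm2_scaled(const std::vector<double>& x) {
  double scale = 0.0;
  double ssq = 1.0;
  for (double v : x) {
    if (v != 0.0) {
      const double a = std::fabs(v);
      if (scale < a) {
        // New largest entry: rescale the accumulated sum to the new scale.
        const double r = scale / a;
        ssq = 1.0 + ssq * r * r;
        scale = a;
      } else {
        const double r = a / scale;
        ssq += r * r;
      }
    }
  }
  return scale * std::sqrt(ssq);
}
```

For a vector such as `{1e200, 1e200}`, the naive version overflows while the scaled version returns a finite value close to `1e200 * sqrt(2)`. The cost is a division and a branch per element, which is part of the speed versus safety trade-off raised above.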