NexGenAnalytics / kokkos-kernels

Kokkos C++ Performance Portability Programming EcoSystem: Math Kernels - Provides BLAS, Sparse BLAS and Graph Kernels

task1: implementations #2

Open fnrizzi opened 2 years ago

fnrizzi commented 2 years ago

Scope

The objective is to enable code like the following:

Kokkos::View<double**> A;
Kokkos::parallel_for(100, 
  KOKKOS_LAMBDA(int i)
  {
    auto Aslice = subview(A...);

    KokkosBlas::SerialGemv(... );
    KokkosBlas::SerialGemm(... );
    KokkosBlas::SerialScal(... );

    KokkosBlas::ThreadVectorGemv(... );
    KokkosBlas::ThreadVectorGemm(... );
  });

The key thing is to have these implemented inside the KokkosBlas namespace.
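To make the calling pattern in the snippet above concrete, here is a minimal, library-free sketch in plain C++. It is not the KokkosBlas API: `MatrixSlice`, `serial_gemv` and `batched_gemv` are hypothetical names introduced only for illustration, and an ordinary `for` loop stands in for the parallel dispatch. The point it tries to show is that a "serial" kernel has no internal parallelism, so each iteration of the outer loop can call it independently on its own slice.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2-D subview: a non-owning, row-major slice.
// This is NOT a Kokkos::View.
struct MatrixSlice {
  const double* data;
  std::size_t rows;
  std::size_t cols;
  double operator()(std::size_t i, std::size_t j) const {
    return data[i * cols + j];
  }
};

// Hypothetical serial kernel: y = alpha * A * x + beta * y.
// It runs entirely in the calling thread, with no internal parallelism.
inline void serial_gemv(double alpha, const MatrixSlice& A, const double* x,
                        double beta, double* y) {
  for (std::size_t i = 0; i < A.rows; ++i) {
    double acc = 0.0;
    for (std::size_t j = 0; j < A.cols; ++j) {
      acc += A(i, j) * x[j];
    }
    y[i] = alpha * acc + beta * y[i];
  }
}

// Hypothetical driver: the loop over b stands in for the parallel loop.
// Each iteration slices out its own matrix and vectors, then calls the
// serial kernel on them.
inline void batched_gemv(std::size_t batch, std::size_t rows, std::size_t cols,
                         double alpha, const std::vector<double>& A,
                         const std::vector<double>& x, double beta,
                         std::vector<double>& y) {
  for (std::size_t b = 0; b < batch; ++b) {
    MatrixSlice Ab{A.data() + b * rows * cols, rows, cols};
    serial_gemv(alpha, Ab, x.data() + b * cols, beta, y.data() + b * rows);
  }
}
```

Because the iterations touch disjoint slices of `y`, they are independent of one another, which is what allows the outer loop to be replaced by a parallel one.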

IMPORTANT: TeamVector from SOW actually means ThreadVector or Vector (something that can run under TeamThreadRange) - not to be confused with TeamVectorRange.

Note: "special" implementations, such as the atomic-updating GEMM in BSR SpGEMM, are not in the scope of this task.

Update any relevant documentation. Where exactly? (e.g. the Wiki)

Status

| | Serial | Team / TeamVector | ThreadVector |
|---|---|---|---|
| a) GEMV | ✅ Done #1433 | ✅ Done #1435 | ⏳ Queued for review #1556 |
| b) GEMM | ✔ In review #1519 | ✔ In review #1519 | ⏳ Queued for review #1556 |
| c) Scal | ✅ Done #1448 | ✅ Done #1448 | ⏳ Queued for review #1556 |

mzuzek commented 2 years ago

@lucbv @fnrizzi

PR Status

To activate the PR links, click or Ctrl+click the chart to open it in a new tab/window.


[src](https://www.plantuml.com/plantuml/uml/jLDDRvmm5BpxLpoPynQowROgXofgLVLGvHRb4gtG2pp0nH-MRMZwxulPB908zRB9PP36OsRUsBxn0LtebMI-4zgYGmNEzBegZTFauB3YVOkL-IDq0tl679K1TIC9-CD36uBrUPes8hJzuuVIgBGpNMVyysVNaTCaVMfxATEhmzKNI28UZ_3pn-rj3ieC1C40vx-c3TyaUiuve8U6b2gczIJB4BFiCklz7fA89-lx76hn12rXHSwV_OPY98jdCjhMzQMsrzF4ryl4ZceiWt4G76flZISeZNeK6eCmsY-ZduL558vfwlhoD8SCaeZd4ZRIB5syHu2FOu28Snr2MKcAT03Exw9Fddyt-lVT76Tr6SfnuOrw-0H9nECnshQEiBgdE2ldIwBLWOH9CaoIDqhm6PA8Sn6-jzQu0BLneCGJLU1BbEWW_BLqIZ7HnObUpeCjtUnsNxVk3jlWO177DyQsJlm_lzlMldf1x4LjKJ57tLmQNXkz73YgrjlO3UcgL_8V) ---- Legend ![img](https://www.plantuml.com/plantuml/png/LSyn3i8m38NXFQTuIY-G8tHWOKDT695BdH3HEfNZKd5xQ5dAQEkJVxcjHchBD3gdV3UID39ynpIy4Oj2-PLvzQ0AtPTD2366SDHdER8ijz-qKQ8_EfEo69eOjnkRzgSYZ7unh9GjIzghnHwik7JLMK7Fc5WJ3uXPg_bZ-bhqYIoOjIJbMhrpBSlm_US3) [src](http://www.plantuml.com/plantuml/uml/LSyn3i8m38NXlQUmqWla21rOs50NHkJIbn3HEfNZKd5xKbXAfjZwf5_kYh7QEa-DucuiSGAv7Zo-AMGbnIjXSyIbVlTbwtq8tX_rMOuioUqtxPJvZowi4ACsrEVzhjqf2sXOZJPEbYH-gzaUKBhqfR5C0sY6HemKaXPpn_GrwxCMQT4S5QMufeECzPeF)

Merged PRs