alpaka-group / vikunja

Vikunja is a performance-portable algorithm library that defines functions operating on ranges of elements for a variety of purposes. It supports execution on multi-core CPUs and various GPUs. Vikunja uses alpaka to implement platform-independent primitives such as reduce or transform.
https://vikunja.readthedocs.io/en/latest/
Mozilla Public License 2.0

add executor object #76

Open SimeonEhrig opened 2 years ago

SimeonEhrig commented 2 years ago

The executor should be a container object that holds all types and objects required to launch a vikunja algorithm. Its purpose is to simplify the API.

Possible implementation for the basics:

executor.hpp

template<typename TAcc, typename TDev, typename TQueue>
class Executor {
public:
     // accelerator type, exposed so algorithms can query it
     using Acc = TAcc;

     TDev dev;
     TQueue queue;

     Executor(TDev dev, TQueue queue) : dev(dev), queue(queue) {}
};

reduce.hpp

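// fully compatible with alpaka types
// TBuffer and TFunc are placeholders for the input buffer and functor types
template<typename TAcc, typename TQueue, typename TBuffer, typename TFunc>
auto reduce(TQueue queue, TBuffer input, int sum, TFunc functor){
   // actual reduce algorithm
}

// convenience overload: takes the accelerator type and the queue from the executor
template<typename TAcc, typename TDev, typename TQueue, typename TBuffer, typename TFunc>
auto reduce(Executor<TAcc, TDev, TQueue> executor, TBuffer input, int sum, TFunc functor){
  return reduce<TAcc>(executor.queue, input, sum, functor);
}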

main.cpp


// ...

reduce(executor, input, sum, reduce_func);
// instead of
// reduce<Acc>(queue, input, sum, reduce_func);

The complete definition and properties are not fixed yet. It should still be possible to use the full functionality of alpaka directly. The executor object should only simplify the default case.
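The forwarding idea can be tried out without alpaka. The following is only a minimal sketch under stated assumptions: `MockAcc`, `MockDev`, `MockQueue`, the `sketch` namespace and `run_example` are hypothetical stand-ins invented for illustration, not alpaka or vikunja API, and the "kernel launch" is just a sequential loop on the host. It shows that the executor overload and the explicit `reduce<Acc>(queue, ...)` overload can coexist, so the low-level path stays available.

```cpp
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-ins for alpaka's accelerator, device and queue types.
struct MockAcc {};
struct MockDev { int id = 0; };
struct MockQueue { int launches = 0; };

// Bundles the types and objects needed to launch an algorithm.
template<typename TAcc, typename TDev, typename TQueue>
class Executor {
public:
    using Acc = TAcc;

    TDev dev;
    TQueue queue;

    Executor(TDev d, TQueue q) : dev(std::move(d)), queue(std::move(q)) {}
};

// Low-level overload: the accelerator type is passed explicitly.
template<typename TAcc, typename TQueue, typename TBuffer, typename TFunc>
int reduce(TQueue& queue, TBuffer const& input, int init, TFunc func) {
    ++queue.launches; // stands in for enqueuing a kernel
    int result = init;
    for (auto const& x : input) {
        result = func(result, x);
    }
    return result;
}

// Convenience overload: accelerator type and queue come from the executor.
template<typename TAcc, typename TDev, typename TQueue, typename TBuffer, typename TFunc>
int reduce(Executor<TAcc, TDev, TQueue>& executor, TBuffer const& input, int init, TFunc func) {
    return sketch::reduce<TAcc>(executor.queue, input, init, func);
}

} // namespace sketch

// Runs both call styles and returns the sum if they agree, -1 otherwise.
int run_example() {
    sketch::Executor<sketch::MockAcc, sketch::MockDev, sketch::MockQueue> executor{
        sketch::MockDev{}, sketch::MockQueue{}};
    std::vector<int> input{1, 2, 3, 4};
    auto add = [](int a, int b) { return a + b; };

    int viaExecutor = sketch::reduce(executor, input, 0, add);
    int viaQueue = sketch::reduce<sketch::MockAcc>(executor.queue, input, 0, add);
    return viaExecutor == viaQueue ? viaExecutor : -1;
}
```

Calling `run_example()` from a `main` exercises both paths. Overload resolution picks the right function in each case because the accelerator type cannot be deduced for the low-level overload, and a queue argument does not match the `Executor<...>` parameter.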

SimeonEhrig commented 2 years ago

cc @j-stephan @psychocoderHPC

SimeonEhrig commented 2 years ago

TODO: review the SYCL API

j-stephan commented 2 years ago

SYCL is a bit more low-level than vikunja (it doesn't provide host-side algorithms like reduce). What @psychocoderHPC was talking about today is the distinction SYCL makes between different objects. Memory is separate from queues and queues are separate from devices... they are not combined into god-like objects.

I like your proposed API. For reference, this is what the C++ executor proposal looks like:

executor auto ex = /* ... */;

std::reduce(std::execution::par.on(ex), /* first */, /* last */, /* init */, /* op */);
std::transform(std::execution::par.on(ex), /* first */, /* last */, /* out */, /* op */);

I believe we can achieve something similar with vikunja executors.
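One way such a policy-style interface could look is sketched below. Everything here is an assumption for illustration only: `policy_sketch`, `ParallelPolicy`, `BoundPolicy`, `par`, `MockQueue` and `run_policy_example` are hypothetical names, not vikunja, alpaka or standard-library API, and the reduction runs sequentially on the host. The sketch only shows how `par.on(ex)` can bind an executor into a policy object that an algorithm overload then accepts.

```cpp
#include <vector>

namespace policy_sketch {

// Hypothetical queue stand-in that counts how often work was submitted.
struct MockQueue { int launches = 0; };

// Hypothetical executor: in this sketch it only owns a queue.
struct Executor { MockQueue queue; };

// A policy bound to one specific executor, as produced by par.on(ex).
template<typename TExecutor>
struct BoundPolicy { TExecutor& executor; };

// Unbound policy tag; on() attaches an executor to it.
struct ParallelPolicy {
    template<typename TExecutor>
    BoundPolicy<TExecutor> on(TExecutor& ex) const {
        return BoundPolicy<TExecutor>{ex};
    }
};

constexpr ParallelPolicy par{};

// Algorithm overload that takes the bound policy as its first argument.
template<typename TExecutor, typename TIt, typename T, typename TOp>
T reduce(BoundPolicy<TExecutor> policy, TIt first, TIt last, T init, TOp op) {
    ++policy.executor.queue.launches; // stands in for enqueuing a kernel
    for (; first != last; ++first) {
        init = op(init, *first);
    }
    return init;
}

} // namespace policy_sketch

// Sums a small vector through the policy-style interface.
int run_policy_example() {
    policy_sketch::Executor ex{};
    std::vector<int> v{1, 2, 3, 4};
    return policy_sketch::reduce(policy_sketch::par.on(ex), v.begin(), v.end(), 0,
                                 [](int a, int b) { return a + b; });
}
```

The design choice in this sketch is that the policy holds only a reference to the executor, so devices, queues and memory stay separate objects and are merely grouped for the call.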