I implemented the first step towards GPU interfaces.
This provides simple device handles:
NSL::Device
NSL::GPU (a specialization of NSL::Device)
NSL::CPU (a specialization of NSL::Device)
Each of these can already carry a device ID.
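For readers who have not looked at the diff, the handle hierarchy can be pictured roughly as follows. This is only a minimal sketch, not the actual NSL code: the namespace `sketch`, the members `kind()`, `id()` and `repr()`, the default ID of 0, and the string labels are all assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch {  // hypothetical namespace, deliberately not NSL

// Base handle: remembers which kind of device is meant and its ID.
class Device {
public:
    Device(std::string kind, int id) : kind_(std::move(kind)), id_(id) {
        // Assumption of this sketch: device IDs are non-negative.
        assert(id_ >= 0);
    }

    const std::string& kind() const { return kind_; }
    int id() const { return id_; }

    // Human-readable label such as "GPU:1".
    std::string repr() const { return kind_ + ":" + std::to_string(id_); }

private:
    std::string kind_;
    int id_;
};

// Specializations of Device that only fix the device kind.
class GPU : public Device {
public:
    explicit GPU(int id = 0) : Device("GPU", id) {}
};

class CPU : public Device {
public:
    explicit CPU(int id = 0) : Device("CPU", id) {}
};

}  // namespace sketch
```

With such a layout, `sketch::GPU(1)` and `sketch::CPU()` can both be passed wherever a `sketch::Device` is expected.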
The Tensor class gained a basic constructor that takes a device handle:
NSL::Tensor<double> A(2,2,NSL::GPU());
as well as a synchronization method, to(), which copies the data to the given device:
auto Acpu = A.to(NSL::CPU());
or, for a non-blocking copy,
auto Acpu = A.to(NSL::CPU(),/*non_blocking = */ true);
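To make the intended semantics of the constructor and of to() concrete, here is a toy model that runs on the host only. It is a sketch under stated assumptions, not the NSL implementation: the namespace `sketch`, the `DeviceKind` enum, the aggregate `Device`, and the element accessor are hypothetical, and the `non_blocking` flag is accepted but ignored because nothing asynchronous happens here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, deliberately not NSL

enum class DeviceKind { CPU, GPU };

// Minimal stand-in for a device handle: kind plus device ID.
struct Device {
    DeviceKind kind;
    int id;
};

template <typename T>
class Tensor {
public:
    // Allocates a zero-initialized rows x cols tensor tagged with a device.
    Tensor(std::size_t rows, std::size_t cols, Device dev)
        : rows_(rows), cols_(cols), dev_(dev), data_(rows * cols, T{}) {}

    // Row-major element access with a bounds check.
    T& operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }

    Device device() const { return dev_; }

    // Returns a copy tagged with the target device; the source is unchanged.
    // non_blocking is ignored: this sketch always copies synchronously.
    Tensor to(Device target, bool non_blocking = false) const {
        (void)non_blocking;
        Tensor copy(*this);
        copy.dev_ = target;
        return copy;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    Device dev_;
    std::vector<T> data_;
};

}  // namespace sketch
```

In this model the call returns a new tensor and leaves the original on its device, which matches how the result is bound to a new variable in the examples above. Whether the real method behaves the same way is for the diff to confirm.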
An example of the synchronization is in Executables/Examples/example_GPU_Tensor (maybe we should rename it?).
This interface is purely additive and does not break code from previous commits.