Open karlrupp opened 11 years ago
I have a question concerning the scheduler :) Do you intend to just overload operator= for the scheduler and work with the trees from viennacl::generator::matrix_expression<>, or to keep things duplicated for now and convert from viennacl::matrix_expression<> to viennacl::generator::matrix_expression<>?
I have to keep things duplicated, because CUDA and OpenMP still need to be supported.
The scheduler is now working and is used for getting fast GEMM kernels in. Further improvements will follow along the 1.5.x releases.
The integration for 1.6.0 was successful. There is still potential for also integrating sparse operations and the like, but this will follow later -> 1.7.0
Too disruptive for the 1.x.y release series. Have to postpone this to 2.0.0.
For operations such as x = y + z; x = y - z; there are currently two separate kernels launched, leading to unnecessary memory transfers. Expression templates are not enough to resolve this, so we need a micro-scheduler for fusing operations and passing them on to a kernel generator facility.
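To make the fusion idea concrete, here is a minimal CPU-only sketch, not ViennaCL's actual scheduler or generator API. The names `Statement`, `Op`, `run_unfused` and `run_fused` are hypothetical and only stand in for a queue of element-wise vector statements. Each "pass" over the data plays the role of one kernel launch: the unfused variant makes one pass per statement, while the fused variant evaluates all queued statements inside a single pass, so operands such as y and z are traversed only once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of one queued element-wise statement: dst = lhs (+|-) rhs.
enum class Op { Add, Sub };

struct Statement {
  std::vector<double>* dst;
  const std::vector<double>* lhs;
  const std::vector<double>* rhs;
  Op op;
};

// Evaluate one statement at a single index.
inline void apply_at(const Statement& s, std::size_t i) {
  const double a = (*s.lhs)[i];
  const double b = (*s.rhs)[i];
  (*s.dst)[i] = (s.op == Op::Add) ? a + b : a - b;
}

// Unfused execution: one full pass over the data per statement
// (the analogue of one kernel launch per statement).
// Returns the number of passes performed.
std::size_t run_unfused(const std::vector<Statement>& queue) {
  std::size_t passes = 0;
  for (const Statement& s : queue) {
    for (std::size_t i = 0; i < s.dst->size(); ++i) apply_at(s, i);
    ++passes;
  }
  return passes;
}

// Fused execution: a single pass in which every queued statement is
// evaluated per element. This is only valid because all statements are
// element-wise and all vectors have the same length.
// Returns the number of passes performed.
std::size_t run_fused(const std::vector<Statement>& queue) {
  if (queue.empty()) return 0;
  const std::size_t n = queue.front().dst->size();
  for (const Statement& s : queue) {
    assert(s.dst->size() == n && s.lhs->size() == n && s.rhs->size() == n);
  }
  for (std::size_t i = 0; i < n; ++i) {
    for (const Statement& s : queue) apply_at(s, i);
  }
  return 1;
}
```

Queueing two statements such as u = y + z and v = y - z and calling `run_fused` yields the same results as `run_unfused` with one pass instead of two. A real micro-scheduler would additionally have to check data dependencies and hand the fused statement list to a kernel generator rather than looping on the host.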