ddemidov / vexcl

VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP
http://vexcl.readthedocs.org
MIT License

VEX_FUNCTION w/ void return type #220

Closed by agerlach 7 years ago

agerlach commented 7 years ago

I am writing some user-defined functions that simply need to update the values of an existing vector. Is it possible to update the vector in place, without returning a value? See the toy example below.

Current solution:

VEX_FUNCTION(float, updateInPlace, (float, a)(float, b),
    return a + b;);

vex::vector<float>  y(ctx, n);
y = 0.;
vex::vector<float>  y2(ctx, n);
y2 = 10.f;
y = updateInPlace(y,y2);

Desired solution:

VEX_FUNCTION(void,updateInPlace, (float*, a)(float, b),
           *a = *a + b;);
vex::vector<float>  y(ctx, n);
y = 0.;
vex::vector<float>  y2(ctx, n);
y2 = 10.f;
updateInPlace(y,y2);

If you call a VexCL function without assigning its result to anything, the function is never added to the kernel for compilation.

Would the in-place version give any noticeable performance gain? If not, there is no problem with using the current solution. However, I would expect the in-place version to significantly reduce the number of memory copies.

ddemidov commented 7 years ago

A void function is basically a kernel, so you can create a custom kernel for this. Or, if working with functions is easier or more suitable in your case, you can use the vex::eval() function like this:

vex::eval(updateInPlace(y,y2));

See tests/eval.cpp for a couple of examples. As you can see from the examples there, it was created for dealing with atomic operations, but it should work with any void function.

And of course, if you want to update a vector in place, you have to get the vector by pointer and provide an index:

VEX_FUNCTION(void, append, (double*, x)(double, y)(int, i),
  x[i] += y;
  );
vex::eval(append(raw_pointer(x), y, vex::element_index()));

This will only work on a single device, since raw_pointer() is restricted to single-device contexts, so doing something like x += delta(y); would be more generic.
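To make the pointer-plus-index form concrete, here is a plain C++ host-side sketch of what evaluating a void function once per element amounts to. It does not use VexCL or OpenCL at all: `eval_each` and `update_in_place` are hypothetical helpers written for illustration, and the serial loop only stands in for the generated device kernel. It is not how VexCL implements vex::eval().

```cpp
#include <cstddef>
#include <vector>

// Same body as the `append` function above, written as an ordinary C++ function.
inline void append(double *x, double y, int i) {
    x[i] += y;
}

// Hypothetical stand-in for vex::eval(): calls a void function once per index.
// On a device this loop would be a kernel with one work item per index.
template <class F>
void eval_each(std::size_t n, F f) {
    for (std::size_t i = 0; i < n; ++i) f(static_cast<int>(i));
}

// Updates every element of x in place, reading the increment from y.
// Assumes x and y have the same size.
inline void update_in_place(std::vector<double> &x, const std::vector<double> &y) {
    double *px = x.data();  // plays the role of raw_pointer(x)
    eval_each(x.size(), [&](int i) { append(px, y[i], i); });
}
```

The index argument is what lets each invocation know which element it owns. Without it, the function would only see the base pointer of the buffer.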

ddemidov commented 7 years ago

Re the effectiveness of the 'current' solution: you can use the vex::tag() function to tag the vector, which reduces the number of kernel parameters:

tag<1>(x) = delta(tag<1>(x), y);

or

auto X = tag<1>(x);
X = delta(X, y);

It should then be no different from the variant with a void function.
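A hand-written plain C++ analogue may help show why tagging matters. It assumes that, without tags, each occurrence of a vector in an expression becomes its own kernel parameter, and that `delta(x, y)` simply returns `x + y` (the thread never defines `delta`). The function names `kernel_untagged` and `kernel_tagged` are hypothetical and this is not actual VexCL-generated code.

```cpp
#include <cstddef>
#include <vector>

// Untagged x = delta(x, y): the x on the left and the x on the right are
// treated as unrelated buffers, so the kernel receives three pointers.
inline void kernel_untagged(std::size_t n, float *out,
                            const float *a, const float *b) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

// Tagged tag<1>(x) = delta(tag<1>(x), y): both uses of x are known to be
// the same data, so they share one parameter and two pointers suffice.
inline void kernel_tagged(std::size_t n, float *x, const float *y) {
    for (std::size_t i = 0; i < n; ++i) x[i] = x[i] + y[i];
}
```

Both versions compute the same result when `out` and `a` point at the same buffer. The tagged form just passes that buffer once, which is the same parameter list the void-function variant would have.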

agerlach commented 7 years ago

Sorry for the delay in responding, but this solved my issue perfectly. Thanks again for VexCL and all your support.