clMathLibraries / clBLAS

a software library containing BLAS functions written in OpenCL
Apache License 2.0

clblasDaxpy with "incr" input argument set to zero #35

Open SciCed opened 10 years ago

SciCed commented 10 years ago

I'm trying to add a scalar to a matrix using clblasDaxpy. To do that, I set the increment of the scalar to zero, but clblasDaxpy does not accept this value.

The daxpy of reference BLAS and of cuBLAS allows this use, so I think clblasDaxpy should allow it too.

Would it be possible to add this feature to clBLAS?

kknox commented 10 years ago

I am seeing the following in the doxygen documentation for clblasDaxpy:

* @param[in] incx Increment for the elements of \b X. Must not be zero.

I can't recall the technical reason for this right now, but the documentation is explicit. So, I don't think we will be able to fix this anytime soon. We should at least document the reason why the inc parameters can't be 0, so I will keep digging.

Can you post a small test case? Otherwise, I can craft one later.

SciCed commented 10 years ago

Hi kknox,

Here is a test case. http://pastebin.com/3QJCCXjw

Thanks for replying, Ced

kknox commented 10 years ago

My company is blocking pastebin; could you repost the snippet as a public gist and link?

SciCed commented 10 years ago

I had not seen this GitHub feature. https://gist.github.com/CedScilab/74947ed341f96d066868

dfdx commented 8 years ago

> I can't recall the technical reason for this right now

I came across this issue when googling why incx cannot be zero. And for those like me here's a tip from this page:

> The last argument incx is the increment. Usually, incx=1 and the vector x corresponds directly to the one-dimensional Fortran array x. For incx>1 it specifies how many elements in the array we should "jump" between each element of the vector x. E.g. if incx=2 it means we should only scale every other element (note: the physical dimension of the array x should then be at least 2n-1).

So incx is not an increment added to the elements of a buffer, but an increment applied to the indexes of the buffer at which the operation is performed. You can apply the operation to every 1st element (all elements), every 2nd, every 3rd, etc., but obviously you cannot apply it to every 0th element.
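To make the indexing concrete, here is a minimal plain-C sketch of the scaling behavior the quoted passage describes. The function name `scal_strided` is hypothetical and is not part of clBLAS or reference BLAS; it only assumes a positive increment, as in the quote.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of clBLAS: scales n logical elements of x
 * by alpha, visiting indices 0, incx, 2*incx, ... as the quoted text
 * describes for a positive increment. */
void scal_strided(size_t n, float alpha, float *x, size_t incx)
{
    assert(incx > 0); /* with a zero stride every iteration would hit x[0] */
    for (size_t i = 0; i < n; ++i) {
        x[i * incx] *= alpha;
    }
}
```

Calling `scal_strided(3, 10.0f, x, 2)` on a five-element array touches only `x[0]`, `x[2]` and `x[4]`, which matches the "at least 2n-1" size note in the quote.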

In conclusion, if you just want the usual behavior, as in standard BLAS, always set incx to 1.

hughperkins commented 8 years ago

@dfdx Essentially, inc seems semantically identical to stride. A stride of 0 along one or more dimensions can be valid in some libraries, if you want to repeat the same values along those dimensions, which seems to be what CedScilab wants to do. The use cases for this are similar to, or at least a subset of, MATLAB's repmat function.

SciCed commented 8 years ago

@dfdx The sentence you have posted is about the SSCAL routine, not DAXPY. In fact, in the SSCAL case, it is useless to set incx to zero, since we can simply compute a * x where both "a" and "x" are scalars.

However, in my case I need it, because I want to implement addition (for Scilab) using the DAXPY of clBLAS. One of the uses is to compute a + B where "a" is a scalar and "B" is a vector, i.e. [1] + [1, 2, 3, ...]. The idea is to set the inc of "a" to zero to avoid allocating a buffer that would just be filled with the same value.
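The requested behavior can be sketched in plain C as follows. This is only an illustration of the semantics under discussion, assuming non-negative increments; `daxpy_ref` is a hypothetical name and is not the clBLAS API or its implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C sketch of y := alpha * x + y with BLAS-style increments.
 * Hypothetical helper for illustration; this is not the clBLAS API.
 * Negative increments are left out to keep the sketch short. */
void daxpy_ref(size_t n, double alpha, const double *x, size_t incx,
               double *y, size_t incy)
{
    assert(incy > 0); /* a zero output stride would make every write hit y[0] */
    for (size_t i = 0; i < n; ++i) {
        /* With incx == 0 every iteration reads x[0], so a one-element
         * buffer acts as a scalar broadcast across y. */
        y[i * incy] += alpha * x[i * incx];
    }
}
```

With `a = {1.0}` and `B = {1.0, 2.0, 3.0}`, the call `daxpy_ref(3, 1.0, a, 0, B, 1)` leaves `B` holding `{2.0, 3.0, 4.0}`, which is the a + B case above without allocating a vector filled with copies of "a".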

dfdx commented 8 years ago

@CedScilab Indeed, I didn't know about such a possibility and misinterpreted your question.

Though, I'm curious whether a simple custom kernel would work for you?

SciCed commented 8 years ago

Yes, I can write something in OpenCL to do that. But if this feature were available directly in clBLAS, the performance would certainly be better, and the integration into my code is already done.