ddemidov / vexcl

VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP
http://vexcl.readthedocs.org
MIT License
700 stars 82 forks source link

Dense matrix support #31

Open ddemidov opened 11 years ago

ddemidov commented 11 years ago

Providing dense matrix support for VexCL would probably be of use for many people, but currently it is unclear what is the best approach to do this.

Options are:

  1. Provide vex::matrix<> and make vex::vector<> a special case of a dense matrix with first dimension equal to one. This would not work very well, because some operations (like operator*()) have different default meaning from what currently is used.
  2. Make vex::matrix<> completely unrelated type with its own grammar (set of operators). This should be similar to distinction that Eigen does between matrices and arrays. This looks more promising and should not break current functionality.
d-meiser commented 11 years ago

Have you had a look at the eigen3 template library:

http://eigen.tuxfamily.org/index.php?title=Main_Page

I believe they decided to go with option 1.

ddemidov commented 11 years ago

Hello Dominic,

I think that eigen choose option 3 actually: their matrices are compile-time convertible to arrays. Then, operator*() has linear algebra meaning for matrices and elementwise meaning for arrays.

I actually like this option more than the two above.

d-meiser commented 11 years ago

Ah, yes. You're right. That way you get the best of both worlds.

neapel commented 11 years ago

Eigen's arrays have only 2 dimensions though, I'd copy boost::multi_array, which allows an arbitrary number of dimensions, inherit vector with dims=1 and matrix with dims=2 and overloaded operator*. That way, if you want 2D arrays, but element-wise *, just use the multi_array type with dims=2 directly, or down-cast a matrix.

ddemidov commented 11 years ago

Hi Pascal,

What operations would you support for multidimensional arrays? Are there any having sense for multidimensional matrices, except for elementwise arithmetics, element access and probably slicing (I am not sure about the possibility of general slicing with multi-gpu support though)?

Right now I am thinking of inheriting vex::matrix from vex::vector, and then overloading some of default operations like multiplication. This way elementwise operations would be directly and transparently supported.

ddemidov commented 11 years ago

Ah, the number of dimensions is compile-time property of boost::multi_array. That makes things easier.

d-meiser commented 11 years ago

Tensor products could generate multi-dimensional arrays. The tensor product of a d1 dimensional array with a d2 dimensional array is a d1+d2 dimensional array. Contractions reduce the dimensionality of an array. Reshaping could also be supported.

ddemidov commented 11 years ago

I think most of these would have to be implemented as standalone functions.

ddemidov commented 11 years ago

I think vex::multi_array should be single-device-only. This way we may easily introduce slicing operations, transpositions, etc.

What do you think of the following interface:

std::vector<size_t> dims = {100, 200, 300};

// Construction:
vex::multi_array<double> A(ctx, dims);
vex::multi_array<double> B(ctx, {100, 200});

// Slicing should return multi_array (proxy) of reduced dimensions. No data
// copies should be made; slicing should only affect how the data is indexed
// inside compute kernels:
vex::multi_array::slice s(_, _, 42);
B = A[s];

B = A[slice(_, _, 42)] + A[slice(_, _, 100)];

// or, may be
B = A.slice(_, _, 42);

// An alternative to matlab's (start:stride:end) syntax:
slice s(range(0, 2, 100), _, 1);

// Linear algebra operations, like matrix-matrix or matrix-vector products
// should be supported. Element-wise operations could be achieved by converting
// multi_array to vex::vector:
B.vec() += sin( A[slice(_,_,1)].vec() );

Edit: This is just speculation for now, because I don't currently have time to implement this (may be later this summer).

neapel commented 11 years ago

Maybe using

vex::multi_array<T, dim-1> vex::multi_array<T,dim>::operator[](size_t)
vex::multi_array<T, dim> vex::multi_array<T,dim>::operator[](_)

like boost::multi_array does would make the code more compact:

vex::multi_array<double,3> A(ctx, {200, 200, 200});
B = A/*i=0, d=3*/[_]/*i=1, d=3*/[_]/*i=2, d=3*/[42]/*d=2*/;
B = A[_][_][42] + A[100]/*d=2*/;
B = A[_][100][_];
B[10] = A[1][2];
double b = B[10][20];
ddemidov commented 11 years ago

Yes, I like this better.

d-meiser commented 11 years ago

I think the repa library in haskell also has lots of good ideas on how to do multi-dimensional arrays in a type safe manner:

http://hackage.haskell.org/package/repa

http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_Tutorial

The overall idea is rather similar to boost::multi_array. Arrays are represented by a flat chunk of memory that holds the data and a type that represents the shape of the array. There are functions that just manipulate the shape, i.e. just type manipulations. What they're really strong at is loop fusion optimizations.

I'd love to contribute to this but I'm pretty swamped at work at the moment. At a minimum I could write tests though.

ddemidov commented 10 years ago

Here is an example of a dense matrix-matrix product implementation:

vex::vector<double> A(ctx, n1 * n2);
vex::vector<double> B(ctx, n2 * n3);
vex::vector<double> C(ctx, n1 * n3);

// C = A * B
C = vex::reduce<vex::SUM>(
        vex::extents[n1][n2][n3],
        vex::reshape(A, vex::extents[n1][n2][n3], vex::extents[0][1])
        *
        vex::reshape(B, vex::extents[n1][n2][n3], vex::extents[1][2]),
        1
        );

This of course is equivalent to a naive algorithm; no tiling or shared memory tricks are involved.

ddemidov commented 10 years ago

Nice paper: A C++11 implementation of arbitrary-rank tensors for high-performance computing.

skn123 commented 10 years ago

BTW Denis, clMagma is an OpenCL implementation of LAPACK/BLAS, If we do take EIgen, then we can port clMagma into that (along with ViennaCL) and hence into VexCL. I don't think it would be prudent to come up with another dense matrix interface and reinventing the wheel?

ddemidov commented 10 years ago

I still don't understand what do you mean by If we do take Eigen. VexCL is not built on top of Eigen, and VexCL can not interface with Eigen (except for copying data back and forth through raw pointers).

ddemidov commented 9 years ago

Another step to better dense matrix support is 3e452b7. It implements a tensordot() operation, borrowed from python's numpy.tensordot. The dense matrix-matrix product implemented above with reshape/reduce combination becomes

vex::slicer<2> Adim(vex::extents[n1][n2]);
vex::slicer<2> Bdim(vex::extents[n2][n3]);
C = vex::tensordot(Adim[_](A), Bdim[_](B), vex::axes_pairs(1, 0));

which is about 30 times faster on my K40c (performance test is given here).

ilyapopov commented 8 years ago

By the way, for your reference, there has been lately a lot of development in Eigen for multi-dimensional tensors: https://bitbucket.org/eigen/eigen/src/373864b2f85df38592d4d0a5be1d4ef3064d5554/unsupported/Eigen/CXX11/src/Tensor/?at=default