ddemidov opened this issue 11 years ago
Have you had a look at the eigen3 template library:
http://eigen.tuxfamily.org/index.php?title=Main_Page
I believe they decided to go with option 1.
Hello Dominic,
I think that Eigen chose option 3, actually: their matrices are compile-time convertible to arrays. Then operator*() has the linear algebra meaning for matrices and the elementwise meaning for arrays.
I actually like this option more than the two above.
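For reference, a minimal sketch of that distinction in Eigen (assuming the Eigen 3 API, where .array() and .matrix() switch between the two views):

#include <Eigen/Dense>

int main() {
    Eigen::Matrix2d m;
    m << 1, 2,
         3, 4;

    Eigen::Matrix2d p = m * m;                             // matrix product
    Eigen::Matrix2d q = (m.array() * m.array()).matrix();  // elementwise product
}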
Ah, yes. You're right. That way you get the best of both worlds.
Eigen's arrays have only 2 dimensions though. I'd copy boost::multi_array, which allows an arbitrary number of dimensions, and inherit vector (with dims=1) and matrix (with dims=2 and an overloaded operator*). That way, if you want 2D arrays but element-wise *, you can just use the multi_array type with dims=2 directly, or down-cast a matrix.
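For reference, a minimal boost::multi_array sketch (host-side only, just showing the dims-as-template-parameter idea):

#include <boost/multi_array.hpp>
#include <cassert>

int main() {
    // The number of dimensions is a compile-time template parameter:
    boost::multi_array<double, 3> A(boost::extents[2][3][4]);

    A[1][2][3] = 42.0;          // element access
    assert(A.shape()[1] == 3);  // extents are queried at run time
}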
Hi Pascal,
What operations would you support for multidimensional arrays? Are there any that make sense for multidimensional matrices, apart from elementwise arithmetic, element access, and possibly slicing? (I am not sure about the possibility of general slicing with multi-GPU support, though.)
Right now I am thinking of inheriting vex::matrix from vex::vector, and then overloading some of the default operations, like multiplication. This way elementwise operations would be directly and transparently supported.
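A rough sketch of that idea (hypothetical code, not actual VexCL; the constructor signature is an assumption):

// Hypothetical sketch; vex::matrix does not exist yet.
template <typename T>
class matrix : public vex::vector<T> {
    public:
        matrix(const vex::Context &ctx, size_t n, size_t m)
            : vex::vector<T>(ctx, n * m), n(n), m(m) {}

        // operator*() would be overloaded here to mean the matrix product
        // instead of the elementwise product used by vex::vector.
    private:
        size_t n, m;
};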
Ah, the number of dimensions is a compile-time property of boost::multi_array. That makes things easier.
Tensor products could generate multi-dimensional arrays. The tensor product of a d1 dimensional array with a d2 dimensional array is a d1+d2 dimensional array. Contractions reduce the dimensionality of an array. Reshaping could also be supported.
I think most of these would have to be implemented as standalone functions.
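As an illustration of the dimensionality rule, here is a host-side sketch of a standalone tensor product (the helper name and the flat row-major layout are assumptions):

#include <vector>
#include <cstddef>

// Tensor product of a 2D array A (n x m) with a 1D array b (k):
// the result T is (2+1)-dimensional, with T(i,j,l) = A(i,j) * b(l).
std::vector<double> tensor_prod(
        const std::vector<double> &A, size_t n, size_t m,
        const std::vector<double> &b, size_t k)
{
    std::vector<double> T(n * m * k);

    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < m; ++j)
            for (size_t l = 0; l < k; ++l)
                T[(i * m + j) * k + l] = A[i * m + j] * b[l];

    return T;
}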
I think vex::multi_array should be single-device-only. This way we may easily introduce slicing operations, transpositions, etc.
What do you think of the following interface:
std::vector<size_t> dims = {100, 200, 300};

// Construction:
vex::multi_array<double> A(ctx, dims);
vex::multi_array<double> B(ctx, {100, 200});

// Slicing should return a multi_array (proxy) of reduced dimensions. No data
// copies should be made; slicing should only affect how the data is indexed
// inside compute kernels:
vex::multi_array<double>::slice s(_, _, 42);
B = A[s];
B = A[slice(_, _, 42)] + A[slice(_, _, 100)];

// or, maybe:
B = A.slice(_, _, 42);

// An alternative to Matlab's (start:stride:end) syntax:
slice s2(range(0, 2, 100), _, 1);

// Linear algebra operations, like matrix-matrix or matrix-vector products,
// should be supported. Element-wise operations could be achieved by converting
// multi_array to vex::vector:
B.vec() += sin( A[slice(_,_,1)].vec() );
Edit: This is just speculation for now, because I don't currently have time to implement this (maybe later this summer).
Maybe using

vex::multi_array<T, dim-1> vex::multi_array<T, dim>::operator[](size_t);
vex::multi_array<T, dim>   vex::multi_array<T, dim>::operator[](_);

like boost::multi_array does would make the code more compact:
vex::multi_array<double, 3> A(ctx, {200, 200, 200});

// Each [_] keeps a dimension, each integer index drops one:
B = A[_][_][42];          // 3D -> 2D: fix the third index
B = A[_][_][42] + A[100]; // A[100] is a 2D slice as well
B = A[_][100][_];
B[10] = A[1][2];
double b = B[10][20];
Yes, I like this better.
I think the repa library in Haskell also has lots of good ideas on how to do multi-dimensional arrays in a type-safe manner:
http://hackage.haskell.org/package/repa
http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_Tutorial
The overall idea is rather similar to boost::multi_array. Arrays are represented by a flat chunk of memory that holds the data and a type that represents the shape of the array. There are functions that just manipulate the shape, i.e. just type manipulations. What they're really strong at is loop fusion optimizations.
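Roughly, the idea in C++ terms (a minimal sketch of the flat-buffer-plus-shape-type approach, using C++17 for brevity; this is not repa's actual API):

#include <cstddef>
#include <utility>
#include <vector>

// The shape lives in the type; the data is a flat buffer.
template <size_t... Extents>
struct shape {
    static constexpr size_t size = (Extents * ... * 1);
};

template <typename T, typename Shape>
struct array {
    std::vector<T> data = std::vector<T>(Shape::size);
};

// "Reshaping" is a pure type manipulation; the buffer is reused as-is.
template <typename NewShape, typename T, typename OldShape>
array<T, NewShape> reshape(array<T, OldShape> a) {
    static_assert(NewShape::size == OldShape::size, "sizes must match");
    return { std::move(a.data) };
}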
I'd love to contribute to this but I'm pretty swamped at work at the moment. At a minimum I could write tests though.
Here is an example of a dense matrix-matrix product implementation:
vex::vector<double> A(ctx, n1 * n2);
vex::vector<double> B(ctx, n2 * n3);
vex::vector<double> C(ctx, n1 * n3);

// C = A * B
C = vex::reduce<vex::SUM>(
        vex::extents[n1][n2][n3],
        vex::reshape(A, vex::extents[n1][n2][n3], vex::extents[0][1])
        *
        vex::reshape(B, vex::extents[n1][n2][n3], vex::extents[1][2]),
        1
        );
This of course is equivalent to a naive algorithm; no tiling or shared memory tricks are involved.
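For clarity, here is the host-side reference it corresponds to (flat row-major storage assumed):

#include <vector>
#include <cstddef>

// C(i,k) = sum_j A(i,j) * B(j,k); the reduce above sums over dimension 1 (j).
void gemm_naive(const std::vector<double> &A, const std::vector<double> &B,
                std::vector<double> &C, size_t n1, size_t n2, size_t n3)
{
    for (size_t i = 0; i < n1; ++i)
        for (size_t k = 0; k < n3; ++k) {
            double sum = 0;
            for (size_t j = 0; j < n2; ++j)
                sum += A[i * n2 + j] * B[j * n3 + k];
            C[i * n3 + k] = sum;
        }
}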
BTW Denis, clMagma is an OpenCL implementation of LAPACK/BLAS. If we do take Eigen, then we could port clMagma onto it (along with ViennaCL), and hence into VexCL. I don't think it would be prudent to come up with yet another dense matrix interface and reinvent the wheel.
I still don't understand what you mean by "if we do take Eigen". VexCL is not built on top of Eigen, and VexCL cannot interface with Eigen (except by copying data back and forth through raw pointers).
Another step towards better dense matrix support is 3e452b7. It implements a tensordot() operation, borrowed from Python's numpy.tensordot. The dense matrix-matrix product implemented above with the reshape/reduce combination becomes
vex::slicer<2> Adim(vex::extents[n1][n2]);
vex::slicer<2> Bdim(vex::extents[n2][n3]);
C = vex::tensordot(Adim[_](A), Bdim[_](B), vex::axes_pairs(1, 0));
which is about 30 times faster on my K40c (performance test is given here).
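Presumably the same pattern covers other contractions as well; for example, a matrix-vector product should look like this (untested sketch, assuming tensordot works for slicers of different rank):

vex::vector<double> x(ctx, n2), y(ctx, n1);

vex::slicer<2> Adim(vex::extents[n1][n2]);
vex::slicer<1> xdim(vex::extents[n2]);

// y(i) = sum_j A(i,j) * x(j): contract axis 1 of A with axis 0 of x.
y = vex::tensordot(Adim[_](A), xdim[_](x), vex::axes_pairs(1, 0));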
By the way, for your reference, there has been lately a lot of development in Eigen for multi-dimensional tensors: https://bitbucket.org/eigen/eigen/src/373864b2f85df38592d4d0a5be1d4ef3064d5554/unsupported/Eigen/CXX11/src/Tensor/?at=default
Providing dense matrix support for VexCL would probably be of use for many people, but currently it is unclear what the best approach is. Options are:

1. Introduce vex::matrix<> and make vex::vector<> a special case of a dense matrix with first dimension equal to one. This would not work very well, because some operations (like operator*()) have a different default meaning from what is currently used (see the sketch below).
2. Make vex::matrix<> a completely unrelated type with its own grammar (set of operators). This should be similar to the distinction that Eigen makes between matrices and arrays. This looks more promising and should not break current functionality.
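For context, the current semantics that option 1 would clash with (this is how vex::vector behaves today):

vex::Context ctx(vex::Filter::Env);
vex::vector<double> x(ctx, n), y(ctx, n), z(ctx, n);

z = x * y; // operator*() on vex::vector is elementwise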