Open akunft opened 8 years ago
Numeric type bound
elements() that allows for element-wise aggregations
I am still not sure about the Numeric type bound for matrices, but I think for a start this should be fine.
Do we also allow aggregations over Vectors? I think this is important, e.g. for Vector-norms.
:+1: for aggregation over all elements. Most APIs offer aggregations specified by the dimensions (1 = rows, 2 = columns, default = all elements)
I would also not fix the traversal order and allow only commutative aggregation operations.
For the rest I will think about it some more!
Yes, we do allow aggregations over vectors, as shown in the example above.
I also agree on the aggregation over elements, but I would keep the methods separated into cols() and rows() to do per-vector aggregations, and an additional elements() method to do aggregation over the elements of a matrix.
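To illustrate the point about traversal order raised earlier in the thread, here is a minimal sketch of an elements()-style aggregation. The type `DenseMatrix`, its row-major layout, and the helper `elementsByColumn()` are assumptions made up for this example; they are not the API from the PR.

```scala
// Hypothetical dense matrix of Doubles stored row by row; not the Emma API.
final class DenseMatrix(val numRows: Int, val numCols: Int, val data: Array[Double]) {
  require(data.length == numRows * numCols, "data does not match dimensions")

  // Every element, in storage (row-major) order.
  def elements(): Iterator[Double] = data.iterator

  // The same elements, visited column by column (illustrative helper).
  def elementsByColumn(): Iterator[Double] =
    for (j <- Iterator.range(0, numCols); i <- Iterator.range(0, numRows))
      yield data(i * numCols + j)
}

val dense = new DenseMatrix(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))

// Addition is commutative and associative: both traversals agree.
val sumRowMajor = dense.elements().sum
val sumColMajor = dense.elementsByColumn().sum

// Appending decimal digits is order-sensitive: the traversals disagree.
val digitsRowMajor = dense.elements().reduce((a, b) => a * 10 + b)
val digitsColMajor = dense.elementsByColumn().reduce((a, b) => a * 10 + b)
```

If the traversal order is left unspecified, only operations like the first one give a well-defined result, which is the argument for restricting elements() to such operations.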
Changes for the API are now tracked in PR #191
@stratosphere/emma-committers If nobody has objections, I would allow +,-,*,/ (single char only) as method names in the scala style formatter, for the methods in matrix/vector.
:+1:
This issue should be used to discuss the user facing abstraction for the matrix and vector type.
The initial prototype and ongoing effort is tracked in PR #191.
The focus is on the traits Matrix and Vector.
In the following, I want to highlight some of the more special parts of the abstraction that are open for discussion:
Type bounds for the generic value
Currently, we use spire.Numeric as the type bound for the values in Matrix/Vector. This allows us to use all basic operations like +, -, * and / (division is not supported by the Scala Numeric), and there are implicit conversions for all the numeric primitives in Scala.
Instead, we could define our own type as the bound for the values, to support e.g. Strings, Booleans, ..., similar to spire.Field. As this would generalize the abstraction, we would also have to implement the implicit conversions spire gives us for free.
I would suggest keeping the Numeric bound for now and seeing whether there is a need for a wider bound.
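As a rough sketch of what a type bound buys us, the snippet below defines element-wise +, -, * and / once for any element type that has the required evidence. To stay dependency-free it uses `scala.math.Fractional` from the standard library as a stand-in for the spire bound; this is an assumption of the example, and `NumVec` is a hypothetical type, not the trait from the PR. One visible difference of the stand-in: there is no `Fractional[Int]`, so `NumVec[Int]` does not compile here.

```scala
import scala.math.Fractional.Implicits._

// Hypothetical minimal vector, NOT the Emma API. Fractional is a stdlib
// stand-in for the spire bound discussed above.
final case class NumVec[A: Fractional](values: Seq[A]) {
  def +(that: NumVec[A]): NumVec[A] = zipWith(that)(_ + _)
  def -(that: NumVec[A]): NumVec[A] = zipWith(that)(_ - _)
  def *(that: NumVec[A]): NumVec[A] = zipWith(that)(_ * _)
  // Division comes from Fractional; the plain Scala Numeric has no division.
  def /(that: NumVec[A]): NumVec[A] = zipWith(that)(_ / _)

  // Applies f to corresponding elements of two equally sized vectors.
  private def zipWith(that: NumVec[A])(f: (A, A) => A): NumVec[A] = {
    require(values.length == that.values.length, "dimension mismatch")
    NumVec(values.zip(that.values).map { case (a, b) => f(a, b) })
  }
}

val x = NumVec(Seq(1.0, 2.0))
val y = NumVec(Seq(2.0, 4.0))
val quotient = x / y   // element-wise division
val total = x + y      // element-wise addition
```

The same definitions work for `Float` or `BigDecimal` elements without further code, which is the kind of reuse the Numeric bound is meant to provide.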
Aggregations
Currently we allow aggregations over vectors only. This enables users to define their own aggregation functions. In combination with the columns() and rows() methods, the user can define aggregations over the columns and rows of a matrix.
A point to mention is that the return type depends on the result of the traversal.
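A minimal sketch of how user-defined vector aggregations could combine with per-row and per-column access. All names here (`Vec`, `Mat`, `aggregate`, `l1Norm`, `countNonZero`) are hypothetical and only illustrate the idea; the method names follow the cols()/rows() spelling used elsewhere in this thread, and the element type is fixed to Double to keep the example short.

```scala
// Hypothetical types for illustration only; names and signatures are assumptions.
final case class Vec(values: Seq[Double]) {
  // User-defined aggregation: fold the elements with an arbitrary function.
  // The result type B is chosen by the caller.
  def aggregate[B](zero: B)(f: (B, Double) => B): B = values.foldLeft(zero)(f)
}

final case class Mat(rowData: Seq[Seq[Double]]) {
  def rows(): Seq[Vec] = rowData.map(r => Vec(r))
  def cols(): Seq[Vec] = rowData.transpose.map(c => Vec(c))
}

// Two user-defined aggregations with different result types.
def l1Norm(v: Vec): Double = v.aggregate(0.0)((acc, x) => acc + math.abs(x))
def countNonZero(v: Vec): Int = v.aggregate(0)((n, x) => if (x != 0.0) n + 1 else n)

val mat = Mat(Seq(Seq(1.0, -2.0, 0.0), Seq(0.0, 4.0, -6.0)))
val rowNorms = mat.rows().map(l1Norm)          // one Double per row
val colNonZeros = mat.cols().map(countNonZero) // one Int per column
```

Here the per-row result is a `Seq[Double]` and the per-column result is a `Seq[Int]`, so the type of the overall result follows from the aggregation that was applied during the traversal.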
Open questions:
cols() and rows() methods. Array, Traversable, DataBag[(Int, A)]?
Default implementation
The current implementations are based on one-dimensional arrays. Therefore, it is easy to hand the execution of operators to netlib-java (not yet done).
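For readers unfamiliar with the layout, here is a minimal sketch of a matrix backed by a one-dimensional array. `FlatMatrix` and its row-major indexing are assumptions for illustration, not the implementation from the PR. Note that BLAS routines conventionally assume column-major storage, so the layout matters when delegating to a native library.

```scala
// Hypothetical matrix over a flat array, row-major; not the Emma implementation.
final class FlatMatrix(val numRows: Int, val numCols: Int, val data: Array[Double]) {
  require(data.length == numRows * numCols, "data does not match dimensions")

  // Element (i, j) lives at offset i * numCols + j.
  def apply(i: Int, j: Int): Double = data(i * numCols + j)

  // Element-wise addition is a single loop over the two flat arrays.
  def +(that: FlatMatrix): FlatMatrix = {
    require(numRows == that.numRows && numCols == that.numCols, "dimension mismatch")
    val out = new Array[Double](data.length)
    var k = 0
    while (k < data.length) {
      out(k) = data(k) + that.data(k)
      k += 1
    }
    new FlatMatrix(numRows, numCols, out)
  }
}

val left = new FlatMatrix(2, 2, Array(1.0, 2.0, 3.0, 4.0))
val right = new FlatMatrix(2, 2, Array(10.0, 20.0, 30.0, 40.0))
val added = left + right
```

Because each operator only sees contiguous arrays plus the dimensions, its loop body could later be swapped for a library call without changing the user-facing methods.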