Open MarcelKoch opened 3 years ago
Stealing a bit of MATLAB's notation, we could also add a (defaulted) parameter to compute_norm2 that, for an n x m matrix, specifies whether we want column norms (output 1 x m), row norms (output n x 1), or the Frobenius norm (output 1 x 1). With the Diagonal matrix, we already have a representation of row and column scaling we could use there. Do you have any use cases in mind for dot products of flattened matrices?
compute_norm2 can rely on the output size, so we do not need to introduce another parameter
I'm open to having such a parameter, although I'd personally prefer different function names. I would also change the signature (in the dense case) to
ValueType compute_norm2()
as I think that's more natural.
The dot product for matrices appears in some finite element computations; I last remember it from an elasticity problem. It should be noted that in these cases the matrices are quite small, n ~ 10. Also, this operation has a proper mathematical name (which I can't remember right now) and is denoted A:B.
I think add_scaled and scale should be considered as A * diag(alpha); diag(alpha) * A is already represented by Diagonal, as @upsj said. dot should have the same behavior as the MV behavior if we consider dot(A, B) = diag(A^T B), no matter whether it is a MV or Dense. The operation you mention is more like a dot_multiple (elementwise) operation.
> compute_norm2 can rely on the output size, so we do not need to introduce another parameter
Hm... while that is true, I am not sure whether that makes for readable/understandable code. For example
```cpp
A->compute_norm2(result);
x->add_scaled(result, y);
```
What does this do? Does it scale all values of y by the same value, or does it scale individual columns by different values? You cannot tell from this piece of code alone, which might be an issue. Of course, the same is true for add_scaled, but at least that does not propagate.
I guess we could add a zero-parameter overload for compute_norm2, which would then be unambiguous, but since all other functions write into Dense, that would be a bit of a break with our current patterns. I would only add it as an alternative, not as the only option.
How about
```cpp
A->compute_norm2(output, reduction::rowwise);
A->compute_dot(output, reduction::colwise);
A->compute_norm_inf(output, reduction::full); // or all or complete or whatever we like best
```
I'm not a big fan of keeping the current dot behavior as the default. The result of the dot operation, which is also called scalar product, should be a scalar.
BTW, the A:B operation I mentioned is called the Frobenius inner product and can also be written as A:B = trace(B^T A), similar to what @yhmtsai mentioned, except with an additional reduction at the end.
For scale and add_scaled, the scaling A * diag(alpha) doesn't make sense to me. Both operations are basically BLAS operations. In the scalar case, scale is defined as y <- alpha * y and add_scaled as y <- alpha * x + y. As these operations are not standardized for non-scalar alpha, I would prefer sticking to the ordering of the scalar operations.
All of these operations are heavily motivated by the solver design, and most of the uses of Dense right now are solver-related as multivectors, so I think we need to keep those in mind -- column-wise operations are most common in the solver setting. Related to the concept of dot products, for multivectors you would probably expect something like x'*y, of which we only compute the diagonal entries, since we don't have interaction between different columns in a solver run. I do agree though that we need a better balance between correct naming and practical considerations.
BLAS provides the scalar version, but in the scalar sense the operation is commutative. It is different from apply: if we call A->apply(b, x), i.e. A * b -> x, the b is on the right. But it is the same in the advanced apply, alpha * A * x + beta * y. We definitely need to provide different behavior for them.
As I'm also in favor of keeping the current behavior, this boils down to the question: what should be the default behavior?
Pros:
Cons:
Pros:
Cons:
Feel free to add to or change the pros and cons.
I agree that there is a semantic difference between Dense and MultiVector. While we currently use these interchangeably, we definitely should not, and there are implications, for example, in a distributed setting. I want to add some points which we should think about before adding the functionalities.
Points considered for the meeting on multivector/dense:
Outcomes:
- LinOp's apply will take multivectors as arguments (and perhaps a separate apply that takes other LinOps, which is somewhat a separate discussion). This also means almost everywhere we use Dense will need to be replaced by MultiVector, a heavy interface break.
- We keep LinOp LinOp::apply(const LinOp, LinOp) as well as LinOp LinOp::apply(const MultiVector, MultiVector). We mark the former as deprecated (assuming they will not be supported in the future, depending on the advanced apply issue).
The current Dense matrix can be used simultaneously as a regular dense matrix and as multiple column vectors combined (multivector/MV). In the class documentation, the case of multiple column vectors is even highlighted as a common use case. For most methods, there is no difference in the behavior of a dense matrix and a multivector. However, I think for the following methods some differences exist (impl marks the currently implemented version):
- apply
- scale & add_scaled (dense behavior: diag(alpha)*A; could also support alpha as a column vector)
- dot (dense behavior: sum_ij a_ij * b_ij, i.e. the dot product of the flat vectors)
- norm
Sidenote: The documentation of each method already specifies which behavior is used, but I don't think that is sufficient. I guess most users won't consult the documentation for these methods, which could cause some confusion if the dense behavior is assumed.
IMO, as the name already suggests, the Dense class should model a dense matrix, and the methods should have the expected behavior. If the behavior of a MV is wanted, e.g. for computing the norms of multiple vectors simultaneously, the name of the method should reflect that behavior. Since this change would break the interface, it needs to be put on hold until Ginkgo 2.0. Nevertheless, I propose the following changes: