Closed pdamme closed 2 years ago
Assigned to @akroviakov.
@pdamme Shouldn't the resulting matrix per definition be the following:
res: 0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 3 0 1
0 2 0 0
e.g., res[4, 3] = 1 for k=1, since A[1] = 4 and B[1] = 3?
Also, res
gets filled with k
values, what could be the reason for k
to start at 0? Should the loop explicitly start from 0 in case of already zeroed DenseMatrix
in constructor, should some coordinate in CSRMatrix
be explicitly set to 0?
If I'm not mistaken, my initial example should match the definition... (BTW: A
and B
should be lhs
and rhs
, respectively.) There are three distinct pairs of corresponding elements in lhs
and rhs
: (2, 0), (4, 3), and (5, 1). So these are the positions of non-zero elements in the result. (2, 0) and (5, 1) occur just once, but (4, 3) occurs twice, thus res[2, 0] == 1
, res[4, 3] == 2
, and res[5, 1] == 1
.
Counting starts at zero refers to the positions within a matrix. In DAPHNE, the first row is row 0, while in SystemDS it is row 1. Same for columns.
If you zero the DenseMatrix
on creation, you can simply set()
the individual elements. For CSRMatrix
you should not store zeros explicitly.
I hope that answers your question...
Yes sorry, my mistake, I've misinterpreted the brackets in the definition.
This task is to implement a kernel for DaphneIR's
CTableOp
. The contingency table of two (n x 1) matrices lhs and rhs is a matrix res of shape (max(lhs) + 1, max(rhs) + 1), and res[i, j] = |{ k | A[k] = i and B[k] = j, 0 ≤ k ≤ n-1 }|.Example:
We need:
DenseMatrix
inputs andDenseMatrix
output.DenseMatrix
inputs andCSRMatrix
output.kernels.json
.Note that this operation is inspired by SystemDS's
table()
built-in function (see https://apache.github.io/systemds/site/dml-language-reference). However, its definition is slightly different here, since we start counting rows/columns at zero (not at one).We currently need this kernel to make the PCA DaphneDSL script in
scripts/algorithms/pca.daph
run.