Open CarloLucibello opened 1 year ago
It's not possible. If there's a missing row or missing column (which is totally possible per the spec) it is required to be sparse. If a (subgraph) has no edges for that node then we have to return missing
or some other "no-value" sentinel.
If there are no missing entries, then it will return a "dense" GBMatrix. That can then be unpacked or converted by a memcopy to a Matrix.
I'm working on a new library that should support something like this, but it's a ways off. And we still want to support the graph method.
More generally if you're asking about Matrix
outputs specifically, that's a limitation right now that might be overcome in the future but I can't make any guarantees.
The semantics of GBMatrix seems not to be entirely consistent with that of an AbstractArray
. GBMatrix chooses to ignore the fill value in reductions:
julia> x = GBMatrix([1,2], [2, 3], [1,2], fill=100000)
2x3 GraphBLAS int64_t matrix, bitmap by row
2 entries, memory: 264 bytes
(1,2) 1
(2,3) 2
julia> x[1,1]
100000
julia> sum(x)
3
and so we have this weird thing that manually computing the sum with a for loop gives a different value compared to sum(x)
. This doesn't feel very safe.
Yes, so there's a significant issue here I've yet to resolve. The main problem is that fill
isn't a construct in GraphBLAS, it's my own addition to make things more AbstractArray like as much as possible.
This is solvable for reduce
pretty trivially, and I can do that, just counting the number of non-fills and doing a simple multiply is fine.
But then you get to multiplication and I don't have a great answer. Do you think A * B
should account for arbitrary fill value? This gets into the weeds when you use various semirings. If I do take the fill value into account then I will be arbitrarily densifying a significant fraction of the time, pretty much whenever the fill value isn't 0, NoValue/missing, or [-]Inf.
The performance impact of densifying would be immense.
Keep in mind also, the use of this library as a SparseArray is mostly secondary to its use for graph algorithms. The fill
was intended to try and bridge the gap between a SparseMatrixCSC and a GBMatrix as much as possible, but only in a way that doesn't compromise its use for graphs.
fill
was never meant to be something that isn't an identity of something. So it should really be at minimum missing/novalue
, 1
, 0
, Inf
or -Inf
. I suppose I could @warn
if reduce
is used on a collection with the wrong fill value for that monoid?
Note: novalue
will replace missing
in an upcoming version. Essentially the opposite of missing
.
Reductions on
SparseMatrixCSC
return dense arrays. Should the same happen forGBMatrix
? Current behavior is