>10X slower for zeroing out row and column slice

JuliaSparse / SuiteSparseGraphBLAS.jl

Sparse, General Linear Algebra for Graphs!

MIT License

102 stars 16 forks source link

>10X slower for zeroing out row and column slice #109

Open mzy2240 opened 1 year ago

mzy2240 commented 1 year ago

function zero_out(mat, bus_index)
    mat[bus_index, :] .= 0
    mat[:, bus_index] .= 0
    mat[bus_index, bus_index] = -1
    return mat
end

For the above code, GBMatrix is > 13.5 times slower than the default sparse matrix for a 20k * 20k sparse matrix with 0.018% sparsity.

rayegun commented 1 year ago

So what exactly do you want to do here? What you're doing is setting all those indices to be explicitly 0. As in they are now stored.

I probably need better docs and apis for this. You want those to become implicit zeros?

mzy2240 commented 1 year ago

Without changing the sparsity, I want the nonzero entries at this row and column to be ZERO.

mzy2240 commented 1 year ago

When doing this for the default CSC matrix, it is actually only changing those nonzeros to 0-value nonzeros.

rayegun commented 1 year ago

I think I can find a relatively performant solution involving masks. I have to update the package though to fix a bug I found writing the solution.

@DrTimothyAldenDavis what's the fastest way of doing these operations? This is one use-case where the range based masks I've discussed with you before would be useful. We want to do something like apply!(A[bus_index, :], 0, second) here. But constructing that range as a mask could be costly.

DrTimothyAldenDavis commented 1 year ago

Probably the best way to do this would be GrB_assign, with a structural mask that is the same as the matrix you are modifying. So C<C,struct>(:,j) = 0 to zero out a column, or (i,:) to zero out a row.

It looks like this will select my internal kernel called GB_subassign_05, which is a bit more general, doing C(I,J)<M>=scalar, and where the matrix C and matrix M need not be the same. They are the same in this case, which is a very special (and important) case that I often try to exploit.

But it doesn't seem that I exploit it in this case. I'll get a binary search in the inner loop, as a result. It will be decent in performance but I could do better. I have a blizzard of methods for GrB_assign (about 40?). A 41st method, which did C(I,J)<C,struct>=scalar

would speed this up quite a bit.

DrTimothyAldenDavis commented 1 year ago

Note that in the assignnment, the scalar 0 is an explicit zero, not an implicit one. I do have "zombie assignment" methods that delete all entries in a single row or single column, but that's not what you want here.

DrTimothyAldenDavis commented 1 year ago

No need to construct a special mask for this case, if that's what I understand what is meant by the dot equal ( .= ) notation.

rayegun commented 1 year ago

Using the mask as is with assign I didn't think of! That's great thank you.

Dot assignment lowers to subassign. I need to automatically dispatch to assign when I detect the larger mask, I'll release that change shortly.